Interprocess Communication with MPS (Message Passing System)
by H. Jack, Student Member, IEEE, W. H. ElMaraghy and T. W. Chien, The University of Western Ontario, London, Ontario, Canada
An asynchronous message passing system has been developed to allow integration of many dissimilar processes (i.e., programs) which may be operating concurrently. The programs may run on a number of different hardware platforms, using a number of different software packages and languages. The system is based on a single server which handles communication between all client programs.
A concept of groups and priorities is used to organize message flow, making it possible to explicitly define a number of concurrent and parallel execution methods. Thus, the method described in this paper addresses the need to integrate tasks in a typical computing environment. This structure is robust and will allow any client to sign on or off in an ad-hoc manner with minimal effect on other client programs.
This paper first presents the background and purpose of the research, followed by a description of the MPS approach to distributed systems. Finally, a few test applications, including ray tracing and manufacturing workcell control, are discussed briefly to illustrate the use and testing of the method.
The Message Passing System (MPS) is a comprehensive yet easily used communication system for networked computers and is intended to simplify communication between more than two programs. The programs may be resident on many computers, and are not necessarily written in the same language or for the same operating system (we used C and UNIX machines). The MPS hardware communication structure can be seen in Figure 1 below, along with the traditional method of communication.
Figure 1: An Example of Network Based Interprocess Communications
While the traditional method lends itself to speed, it requires that each program have sophisticated functions for communication and an interface to each of the other programs. The MPS scheme allows the sophisticated functions to be present in only the central server (i.e. MPS), and allows the same interface subroutines to be reused for each program.
The MPS system has been developed with a number of idealized objectives. The list below summarizes the objectives.
near real-time response, in the range of one second or less (this approach is not intended to compete with real-time systems),
do not limit the system to a single program,
buffering of messages between processes,
general knowledge of other processes, but no specifics, or direct access to remote data,
support major methods of concurrent and distributed processing,
allow insertion of a process into a communication loop for testing, analysis, and other purposes,
allow message broadcasts to unknown programs,
keep running records of transactions for forensics and restarting, in case of failure,
do not try to replace existing distributed communication methods, but allow coexistence,
make system independent of machine type, language used, operating system and the details of the network.
Careful consideration of these functional requirements resulted in a decision to use a central server for the system. This was based on a realization that to incorporate these functions into each program would have been a very difficult task.
The decision to use a central communication server is not ideal. Using a central server builds a bottleneck into the system: only one program may be served at a time, and the speed of the server will limit the communication rate. Alternatively, if each program were to have its own interface, the programs could communicate simultaneously, limited only by their own speeds and by network transmission delays.
This paper will first attempt to put the work in perspective, and then will develop the MPS model by explaining each feature individually.
As users expect greater performance from computers, the bottleneck of a single instruction stream becomes more critical. Thus, researchers have been examining computing with multiple instruction and data streams [8], and this approach has become very popular. The approaches often focus on multi-processor computers for single, large problems, and on networked computers for multiple and/or distributable problems. The enormous potential for computing power makes the value of the network techniques obvious, but without the proper programming tools this power remains unrealized.
When considering concurrent processes (distributed on two or more computers), there are two main branches. In a tightly coupled system the processes use a technique such as shared memory. In a loosely coupled system message passing may be used. Passing information and requests by messages allows isolation between processes which might have interaction problems, but this often results in more complex hardware for communication. When message passing is used, the messages may be characterized as [2]:
synchronous (good for real time systems)
asynchronous (messages are buffered, so sender and receiver need not wait for each other)
Each of these approaches has various merits, but in particular the asynchronous methods make it easy to actively change the connections between two processes.
There are many methods which use asynchronous message passing, but the majority are in language form. The language approach is very powerful but has some inherent problems with the flow of control. The entire concurrent system needs one start point, and is not suited to main programs that compete, such as multiple user interfaces. If a process has failed during execution, it may be difficult to resolve problems and continue execution. The largest advantages of the language-based approach are speed, ease of use for specialized tasks, and reliability (excluding faults). There are a number of dedicated languages such as Actors [1] and NIL [10] which are the basis for many modern concurrent programming systems. These languages are limited in terms of available software and compatibility with other programs. Other researchers overcome this limitation by augmenting existing languages. For example, Buhr et al. [4] use additions to the C programming language that allow processes to use shared memory in a UNIX workstation. Once compiled, their program will execute in a concurrent fashion, as defined in the source code. Otto [7] uses C or Fortran as the basis for MetaMP with extra declarations for concurrency, and then compiles the program down to Express. Grimshaw [6] has used C++ to create classes of objects which will perform various functions as program flow requires. The classes take advantage of a hidden message transport system for concurrent process communications. While these methods are well suited to single problems, or systems, they are not intended for software that has fundamentally different functions. For coarse grained distributed software, we require a structure which is less rigid than languages.
When programming languages are not used, researchers must resort to the use of understood structures and/or server programs for communications. Goyer et al. [5] describe a system for developing a program distributed over a number of systems. They organize processes into a hierarchical structure that has a service program that deals with communication, synchronization, selection auditing, and failures. Their structure groups processes, and allows process groups to be shared by a number of high level applications. The grouping concept is explored further in this paper, as we develop an internal group structure. In ISIS, Birman [3] also explored the concepts of group structures, using multi-cast messages. His group structures and group relations provide a good method for fast, single function systems, but this approach requires added sophisticated algorithms in each process for fault tolerance, keeping lists of anonymous clients, state transfer between concurrent clients, etc. Another interesting approach was developed by Singh et al. [9] in their Frameworks system. They allow a user to write fairly independent tasks, and then fit them together with a graphical tool. Group structures are also used in their method, which treats all processes as concurrent or non-concurrent, and allows input and output conditions to be defined. A server called ‘FrameWorks’ is then used to control and distribute tasks. It is our opinion that all of these systems provide excellent support, but fail to address the issue of very loosely coupled applications which must communicate. Thus, we introduce an alternate computational model that uses a mixture of concepts found in the references cited.
The MPS representation is based on two concepts: the structure of the process connections, and the addressed destination for the messages. This section will present the structure of process connections in graphical form, and show how various message types will pass through the graph. We consider the cases presented here sufficient to explain most scenarios to which MPS could be applied. A basic example of the MPS method is described first, and then concepts are discussed individually. The reader is encouraged to imagine situations where the methods described would be applicable.
The diagram in Figure 2 contains an example of interprocess communication based on an MPS approach.
Figure 2: An Example of an MPS Graph for CAD Driven Ray Tracing
Each instance (a program connected to the MPS) belongs to a group. Within the group each instance has a priority. For example, Figure 2 shows four groups; the second from the left is called ‘N_RAY’. In the group ‘N_RAY’ there are four instances: one at priority 20, two at priority 0, and one at priority -10. When a message is passed to this group, the priority 20 client will be the first to receive it. This client may pass the message unchanged, change it, or not pass it at all. If any message comes out of client ‘N_RAY,20’, then it will be available for pickup by ‘N_RAY,0’. Since two programs have the same identification, they are concurrent, and either one can pick up the message (on a first come first served basis). A message written out by either ‘N_RAY,0’ will then go to ‘N_RAY,-10’. After a message leaves ‘N_RAY,-10’ it will be free to be passed to another group. Each message is given a source and destination group as it travels through the graph (and will always have these, although they may be changed by any client that has a message).
This graph is updated and changed as new instances are initialized and deinitialized. For example, if ‘P_RAY,0’ has not been able to keep up with the demand, the user only needs to start the same program on another computer. There would now be two ‘P_RAY,0’ programs running, and processing. If the bottleneck goes away, either program may be stopped, and the MPS graph would again appear as in Figure 2.
In Figure 3 an example program is shown which initializes itself as ‘N_RAY,0’, finds the reference number for the ‘CAD’ group, and then sends a message to it. The program then deinitializes from the MPS and exits.
Figure 3: An Instance ‘N_RAY,0’ Passes a Message to Group ‘CAD’
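The original listing of Figure 3 is not reproduced here; the sketch below follows the sequence described above. Only mb_send_message() is named elsewhere in this paper; mb_init(), mb_find_group(), mb_deinit(), the mps.h header, and the argument orders are assumptions made for illustration.

/* A minimal sketch of the Figure 3 sequence; most routine names are assumed. */
#include "mps.h"                       /* assumed MPS client interface header */

int main(void)
{
    int cad;                           /* reference number of the 'CAD' group */

    mb_init("N_RAY", 0);               /* join the MPS graph as instance 'N_RAY,0' */
    cad = mb_find_group("CAD");        /* look up the destination group */
    mb_send_message(cad, 1, "trace this scene");   /* integer + string message */
    mb_deinit();                       /* leave the MPS graph and exit */
    return 0;
}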
In the example in Figure 3 above, the instance is the originator of a new message. In the example in Figure 4 below, the instance ‘N_RAY,-10’ receives, alters and then passes the message along.
Figure 4: An Instance for Message Interception and Replacement of Message from ‘N_RAY,0’ (Note: Sleep() functions are used because such simple programs tend to overrun the MPS server with requests. The MPS has been able to serve from 5 to 50 requests a second depending upon platform and options used)
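Figure 4 itself is not reproduced; the sketch below illustrates the interception pattern under the same assumptions as the previous sketch. mb_get_message() and its polling behaviour are likewise assumed names and semantics, and the sleep() call reflects the note in the caption about not overrunning the server.

/* A sketch of a post-processor that receives, alters, and forwards messages. */
#include <unistd.h>                    /* sleep() */
#include <string.h>
#include "mps.h"                       /* assumed MPS client interface header */

int main(void)
{
    int  from, value;
    char text[256];

    mb_init("N_RAY", -10);             /* post-processor in group 'N_RAY' */
    for (;;) {
        if (mb_get_message(&from, &value, text) == 0) {
            sleep(1);                  /* nothing waiting; avoid flooding the server */
            continue;
        }
        value += 1;                    /* alter the integer part of the message */
        strncat(text, " (checked)", sizeof(text) - strlen(text) - 1);
        mb_send_message(mb_find_group("CAD"), value, text);   /* pass it along */
    }
}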
The reader should note at this point that each message has been set up as a combination of integer and string. Their use is entirely decided by the instances which receive and send them. The reader should also note that this is a sample of a message passing mechanism between groups using only one of the operation modes available. This will be discussed further in the following pages.
Each instance must belong to a group and have a priority. This determines when a message may be picked up by an instance. We encourage the use of a 0 level priority as the target in each group. By this it is implied that all instances with positive priorities act as preprocessors, and all instances with negative priorities act as post-processors. Moreover, users are encouraged to leave gaps between priority numbers for easy insertion of new instances. This leads to a naturally modular system.
Figure 5: An Example of Instances in a Group (theoretically, priority values should not be limited)
When two (or more) instances are run concurrently, they will receive and process messages as fast as possible. If the sequence of the messages output from the concurrent instances does not have to match the order of message input, then there is no problem. In some cases, however, it is necessary to maintain message sequence. Figure 6 illustrates the nature of this problem.
Figure 6: The Problem of Sequence Preservation in Concurrent Processes
The default is for all instances to be considered non-sequential (as sequential instances are more expensive in terms of MPS computation). If an instance (being one of a ‘group, priority’) must be sequential, the user may turn this on as seen in Figure 7.
Figure 7: Example of An Instance Which Has Declared Itself Sequential (and by default, its concurrent neighbors). For this instance to be run concurrently, the user just runs multiple copies of the program.
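Figure 7 is not reproduced; the following sketch shows one plausible shape of a sequential instance. mb_set_sequential() and mb_release_message() are assumed names chosen for illustration; only mb_send_message() is confirmed by the text.

/* A sketch of an instance that declares itself sequential and releases
   messages it decides not to forward. */
#include <unistd.h>
#include "mps.h"                       /* assumed MPS client interface header */

int main(void)
{
    int  from, value;
    char text[256];

    mb_init("ROBOT", 0);
    mb_set_sequential();               /* messages must leave in arrival order */
    for (;;) {
        if (mb_get_message(&from, &value, text) == 0) {
            sleep(1);
            continue;
        }
        if (value < 0) {               /* decide not to pass this one on */
            mb_release_message();      /* release it so it does not block the queue */
            continue;
        }
        /* ... act on the move request here ... */
        mb_send_message(from, value, "done");   /* writing a message also releases it */
    }
}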
When a concurrent case is sequential, there are three events to consider. If one instance gets a message and passes it on (by sending a message using “mb_send_message()”), then the MPS will keep the messages queued until it is their turn to be passed. This is transparent to the user. If the instance decides not to pass the message, then it must release the message so that it does not block the other waiting messages. This is also shown in Figure 7 above. There are three actions which may release a message: writing a message (non-terminal); reading a new message (terminal); releasing the message (terminal). The main reason for having sequential concurrency is to maintain the order of messages for programs that are incapable of sorting message order, but require messages to arrive in order. An example of this is a robot which must receive move messages in the correct sequence.
In some cases a message will have to pass directly between two instances. This has been allowed, although it violates the objective that instances be ignorant of each other. As a compromise, three modes have been adopted, which are discussed below.
In the most general mode, any instance can issue messages to every other client on the message board. The same message is issued to every client, and takes priority over messages passed normally between groups. An example of this is seen below in Figure 8. The message is read by each process in a normal manner (although messages directly to an instance are read before messages between groups).
Figure 8a: An Immediate and General Broadcast of a Message to All Clients
Figure 8b: An Immediate and General Broadcast of a Message to All Clients
This approach is somewhat naive, since the message may arrive at instances that do not expect it (every client except the sender receives a copy).
Another approach is to send the messages to all instances in a group. In this case the message travels normally until it approaches the first instance in the destination group (this mode is not immediate, and thus allows post-processing in the source group). Upon reaching the destination group, a copy is sent to each member of the group. These messages are immediate, and have priority over messages passed between groups. Figure 9 shows an example of sending a message of this type.
Figure 9: A Broadcast to All Instances in a Group
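As a rough illustration of this mode, the sketch below sends a single message that is copied to every instance in a destination group. The function mb_broadcast_group() and its arguments are assumptions; the text does not name the actual call.

/* A sketch of a group-wide broadcast; the broadcast call is an assumed name. */
#include "mps.h"                       /* assumed MPS client interface header */

int main(void)
{
    mb_init("GUI", 0);
    /* Every instance in group 'RAY_TRACE' receives its own copy of the message. */
    mb_broadcast_group(mb_find_group("RAY_TRACE"), 99, "flush buffers");
    mb_deinit();
    return 0;
}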
This method works well when a group does not contain messages in transit. If a message is between instances in a group when the broadcast arrives, then it may result in a conflict, or the message may be lost. For example, a message to kill all instances would result in the loss of a message which is still active.
As described in the last section, when messages are continually flowing through groups, the general broadcasts may not be desired. An alternative is to force a message to travel normally through the graph, but to be passed to all concurrent instances. Two cases have been identified in this approach. In the illustration below, the two cases are shown in comparison (see Figure 10). The message is still addressed to a group, but it may be either blocking or non-blocking.
Figure 10: An Example of Blocking and Non-blocking Broadcasts to all Instances in a Group (The message cannot be altered by any client in the group)
As seen in Figure 10, a blocking broadcast forces the message to wait until all concurrent instances have received it before the next lower priority instance may receive it. In both cases the same message is passed to all instances; only the order of reception may vary (by the blocking/non-blocking constraints). Here, a higher priority instance is guaranteed to see at least one copy of the message before a lower priority instance, while the “broadcast to all instances in a group” mode does not guarantee this.
The last direct communication mode is the simplest to implement. It mimics many existing methods. In this mode one instance sends a message directly to another instance. For this to happen, the sending instance must have a reference to the remote instance. This reference is a unique integer assigned to each instance. Each instance is only allowed to access its own instance number. Therefore, the sending instance must first get the destination instance number from the destination instance (using normal group based communication techniques). While this is somewhat complicated, it ensures that integrity is maintained. (Note: An instance number is randomly assigned to each instance and therefore cannot be assumed to be the same between runs of a program.) Figure 11 below shows two programs which will perform a direct message pass. This is done after one program gets the instance number of the other.
Figure 11 a): Two programs which pass messages directly between each other (not via the MPS graph)
Figure 11 b): Two programs which pass messages directly between each other (not via the MPS graph)
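Figure 11 is not reproduced; the sketch below shows the sending side only, after the destination instance has mailed its instance number through normal group traffic. mb_get_message() and mb_send_direct() are assumed names used for illustration.

/* A sketch of the sender in a direct instance-to-instance message pass. */
#include <stdlib.h>                    /* atoi() */
#include <unistd.h>                    /* sleep() */
#include "mps.h"                       /* assumed MPS client interface header */

int main(void)
{
    int  from, value, robot;
    char text[256];

    mb_init("GUI", 0);
    /* Wait for the remote instance to mail us its instance number. */
    while (mb_get_message(&from, &value, text) == 0)
        sleep(1);
    robot = atoi(text);                /* unique integer assigned by the MPS */
    mb_send_direct(robot, 1, "move to station 3");   /* bypasses the MPS graph */
    mb_deinit();
    return 0;
}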
While section 2.3 outlined a number of approaches which address messages to instances, it did not discuss a method for sending a message to all groups. If a message must be sent to unknown (or all) groups, a broadcast to all groups is ideal. The example in Figure 12 shows an application of this technique.
Figure 12: A General Message Broadcast to All Groups
The reader will note that the message passes normally through all instances, as if a separate message had been passed to each group individually.
An instance may decide to limit the messages it accepts by filtering its input. This ability has been provided by allowing the user to select a list of accepted groups. When an instance has created this list, it will call for messages. If the list is set to ‘ALL_GROUPS’, all messages will be accepted normally. If one or more groups are put on the filter list, messages from all other groups will be held until the ‘ALL_GROUPS’ filter status is set again. While the filter is in use, messages from selected groups will be passed normally, but the other messages will be queued up. When the filter is removed, all the queued messages will be available to be read. The example in Figure 13 shows a program which sets a filter, then clears it.
Figure 13: An Instance Which Sets and Clears a Group Based Message Filter
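Figure 13 is not reproduced; the sketch below sets a filter so that only one group is heard, then restores the ‘ALL_GROUPS’ state. mb_set_filter() and the use of ALL_GROUPS as a programmatic constant are assumptions for illustration.

/* A sketch of setting and clearing a group-based message filter. */
#include "mps.h"                       /* assumed MPS client interface header */

int main(void)
{
    int  from, value;
    char text[256];

    mb_init("ROBOT", 0);
    mb_set_filter(mb_find_group("GUI"));   /* hear only group 'GUI'; others are queued */
    if (mb_get_message(&from, &value, text) != 0) {
        /* ... exclusive interaction with the 'GUI' group here ... */
    }
    mb_set_filter(ALL_GROUPS);         /* clear the filter; queued messages become readable */
    mb_deinit();
    return 0;
}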
The reader will find this feature useful when attempting to solve the mutual exclusion problem. This problem arises when two or more groups share a resource, but each must have exclusive use when accessing it. This feature makes it possible to set up a locking mechanism. The robot driver is an excellent example. While one group is using the robot, all other groups should be locked out. Otherwise, if two groups send a message to the robot at the same time, conflicting requests for motion are sure to happen. Therefore, while the other groups are locked out, the robot group will receive a continuous stream of requests from a single source.
Previously, all the groups in the MPS graph have been shown as vertically symmetrical; this is not always desirable. The approach chosen to overcome this is to make it possible to encapsulate groups as an instance. For example, Figure 2 shows that groups ‘N_RAY’ and ‘P_RAY’ are encapsulated in group ‘RAY_TRACE’. In this case the effect would be identical to replacing the instance with the group, as illustrated in Figure 14.
Figure 14: Equivalent Representation of an Encapsulated Group
Although it is encapsulated, group ‘A’ is still available for use, making this approach that much more versatile. In Figure 15 below, there is an example of how to declare an encapsulated group, assuming the group exists on the MPS graph.
Figure 15: A Coded Example of Declaring an Encapsulated Group
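Figure 15 is not reproduced; the sketch below shows one plausible form of the declaration, assuming group ‘N_RAY’ already exists on the MPS graph. mb_encapsulate_group() and its placement after mb_init() are assumptions made for illustration.

/* A sketch of declaring an encapsulated group; the call name is assumed. */
#include "mps.h"                       /* assumed MPS client interface header */

int main(void)
{
    mb_init("RAY_TRACE", 0);           /* the enclosing group and priority */
    /* Let existing group 'N_RAY' stand in for this 'RAY_TRACE,0' slot. */
    mb_encapsulate_group("N_RAY");
    /* ... normal message handling continues as before ... */
    mb_deinit();
    return 0;
}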
The various general broadcasts will elicit different responses for encapsulated groups. In a general broadcast to the members of a group, the encapsulated group is considered one of those members. In the case of a general broadcast to all instances, the instances of the encapsulated group will receive only one copy of the message.
The centralization of MPS allows some features not normally available in distributed systems. In particular the user may create a log file of transactions. When the MPS is run, the normal method is to begin logging all transactions. The user may then switch this off, if desired (see Figure 16). At present it only serves as a tool for forensic evaluations of activity. In the future it may be the basis for reconstructing the MPS if failure occurs.
Figure 16: Turning MPS Log Files Off
The MPS also keeps times of transactions. The default is on, and is useful for log file transactions. At present there is no random access to messages, but in the future access to the messages could be provided, including time stamp information.
Figure 17: Turning Off Time Logging for Messages
In some cases there may not be a 0 level client (as in a pre- and post-processing model). If this is the case, there may not be another client to change the destination group of the message, and as a result the message will cycle through the group indefinitely. To counter this, a flag may be set which will discard messages passing priority level 0, when there is no client present at level 0. The example in Figure 18 shows how to disable this feature (the default is on).
Figure 18: Turning Off The Zero Level Priority Message Blocking Feature
Figure 19: The Effect of Turning off the Zero Level Priority Message Blocking Feature
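Figures 16 through 18 are not reproduced; the combined sketch below toggles the three options discussed above. The option-setting calls (mb_set_log(), mb_set_time_stamps(), mb_set_zero_block()) are assumed names; each feature defaults to on, and passing zero is assumed here to turn it off.

/* A sketch of disabling the logging, time-stamping and zero-level blocking options. */
#include "mps.h"                       /* assumed MPS client interface header */

int main(void)
{
    mb_init("GUI", 0);
    mb_set_log(0);                     /* stop recording transactions to the log file */
    mb_set_time_stamps(0);             /* stop time-stamping messages */
    mb_set_zero_block(0);              /* keep messages circulating when no 0-level client exists */
    mb_deinit();
    return 0;
}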
A number of additional MPS functions are planned. These are listed below with brief descriptions.
i) Recovering from a log file in the event of an MPS failure (i.e., fault tolerance).
ii) Message time stamps: A message may be stamped with an expiry time, after which it is discarded if not read. The other case is a message which is to be held until a specific time, at which it will be made available for reading.
iii) Getting lists of all groups on the message board (a yellow pages feature).
iv) Interfaces written in other languages, and on other operating systems (the existing code is OSI compatible).
v) Having concurrently running MPS servers (distributed servers). This would help overcome the bottleneck problem.
vi) Giving the user access to waiting, and written, messages to examine content and time stamps. This feature will require some examination of security issues.
We recognized that point-to-point communications were required for speed between instances with highly structured tasks. Although we support this feature, it will not be discussed here because it is considered separate from the main concepts in this paper, and also because good techniques have been developed by other researchers.
The MPS was created to solve problems related to the control of a manufacturing workcell. But, as the software was developed, it became obvious that it had applications to other problems. The control of the workcell had to allow interaction between programs. The controller included a user interface, an NC code generator, and controllers for a robot and an NC milling machine. When these were originally in a single program there were problems with concurrent operations, modularity, fault tolerance, and switching control modes. The MPS diagram for the workcell is given below in Figure 20. This diagram shows that the user interface, robot, and NC milling machine all have their own groups. But, in the NC milling machine group, there is a higher priority client which preprocesses requests for parts into NC code (we made customized keytags). This new implementation is very robust, and will allow a program which has stopped to be restarted, often without effect on the workcell. Even more important is that the conversion from a single program to a distributed program took less than one week of programming and debugging. The reader will also be interested to know that the new structure allows the devices (robot and NC milling machine) to be replaced with simulations, or to have the simulators run simultaneously, without requiring any changes to the other programs. During testing, the programs were run on a distributed set of workstations, which included Suns and Silicon Graphics machines.
Figure 20: A Workcell Controller Based on MPS
The workcell control application provided some novel advantages over existing techniques, but did not address computationally demanding problems. Ray tracing was considered a good test bed for improvements in execution time. A graphical interface was implemented as a front end for initiating ray tracing and displaying results. A ‘generic’ ray tracing program was created, which would trace picture components upon request. When these ray tracers were run on a number of machines, the message board would enroll them concurrently. The user interface would then request many small picture segments, and each ray tracer would process these on a first-come-first-served basis, then send a message back to the user interface, which would display the picture part (see Figure 21). This implementation was done in a few hours using existing software which was not concurrent. This system also exhibited features of fault tolerance, and the added benefit of speeds not possible with a single program.
Figure 21: A Concurrent Ray Tracer Based on MPS
Other applications have been performed with equal success, and will be reported at later dates, along with detailed descriptions of the applications described above. The upcoming applications will take advantage of some of the more advanced features described in earlier sections.
This message passing system provides an excellent tool for developing distributed applications. Because of its structure, it may be independent of language, platform, and user interface. The cost of all of the advantages discussed is a decrease in communication speed. Thus, for applications with time constraints on the order of milliseconds or less, a synchronous system is much more advisable.
The MPS also takes an approach which focuses on event flow instead of data flow. This makes the system more intuitive for the designer and more like a traditional user interface to the programmer. MPS also allows the ad-hoc addition of foreign processes for diagnosis and observation, which is difficult or impossible in traditional point-to-point systems. As stated before, MPS does not attempt to replace existing distributed processing methods, but instead provides a method which can coexist with them, and allows distributed computing functions not previously available.
Client: Used interchangeably with instance.
Group: A collection of instances which perform parts of a function. The order in which each instance operates is determined by its priority. The suggested mode of addressing messages is to a group.
Instance: A program which has a connection to the MPS. Each instance is described with a group name and priority number. If a program has more than one connection to the MPS it will represent more than one instance.
Message: An integer and string combination passed between instances. The contents of the integer value and the string are left to the instances.
Message Board: Used interchangeably with MPS.
MPS: Message Passing System; a central server which communicates with each instance, and handles all message passing between instances. The communication between instances is determined by an ad-hoc graph structure set up as instances are initialized and deinitialized.
Priority: A numerical order in which instances will receive a message. If two or more instances have the same priority in the same group, they become concurrent.
[1] G. Agha and C. Hewitt, “Concurrent Programming Using Actors”, in Object Oriented Concurrent Programming, edited by A. Yonezawa and M. Tokoro, The MIT Press, pp. 37-53, 1987.
[2] G. R. Andrews, Concurrent Programming Principles and Practice, The Benjamin/Cummings Publishing Company, 1991.
[3] K. P. Birman, “The Process Group Approach to Reliable Distributed Computing”, Technical Report, Department of Computer Science, Cornell University, July, 1991.
[4] P. A. Buhr and H. I. MacDonald, “uSystem Annotated Reference Manual”, Technical Report, Department of Computer Science, The University of Waterloo, Waterloo, Ontario, Canada, 1991.
[5] P. Goyer, P. Momtahan and B. Selic, “A Synchronization Service For Locally Distributed Applications”, in Distributed Processing, edited by M. H. Barton, E. L. Dagless and G. L. Reijns, North-Holland, pp. 3-17, 1991.
[6] A. S. Grimshaw, “An Introduction to Parallel Object-Oriented Programming with Mentat”, Computer Sciences Report No. TR-91-07, University of Virginia, April 4, 1991.
[7] A. Otto, “MetaMP: A Higher Level Abstraction for Message-Passing Programming”, Technical Report, Department of Computer Sciences and Engineering, Oregon Graduate Institute of Sciences and Technology, 1991.
[8] J. Sikorski, “Sifting Through Silicon; Features of Parallel Computing Architectures”, Sunworld, pp. 39-45, April, 1991.
[9] A. Singh, J. Schaeffer and M. Green, “A Template-Based Approach to the Generation of Distributed Applications Using a Network of Workstations”, IEEE Transactions on Parallel and Distributed Systems, Vol. 2, No. 1, pp. 52-67, January, 1991.
[10] R. E. Strom and S. Yemini, “NIL: An integrated language and system for distributed programming”, SIGPLAN Notices 18, pp. 73-82, June, 1983.