7 Distributed Objects System
In this chapter we give an overview of our distributed objects. We first describe independent but necessary building blocks for our distributed objects system: threads support in Section 7.1 and network support in Section 7.2. In Section 7.3 we describe our distributed objects and in Section 7.4 we show their usefulness with the help of a bigger example. Finally, we show some performance measurements in Section 7.5.
7.1 Threads
The ETH Oberon System [WiGu89] is a single user multitasking system. It was not designed to support multi-user operation and pre-emptive multitasking. Still the system allows a user to work on multiple tasks simultaneously, i.e. it supports co-operative multitasking. With co-operative multitasking, task switches are triggered by user actions. A global loop waits for input events and dispatches them to the respective applications. Applications handle the events and return control to the input loop as soon as possible. As long as an application has control, no further user input is accepted and all other applications are suspended. This scheme for multitasking is practical as long as all user-initiated tasks are short. However, for longer tasks such as formatting disks, compiling several modules or accessing a database it is an unsatisfactory solution. Besides blocking further user input, it effectively removes all computing power from other tasks for an indefinite amount of time.
In single-user mode this may be acceptable since the user can normally predict the response times. However, there are situations where more powerful mechanisms are needed. In a network, several users may access the system concurrently asking for information, processor time or other local resources. Remote access is not predictable but may occur at any time. Co-operative multitasking would allow one user to lock out all other users by turning down their requests. A remote user cannot tell whether a request remains unanswered because it was lost, because the receiver is busy or because the receiver is just not reachable. This behaviour is not acceptable in a network. We need a more predictable behaviour. Therefore, we introduce the notion of threads. Threads are scheduled pre-emptively, allowing a system to respond to incoming requests immediately.
Basic Characteristics
We designed the threads subsystem [Hof96a] to be as simple as possible. We refrained from adding features such as priorities or memory protection:
- Memory protection was omitted mainly for two reasons: First, Oberon is a type-safe language, which guarantees that no thread corrupts the memory belonging to other threads. Second, we wanted to extend the Oberon System without invalidating existing interfaces. Since all Oberon tasks run in the same address space, introducing protected memory would have forced us to rewrite large portions of the system.
- Thread priorities, while useful, add to the complexity of a threads package. Since our threads are mainly used for network services, local timing considerations matter little, as network delay and latency overshadow most delays introduced locally.
The threads package adds two abstractions to the Oberon System: pre-emptively scheduled threads and semaphores for the synchronisation of threads. Unfortunately, scheduling is system dependent. Therefore, the module Threads is not portable. Up to now, versions for the PowerMacintosh, for Windows 95 and NT, and for Linux have been implemented.
DEFINITION Threads;
TYPE
Semaphore = RECORD
PROCEDURE (VAR sem: Semaphore) Down;
PROCEDURE (VAR sem: Semaphore) Init (val: INTEGER);
PROCEDURE (VAR sem: Semaphore) TryDown (): BOOLEAN;
PROCEDURE (VAR sem: Semaphore) Up;
END ;
Thread = POINTER TO ThreadDesc;
ThreadDesc = RECORD
END ;
ThreadProc = PROCEDURE;
PROCEDURE ActiveThread (): Thread;
PROCEDURE Kill (thread: Thread);
PROCEDURE LoopThread (): Thread;
PROCEDURE Sleep (milliSeconds: LONGINT);
PROCEDURE Start (thread: Thread; proc: ThreadProc; stackLen: LONGINT);
END Threads.
The functionality of the module Threads is split into two parts. The first part offers operations for thread handling and the second part supports basic synchronisation:
- Thread handling mainly allows a programmer to create new threads and to kill existing ones. A thread is created by supplying a thread descriptor, the entry point and the desired stack length. It is started automatically. To have access to the thread, one has to either store a reference to the thread or use the operation ActiveThread to access the current thread. There is no possibility to get access to another thread if its descriptor is unknown.
There is one special thread: the thread that starts up the system and handles the user input. This "Oberon thread", which can be obtained by LoopThread, cannot be killed.
- The only basic synchronisation mechanism offered is the abstract data type Semaphore. It implements an integer semaphore with the operations Up and Down. Additionally, it supports the operation TryDown that allows one to test for the availability of a semaphore. If it is available, a Down operation is applied. If not, TryDown returns FALSE.
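As a small illustration of TryDown, the following sketch protects a shared resource with a semaphore initialised to one (the resource and the procedure name are assumptions of this example, not part of module Threads):
VAR lock: Threads.Semaphore; (* initialised elsewhere with lock.Init(1) *)

PROCEDURE TryUpdate;
BEGIN
  IF lock.TryDown() THEN (* semaphore was available; the Down has already been applied *)
    (* ... access the shared resource ... *)
    lock.Up
  ELSE
    (* resource currently in use: do something else instead of blocking *)
  END
END TryUpdate;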
One could argue that our threads are too simple. However, for offering network services, this simple package is sufficient.
Modifications to the Oberon System
The Oberon System has an automatic garbage collector that periodically collects all unreferenced memory blocks. In order to do so, it marks all blocks that can be reached from live pointers. Such pointers may reside on the procedure stack and since there is a stack for every thread, the garbage collector has to traverse all of them. Therefore, the garbage collector has to be modified to scan multiple stacks instead of just one.
Whenever a thread is started, its stack is registered for garbage collection. When a thread is killed, its stack is removed. The Oberon module Kernel had to be modified as follows:
DEFINITION Kernel;
.........
PROCEDURE AddStack (beg, end: LONGINT);
PROCEDURE RemoveStack (pos: LONGINT);
END Kernel.
AddStack defines a new stack in the specified memory area and RemoveStack removes the stack containing the address pos.
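To illustrate the intended use, a hypothetical implementation of Threads.Start and Threads.Kill might register and remove the thread stack roughly as follows (a sketch; the allocation helper is illustrative and not part of the actual interface):
(* sketch: inside a hypothetical implementation of Threads.Start *)
stackBeg := AllocateStackArea(stackLen); (* hypothetical allocation of the new thread stack *)
Kernel.AddStack(stackBeg, stackBeg + stackLen); (* the stack is now scanned by the garbage collector *)
(* ... create and dispatch the thread ... *)

(* sketch: inside a hypothetical implementation of Threads.Kill *)
Kernel.RemoveStack(stackBeg) (* the stack is no longer scanned; its memory may be reclaimed *)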
Threads Example
We show how to implement a producer/consumer system (see Figure 7.1). The producer generates random numbers and writes them into a buffer which can hold at most n numbers. The producer blocks if it finds a full buffer. The consumer reads the numbers from the buffer in first-in-first-out order. If no number is available, the consumer blocks. There are three points for synchronisation: The consumer may only read if there are full slots in the buffer; the producer may only write if there are empty slots in the buffer; the buffer should not be modified by both threads at the same time in order to avoid race conditions.
We implement a ring buffer with an array of n elements and protect it with a semaphore allowing only one access at any time. To avoid buffer overflow or underflow, we introduce two semaphores fullSlots and emptySlots with 0 and n as initial values, respectively.
First we initiate the whole scenario by creating the semaphores and starting the producer and consumer threads:
CONST
n = ...;
VAR
emptySlots, fullSlots, lock: Threads.Semaphore;
consumer, producer: Threads.Thread;
BEGIN
fullSlots.Init(0); (* no full slots *)
emptySlots.Init(n); (* all slots free *)
lock.Init(1);
NEW (producer); Threads.Start (producer, Produce, 2000);
NEW (consumer); Threads.Start (consumer, Consume, 2000)
The two threads (Produce and Consume) are started and produce/consume the numbers, while protecting the buffer with lock and waiting for fullSlots/emptySlots.
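The bodies of the two thread procedures are not shown above. The following sketch gives a possible implementation, assuming a module-level ring buffer buf with the indices in and out and a hypothetical random-number generator NextRandom:
VAR
  buf: ARRAY n OF INTEGER; (* ring buffer shared by both threads *)
  in, out: INTEGER; (* next write and read positions *)

PROCEDURE Produce; (* thread body of the producer *)
VAR x: INTEGER;
BEGIN
  LOOP
    x := NextRandom(); (* hypothetical random-number generator *)
    emptySlots.Down; (* wait for a free slot *)
    lock.Down; (* protect the buffer *)
    buf[in] := x; in := (in + 1) MOD n;
    lock.Up;
    fullSlots.Up (* announce a new full slot *)
  END
END Produce;

PROCEDURE Consume; (* thread body of the consumer *)
VAR x: INTEGER;
BEGIN
  LOOP
    fullSlots.Down; (* wait for a full slot *)
    lock.Down;
    x := buf[out]; out := (out + 1) MOD n;
    lock.Up;
    emptySlots.Up; (* announce a new empty slot *)
    (* ... use x, e.g. print it ... *)
  END
END Consume;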
7.2 Network
The module Network is an abstraction layer for network access. It does not implement a particular network protocol, but offers a general view on networks regardless of their actual topologies and protocols [Hof96a]. It builds a logical layer (hosts connected through channels) on top of the physical layer (sites connected through links). The logical layer offers the following abstractions:
- A Host is a logical computer reachable via Network. Any physical computer (called site) may contain several hosts. Two hosts can connect transparently without knowledge of the underlying protocol and the sites hosting them. As a special case both hosts may reside on the same site.
Note: The terms used in literature for logical and physical computers are inconsistent (site, host, node, location, ...). Therefore, we chose terms which we considered appropriate.
- A Protocol connects hosts for sending and receiving information. Protocols may be added at run-time to support different types of networks (Ethernet, ATM, ...) and network protocols (TCP, AppleTalk, ...) [CoDK94]. Network itself implements only a protocol to connect two local hosts (two hosts on the same site). In order to connect hosts across a host boundary one needs additional protocols, e.g. the protocol implemented in NetTCP connects hosts using a TCP connection.
- A Channel connects two hosts bi-directionally. It achieves this by automatically choosing the appropriate protocol, which handles the network dependent details of the actual connection. The hosts may reside on the same site or on different sites connected through physical links. Channels guarantee error-free data transmission. After an unrecoverable error, the channel state is set accordingly and all channel operations will return immediately (for details see sub-section about failure semantics). In order to re-establish a faulty channel for communication, it has to be re-connected to the other host.
We distinguish between the physical layer of sites connected through links and the abstract layer of hosts connected through channels. Network servers and clients need not know about sites and links, but only about abstract hosts and channels. Only protocol implementations, actually managing the network, know about physical sites and links.
Network offers functionality for four different kinds of clients:
- New network protocols. These protocols manage their connections, register newly reachable hosts and remove old hosts.
- Network servers offering services to other hosts.
- Network clients accessing services on other hosts.
- Hosts broadcasting to all hosts.
Overview
To actually transmit data over a network and to handle incoming data, a programmer has to extend the abstractions offered by module Network. Clients and servers view all hosts as a homogeneous set independent of the actual protocol needed to access them (see Figure 7.2).
Network is built on top of the Oberon system heavily using object-oriented features of Oberon-2. It is platform independent. However, it uses the module Threads which has to be available on the target platform.
In the following sub-sections we examine some details of Network. However, we limit the scope to the parts important for our distributed objects implementation. For further details see [Hof96a].
System Model
The terms used in the literature are not consistent. Therefore, we explicitly define the terms as we use them:
- Sites are the physical units of distribution. Different sites can fail independently.
- Hosts are the logical units of distribution. Different hosts do not share memory.
- Links are physical communication connections (site to site).
- Channels are logical connections (host to host). A channel connecting two hosts may contain one or several physical links. Channels can be opened and closed at any time and several channels may connect two hosts simultaneously.
- A Server is a passive object registered to a host. Servers handle requests received by their hosts. Each host may have several registered servers. Any request may be handled by at most one of the servers registered on a host.
- A Request is generated either by sending a broadcast or by setting up a new channel. Network chooses the appropriate protocol to pass the request to the destination host.
- Request types define the nature of a request. They are strings with up to nineteen characters and a trailing zero-byte.
- Protocols implement the actual communication over specific networks, e.g. a protocol for communication with TCP on the Internet or a protocol for communication with AppleTalk. They hide all network and protocol specific details.
- Protocol channels constitute each protocol's interface to its clients. Each client-server channel communicates over a corresponding protocol channel (see also Figure 7.5). Protocol channels are extensions of normal channels extended to their respective protocol's needs.
- Each site has a default host. This host is defined automatically when the module Network is loaded on a site. Default hosts are bound to their sites. Therefore, servers registered to the default host are specific to the site they are running on.
We view the system as a set of hosts, which are completely connected (every host is connected with every other host) through reliable FIFO communication channels. Security is not guaranteed by Network. We assume fail stop hosts, which are defined as follows:
- A host either behaves properly or crashes and does nothing, i.e. it does not send any messages after a crash.
- A host's failure can be detected by other hosts.
On each site, Network manages the set of all known hosts located on this site or on other sites. There are three different clients which may scan the host set: network clients, network servers and protocols. The hosts included in the host set are not predefined, but registered dynamically either by an application or by a protocol. Each newly registered host needs its associated protocol, i.e. the protocol that connects the host's site with the local site.
An application may register only local hosts in order to add servers to these hosts. Protocols, on the other hand, register those hosts that are reachable through them, i.e. hosts located on other sites. To achieve this, protocols pass a reference to themselves together with every host they register.
At each host, several servers may be registered (see Figure 7.3). Each of these servers may accept one or several request types and may choose to communicate via different extensions of Network.Channel. These channels may be of different types (Network.Channel and sub-classes).
When we compare the OSI (Open Systems Interconnection) model with our implementation, module Network corresponds to the transport layer (see Figure 7.4). It offers a uniform view to the session level above, regardless of the actual implementation of the lower four levels.
Communication over networks is inherently unreliable. Four kinds of errors may occur:
- Data may be corrupted through electrical disturbance, noise or collisions. This may be detected with checksums. Every protocol has to cope with this problem. It either implements its own error correction algorithm or reports the error to the application and invalidates the channels.
- Data may be lost. Again the protocol may handle this error itself or report it to the application.
- Data may arrive out-of-order. This is mainly a problem with wide area networks (WAN) where different packets of data may be transmitted over different routes. But it may even exist on local area networks if the receiver requests the retransmission of corrupted data. The protocol should mask out-of-order arrival by delaying received data until all preceding data is delivered to the application or report an error to the application.
- Data may be delayed too long resulting in a time-out at the receiver. This error may even occur if both the sender and the receiver work correctly (e.g. broken link or slow medium). Each protocol has to define an appropriate time-out after which it assumes the channel to be broken and sets the state accordingly.
Hosts communicate with each other through channels which should be extensible to satisfy different requirements of clients and protocols. Since both a client and a protocol may want to extend a channel in different ways, it is split into two parts (a client channel and a protocol channel as shown in Figure 7.5), which are connected in sequence using the Decorator pattern [GHJV95]. The client channel and the protocol channel can be extended individually to achieve a combination of both requirements.
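The following sketch illustrates this split (the type and method names are illustrative and do not reproduce the actual Network interface): the client channel keeps a reference to the protocol channel it is connected to and forwards the transport operations to it, so that both parts can be extended independently.
TYPE
  ProtocolChannel = POINTER TO ProtocolChannelDesc;
  ProtocolChannelDesc = RECORD
    (* protocol-specific connection state is added in extensions, e.g. a TCP handle *)
  END;

  ClientChannel = POINTER TO ClientChannelDesc;
  ClientChannelDesc = RECORD
    lower: ProtocolChannel (* the protocol channel this client channel is connected to *)
  END;

PROCEDURE (p: ProtocolChannel) Write (VAR data: ARRAY OF CHAR; len: LONGINT);
BEGIN
  (* actual network transmission, implemented by protocol extensions *)
END Write;

PROCEDURE (ch: ClientChannel) Write (VAR data: ARRAY OF CHAR; len: LONGINT);
BEGIN
  IF ch.lower # NIL THEN ch.lower.Write(data, len) END (* delegate to the protocol channel *)
END Write;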
For a more detailed description of system models, fail stop hosts or communication errors refer to [Jalote94, BirRen94].
Connecting two Hosts
In this section, we explain step by step how Network establishes a channel ch from a client on host 'A' to a server on host 'B' (see also Figure 7.6). With each step we show a few lines of source code that demonstrate some further implementation details. For the complete interface definition see [Hof96a].
1. Client allocates a new instance ch of type Network.Channel or of a self-defined extension and tries to open the connection by calling ConnectChannel (ch, 'A', 'B', type, err).
NEW (ch);
Network.ConnectChannel (ch, Network.GetHost ("A"), Network.GetHost ("B"), type, err)
2. Network delegates the request to the protocol that handles connections to host 'B' (called to).
PROCEDURE Network.ConnectChannel (ch: Network.Channel; from, to: Network.Host;
type: Network.RequestType);
to.protocol.Connect (from, to, type, ...)
3. Protocol transfers the request over the network to its counterpart.
4. Protocol on 'B' creates a protocol-specific channel pCh for communication with 'A' and calls Network.Request to look for an accepting server.
Network.Request (from, to, type, pCh, error)
5. Network.Request tries to find a server handling the received request type. It asks the registered servers to create a server specific channel ch which will handle the request. It connects this channel ch with the protocol-specific channel pCh (using the Decorator pattern [GHJV95]). Finally the server is started so that it may handle communication over ch.
FOR all servers s of host to DO
ch := s.NewChannel (from, to, type);
IF ch # NIL THEN
connect ch and pCh;
s.Run (ch);
RETURN
END
END
6. Protocol transfers the results of Network.Request back over the network to host 'A'.
7. Protocol on 'A' creates a protocol-specific channel pCh, fills it with the information received from 'B' and returns pCh to Network.
8. Network associates the client-specific channel ch with the protocol-specific channel pCh and returns control back to the client.
Executing these steps, Network connects the client with the server using the correct protocol. Further communication bypasses module Network; it is handled completely by the chosen protocol and the channel instances allocated by the client and the server, respectively. Note that all site-dependent handling is done exclusively within the chosen protocol.
The scheme described above allows subclassing of Network.Channel in two dimensions. First it allows the protocol to use a channel instance of its desired subclass and second it offers the client application the same choice. Both can use this facility to store additional data within their channel objects.
Failure Semantics
Distributed systems introduce many advantages such as information and resource sharing, parallelism or scalability. However, they also create significant new problems, e.g. failure proneness. At first sight, distributed systems seem to be more secure and more stable, because they are distributed over several physical sites. As sites have independent failure modes, a failing site allows the remaining sites to keep running. This is correct but has significant consequences for system designers and programmers. A system running on just one site does not have to cope with site failures (power failure, head crash, ...). Distributed systems, however, have to deal with failing sites in order to allow the remaining sites to go on with their work. One or several failing sites should not halt the whole system. We pay for the advantage of independent failures with a substantial increase in complexity.
There are many different approaches to the above problem. Sometimes a site should be informed of another site's failure as quickly as possible (e.g. failure in a backup system for a flight control system). Sometimes this may be delayed until the failed site is actually used (e.g. failure in a backup system for a data server). The desired method for failure detection depends on the actual task at hand.
We distinguish two types of failures: failures within a single connection and failures involving several connections depending on each other. For both types, failure semantics define how to react to abnormal conditions. This is relatively easy for the first type, as there is only one entity (the connection) which may fail; if it fails, there is no way to continue the operation. As an example: How does a program react to a broken ftp connection? The program can terminate with an error, can raise an exception or can try to re-establish the state as it was prior to the failure.
The second type of failures is far more difficult to tackle. Several hosts on different sites communicate with each other. Failing sites and/or connections should not hinder the on-going calculation, but let it continue as smoothly as possible. The failing entities should be replaced transparently in order to allow a successful completion of the distributed calculation. If error correction is not possible (too many failures, no backups, network partitioning, ...), the calculation should not be aborted, but should be brought to a graceful end, which allows the system to later resume the calculation under - hopefully - better circumstances. This section does not deal with such failures, as they are not part of the low-level communication mechanisms, but are highly application dependent. Only the application has all the knowledge needed to choose the appropriate solution.
Failure semantics define how failures are reported. What happens locally if the remote site fails? What happens if the received data is corrupt? What happens when the network link is lost? From the viewpoint of an application, it would be desirable to offer the abstraction of unfailing connections. Unfortunately, this is not possible. Certain failures cannot be corrected automatically; only the application knows the appropriate steps to be taken (ultimately by calling upon the user). The 'End-to-End argument' says not to bother doing this at a low level, since error checking must be done by all applications anyway, and the application probably knows better what to do. On the other hand, it makes no sense to report failures to the application which could easily be handled within the low-level network system, as every application would have to replicate the code necessary to handle them.
We have to steer a balanced course. Some failures can and should be corrected automatically. Others have to be handled by the application. The borderline chosen is arbitrary and differs largely between systems. Every application has to know this borderline, which is defined by the failure semantics.
Network services are accessed through a programming interface (here the module Network). It offers functionality for opening and maintaining connections, writing and reading data, as well as ways to perceive and handle failure conditions, i.e. the interface is influenced by the failure semantics and vice versa. There are three different ways to notify the application about an abnormal condition:
- Explicit parameters indicate failure conditions. Every operation which may possibly lead to an error returns an error indication (visible in the interface). Possible additional error information (e.g. the number of bytes actually read) may be passed in additional parameters. Using this technique, the resulting source code tends to be cumbersome to read and to maintain: after every operation the error parameters have to be checked and corrective measures may have to be taken. This may be so troublesome that programmers start to ignore possible failures, which leads to erroneous programs.
- Applications ignore failures when they communicate over a channel. If an error occurs, an exception is raised by the failing network operation. The exception gets forwarded to the appropriate exception handler. This effectively centralises the handling of failures. With this scheme, there is no way to accidentally ignore a failure. On the other hand, exceptions have drawbacks as well:
- Not all systems and programming languages support exceptions.
- Exceptions contradict the idea of structured programming: when the exception handler is entered, control flow may come from many different instructions ('come-from' semantics).
- Special operations return state information. This approach is similar to the first one which introduced explicit parameters. After every operation (here with no additional parameters) the state has to be requested and tested for possible failures. Programmers may omit this, which may lead to erroneous programs. However, together with suitable failure semantics, this approach yields a possible new solution.
We chose the third possibility for our module Network. Explicit parameters lead to inflated and difficult-to-read source code. Oberon-2 does not include an exception mechanism (although a library-based exception handling mechanism has been added recently [HoMP97]). Additionally, the flexible manner in which the third alternative handles errors appealed to us.
We associate a state with each channel. The channel state is set to ok after the channel has been established successfully. The channel state may be obtained by calling the operation channel.Res. Furthermore, we define that all operations on a channel that is in an illegal state (not ok) return immediately and preserve the channel state (null operation). This allows the application to use a channel without checking its state after every single operation. The application checks the state only at so-called checkpoints. The application may adapt its error checking scheme (fine or coarse grained) to its own needs, using all its knowledge of the ongoing computation.
Our approach has two drawbacks. First, the application has to ensure that the channel state is ok before previously received data is actually used. Second, a failure is not handled immediately but handling is delayed until the next checkpoint. This may decrease performance in failure scenarios. However, with today's highly reliable networks, the emphasis should be put on the optimisation of error-free scenarios. The following listing shows an example for delayed error checking while reading from a channel ch:
VAR
count, age: INTEGER;
name1, name2: ARRAY 32 OF CHAR;
ch.Read (count, 2); (* read number of entries *)
ch.Read (name1, 32); ch.Read (name2, 32); ch.Read (age, 2);
WHILE (count > 0) & (ch.Res() = Network.ok) DO (* checkpoint *)
... print name1, name2, age
DEC (count);
ch.Read (name1, 32); ch.Read (name2, 32); ch.Read (age, 2)
END
Example: Implementing a Time Server
We will now show how a client can request the current time from a time server on a host called "Time Server". By accessing the host via a name, we can easily move the server to a different site without invalidating its clients. First, we have to register the host "Time Server" on one of our sites. Then we have to install a server object in this host. This is shown in the following code:
NEW (host);
Network.RegisterHost (host, NIL, "Time Server", err);
IF err = Network.ok THEN
NEW (server);
host.AddServer (server)
END
Now we have to think about the implementation of the time server. It has to override the method NewChannel, which checks for the type of requests it accepts. In our case, the server accepts requests of type "TIME". For communication, our server does not need any special channel extension, so it uses a channel of type Network.Channel.
PROCEDURE (s: Server) NewChannel (from, to: Host; type: RequestType): Network.Channel;
VAR ch: Network.Channel;
BEGIN
IF type = "TIME" THEN NEW (ch) ELSE ch := NIL END;
RETURN ch
END NewChannel;
After a request has been accepted, Network calls the method Run of the accepting server. Typically, Run dispatches a thread that handles the communication (a sketch of this follows the listing below). Our time server omits this, as it just sends four bytes.
PROCEDURE (s: Server) Run (ch: Network.Channel);
VAR time: LONGINT;
BEGIN
time := Input.Time ();
ch.Write (time, 4);
ch.Close
END Run;
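For a longer-running service, Run would typically hand the channel over to a worker thread, roughly as sketched below. The hand-over via a module variable is a simplification of this sketch; a real server would keep one channel per thread.
VAR pending: Network.Channel; (* channel handed over to the worker thread; simplification *)

PROCEDURE Worker; (* thread body handling one connection *)
VAR ch: Network.Channel;
BEGIN
  ch := pending;
  WHILE ch.Connected() DO
    (* ... read a request from ch, compute and write the reply ... *)
  END;
  ch.Close
END Worker;

PROCEDURE (s: Server) Run (ch: Network.Channel);
VAR t: Threads.Thread;
BEGIN
  pending := ch;
  NEW (t); Threads.Start (t, Worker, 4000) (* handle the communication concurrently *)
END Run;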
We have now completed the implementation of our time server. A possible client may now open a channel to the server and retrieve the current time. The client needs to know the name of the host running the server ("Time Server"), as well as the accepted request type ("TIME").
PROCEDURE Client;
VAR res: INTEGER; ch: Network.Channel; time: LONGINT; timeHost: Network.Host;
BEGIN
timeHost := Network.ThisHost ("Time Server");
IF timeHost # NIL THEN
NEW (ch);
Network.ConnectChannel (ch, Network.DefaultHost (), timeHost, "TIME", res);
IF res = Network.ok THEN
ch.Read (time, 4);
ch.Close;
... use time
END
END
END Client;
We emphasise that both server and client work with a channel instance allocated by themselves. Both may extend the base channel type Network.Channel in order to adapt it to their needs.
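For instance, a time client could use its own channel extension to keep per-connection state. The following sketch assumes that the base type follows the usual Desc naming convention (Network.ChannelDesc):
TYPE
  TimeChannel = POINTER TO TimeChannelDesc;
  TimeChannelDesc = RECORD (Network.ChannelDesc) (* assumed record name of the base type *)
    lastTime: LONGINT; (* most recently received time *)
    requests: LONGINT (* number of requests issued over this channel *)
  END;
An instance of TimeChannel is passed to ConnectChannel exactly as before; Network and the protocol only ever see it as a Network.Channel.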
Adding a Protocol
The bare module Network itself offers only access to hosts residing on the same site, i.e. it implements only a local protocol. To actually transport data over a network, one has to implement a suitable protocol (for details see [Hof96a]). Clients of Network will not see any differences. NetTCP is currently the only additional protocol. It offers connections on networks supporting TCP. All channels between two hosts are multiplexed onto one TCP connection. This connection is established in the beginning and reused for all later network requests. All measurements in Section 7.5 were done using the NetTCP protocol.
NetTCP also broadcasts information about reachable hosts. If a site starts the protocol NetTCP, the protocol collects all local hosts and broadcasts this information to all known sites. As a reply, it receives the local hosts of all these sites. These hosts are then registered as reachable through the protocol NetTCP. Hosts added later are handled similarly.
7.3 Distributed Objects System
In this section, we describe the interface and the usage of our distributed objects package. Its basic functionality is quite similar to remote method invocation systems such as Java RMI [RMI]. It builds heavily on other modules, which offer thread support, network access and composable message semantics. This considerably reduces the necessary implementation effort. The actual tasks are reduced to the following:
- Naming Service. To access a remote object, one needs a way to identify it. Every remote object has a unique identity that consists of its host and its name. The naming service is responsible for delivering an object given its name.
- Network Management. Remote objects reside on different hosts. The network management is responsible for opening/closing the necessary connections and for defining the protocol for the transmissions over these connections.
- Suitable Invocation Semantics. Our distributed objects use the composable message semantics framework described in Chapters 4 and 5. Our distributed objects system implements three basic invocation abstractions: LocalInvocation, SyncInvocation and ASyncInvocation.
- Shallow Copied Parameters. To shallow copy parameters, one has to supply the marshalling mechanism with a bi-directional relation from an object to its name. DObjects implements such a relation using the unique names of the distributed objects.
- Distributed Garbage Collection. The Oberon system offers local garbage collection. We have to extend it to a distributed garbage collector that takes remote references into account.
All this functionality is implemented in the module DObjects.
DEFINITION DObjects;
IMPORT SYS:=SYSTEM, Invocations, Network, Linearizers;
CONST
copyable = 1;
migrateable = 2;
ok = 0;
errDouble = 301;
errUnknown = 302;
errAccess = 304;
TYPE
Name = ARRAY 32 OF CHAR;
PROCEDURE ASyncInvocation (): Invocations.Invocation;
PROCEDURE SyncInvocation (): Invocations.Invocation;
PROCEDURE LocalInvocation (): Invocations.Invocation;
PROCEDURE Export (obj: SYS.PTR; h: Network.Host; n: Name; c: Invocations.Class;
flags: SET; VAR res: INTEGER);
PROCEDURE Hide (host: Network.Host; name: Name);
PROCEDURE Import (host: Network.Host; name: Name; VAR obj: SYS.PTR; VAR res: INTEGER);
PROCEDURE Copy (host: Network.Host; name: Name; VAR obj: SYS.PTR; VAR res: INTEGER);
PROCEDURE Original (host: Network.Host; name: Name;
VAR obj: SYS.PTR; VAR res: INTEGER);
PROCEDURE SetMarshaller (modName, typeName: ARRAY OF CHAR;
marshaller: Linearizers.Marshaller);
PROCEDURE State (obj: SYS.PTR): INTEGER;
END DObjects.
Interface Description
Invocation Abstractions
- SyncInvocation returns an instance of the default distributed client-side invocation abstraction. It implements synchronous remote method invocation. It implements best-effort semantic and guarantees no other properties, such as at-most-once.
- ASyncInvocation returns a new instance of the asynchronous invocation abstraction. A remote invocation using this abstraction returns immediately, i.e. the return value and output values of reference parameters have no defined values. More complex features for asynchronous feedback, e.g. futures [Hals85], need additional support which is not defined in this simple abstraction.
- LocalInvocation returns the default server-side invocation abstraction. It is an alias of the default server-side abstraction Objects.DirectInvocation provided by the semantics framework (see Figure 7.8).
Object Operations
- Export (obj, host, name, class, flags, res) exports the local object obj on the local host host under the name name. In order to access obj, other hosts need to know the exporting host and the object's name (to achieve location transparency one needs another name service that can be built on top of this one). Export automatically generates the necessary server-side skeleton snippets by using the supplied meta-class invocation information in class. res contains the appropriate error code if, e.g. the necessary code cannot be generated. The parameter flags defines some properties of the exported object. Currently, one can specify whether the object can be copied or migrated. By default, one can neither copy nor migrate an object. If no meta-class information is supplied (class=NIL), Export automatically assumes the default invocation semantic, i.e. synchronous remote invocation with deep copied parameters.
- Hide (host, name) hides the previously exported object name. Exported objects are thought to be permanent, i.e. to be kept alive, even when they are not used locally anymore. Only after calling Hide are they no longer accessible from remote hosts and may be collected by the garbage collector.
- Import (host, name, obj, res) imports an object with name name from host and returns a reference to the generated stub object in obj. The necessary stub snippets are generated automatically from the class definition and the semantics defined by the corresponding call to Export. Possible errors are returned in res. Our current prototype does not allow additional client-side filters, i.e. the object is imported with exactly the same semantics that were defined when the object was exported.
- Copy (host, name, obj, res) creates a clone of the specified remote object and returns a reference of the clone in obj. The clone and the original object have no connection whatsoever after the clone has been created successfully. res again shows potential error codes. Later invocations on the clone are handled as local method invocations. There are no additional semantics. Copy only succeeds when it is allowed to copy/clone the desired object. This must have been specified by including the flag copyable when the object was exported.
- Original (host, name, obj, res) fetches the specified object and transfers/migrates it to the local host. If the migration is successful, obj references the object. Otherwise, res contains the appropriate error code. The original object on host is deleted, i.e. it is hidden. Clients currently using the object on host continue to work correctly. However, as soon as all these clients have no more valid remote references, the garbage collector removes the object name on host. Original only succeeds when it is allowed to migrate the desired object. This has to be specified when it is exported by setting the flag migrateable.
Miscellaneous
- SetMarshaller (modName, typeName, marshaller) defines the procedure marshaller to be used whenever an object of the type modName.typeName is marshalled. DObjects uses the marshalling mechanism described in Section 4.2, which requires a marshalling procedure for every marshalled type. It is the duty of the programmer to call SetMarshaller for every type before he/she tries to transfer a corresponding instance (a usage sketch follows this list).
- state := State (obj) returns the state of a stub object. In contrast to local method invocations, it is possible for remote invocations to fail. Therefore, there must be a way to inform the programmer of errors. A better solution would be to use exceptions as, e.g. in Java [GoJS96] (see also Section 7.2 on failure semantics). Unfortunately, Oberon does not support exception handling. There is a prototype implementation that adds exception handling to the Oberon system [HoMP97], but it is only available on the PowerMac platform. Therefore, we refrained from using exceptions and used another approach.
It is the duty of the programmer to actively call State repeatedly to check whether an error occurred or not. To ease this and to reduce the number of necessary calls, we define the erroneous state of an object to be stable, i.e. once an error occurred, the state of the object remains stable. Later invocations do not change the state any more. The run-time system recognises invocations on objects with an illegal state and immediately returns control to the caller. To ensure correct program behaviour, it is the duty of the programmer to always call State before using data returned from a remote invocation.
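As an illustration of both operations, the following sketch registers a marshaller for a hypothetical type People.Person and checks the stub state at a checkpoint after a remote invocation (MarshalPerson and the method Insert are illustrative; MarshalPerson must match the Linearizers.Marshaller signature, which is not reproduced here):
(* register the marshaller once, before any Person instance is transferred *)
DObjects.SetMarshaller("People", "Person", MarshalPerson);

(* remote invocations followed by a checkpoint *)
obj.Insert(person); (* hypothetical remote method taking a Person parameter *)
IF DObjects.State(obj) # DObjects.ok THEN
  (* the invocation failed: do not use returned data; report the error or re-import the object *)
END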
Naming Service
In order to specify and distinguish objects, they need unique identifiers. The identifiers of our distributed objects consist of two parts. The name of the host where they reside and a name given to them when they are exported. The combination of these two parts has to be unique in order to avoid ambiguities. DObjects assumes that all hosts have different names. This is ensured by the network support Network, which also copes with the details as to how the host is actually contacted. The uniqueness of the object names within one host is ensured by Export.
The implementation of the naming service is quite simple, e.g. there are no name service caches. Whenever a client accesses an object by giving its name, a connection to the appropriate host is opened and a request for the desired object is sent over this connection.
Network Management
Another task of the module DObjects is the management of the network: opening/closing connections, sending requests, protocol definition and receiving requests.
Every host that exports at least one object starts a server process that handles requests on the exported objects. This server is registered with the appropriate host. The server is a subclass of Network.ServerDesc that implements the methods NewChannel and Run and adds some data fields for the bookkeeping of the objects exported on this server. NewChannel checks if the incoming requests have a correct type (see Figure 7.7).
If the server receives a correct request type, it creates a new communication channel and starts a new thread that handles the request. The thread's lifetime is short for all request types except call. The request is handled, the answer is sent back to the requesting host and the thread is terminated. However, a call request is handled differently. The started thread keeps running in a loop. All further incoming requests on this channel are supposed to also be of type call, i.e. no new thread has to be created in order to handle them and the resolution of object name to object reference obj is not executed anymore.
REPEAT
ch.ReadNumber(methID);
IF methID = -1 THEN ch.Close; RETURN END; (* id=-1 => stub is freed on client *)
ch.ReadStream(s);
s := Objects.Invoke(obj, methID, s); (* search for correct semantic done by Objects *)
ch.WriteStream(s);
ch.Flush
UNTIL ~ch.Connected()
As can be seen in the above listing, DObjects also extends the type Network.Channel by adding new methods for reading/writing streams, strings and numbers.
Predefined Invocation Semantics
DObjects implements three invocation abstractions (see Figure 7.8). These abstractions offer the basic support for remote method invocations. In order to get well-known semantics such as at-most-once, one has to decorate them with appropriate filters:
- LocalInvocation is an alias for the standard server-side abstraction Objects.DirectInvocation that directly invokes the skeleton.
- SyncInvocation implements synchronous remote method invocation. If no connection to the host of the original object exists, it opens a corresponding network connection. If the connection exists, it transmits the method identifier and the streamed parameters. After receiving the result stream, it returns control to the stub, i.e. the caller. It implements no error correction. If the invocation fails, it sets the state of the stub accordingly and returns an empty stream.
- ASyncInvocation implements asynchronous remote method invocation. It uses the same technique as SyncInvocation, but does not wait for the resulting stream. As soon as the method identifier and parameter stream have been sent, it returns control to the caller. Potential output parameters and the function return value have no defined values. The programmer has the duty to ensure correct behaviour of the program, i.e. she/he should not use the values stored in the output parameters and the function return value.
As mentioned earlier, our semantics framework offers invocation transparency. However, up to now we ignored other kinds of transparency. Location transparency depends directly on the naming service used while importing and exporting an object and therefore is not directly related to our invocation semantics. New filters and abstractions may directly supply failure, replication or security transparency. They are all well-suited candidates for extensions of our composable message semantics framework. Other transparencies, e.g. migration, relocation and persistence, cannot be tackled on a per-invocation basis. However, transaction transparency may well be achieved using our composable message semantics framework.
Shallow Copy of Parameters
By default, parameters are copied in deep-copy mode, i.e. a clone is generated on the callee host. However, this behaviour is not always desirable. First, one may not want to allow clones of an object, and second, it is sometimes useful not to copy a local object over the network, but to just tell the callee where the object can be found. In order to achieve this, DObjects registers itself with the marshalling mechanism by implementing the three required hooks (see Section 5.1); a sketch of their use follows the list:
- IsStub(obj) returns TRUE if the passed object is a stub object. The marshaller uses this procedure to determine whether it may copy obj into a stream or whether it has to use shallow copy. Whenever IsStub returns TRUE, the marshaller uses shallow copy.
- GetID(obj, id, res) returns the unique identifier id for the object obj. DObjects uses identifiers consisting of the object's host and name. Depending on the kind of obj different identifiers are returned:
- If obj is a stub object, its original identifier is returned, which was used for the import.
- If obj is already exported, GetID returns the identifier, which was used for the export.
- If obj is a local object that is not visible to the network, GetID automatically generates a unique name and exports obj on the default host under this name. The resulting identifier is returned to the caller. This feature introduces the notion of automatically exported objects. They are handled similarly to manually exported objects, but have a different life cycle. Objects exported by calling Export live until Hide is called and there are no more active clients locally and remotely. Automatically exported objects live only until they have no more active clients, i.e. they are automatically hidden (see Section 8.5 for details).
- GetObject(id, obj, res) returns the stub object belonging to id. It reuses a stub if it already exists locally. If the object specified by id is a local object, the object itself is returned. Otherwise, GetObject imports the desired object by calling Import.
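The following sketch indicates how the marshaller might use these hooks when it encounters an object reference p (the hook variables isStub and getID as well as the stream operations WriteID and WriteFields are illustrative and not part of the actual Linearizers interface):
IF (p # NIL) & isStub(p) THEN (* stub objects must never be deep copied *)
  getID(p, id, res); (* obtain host and name; exports p automatically if necessary *)
  WriteID(stream, id) (* shallow copy: only the identifier is transferred *)
ELSE
  WriteFields(stream, p) (* default: deep copy the object into the stream *)
END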
Distributed Garbage Collector
Our distributed garbage collector builds on the local garbage collector used by the standard Oberon system. This collector uses the Mark and Sweep technique. It includes some hooks, with which one is able to change some of its behaviour. Using this support, we were able to implement the distributed garbage collector without modifying the local collector. The garbage collection runs completely transparently in the background, i.e. it is not visible to the application programmer (for implementation details see Section 8.6).
7.4 Putting Composable Message Semantics to Work
The aim of this section is to show the usefulness of different invocation semantics using our distributed objects implementation. We do this using a simple bank account example.
Overview
We assume that bank accounts are accessible to valid users. In a first step, the set of valid users includes the account holder as well as the bank's own branch offices. We will discuss a simple evolution later in this section. An account offers three actions (methods) to its users: (1) add to deposit money; (2) sub to withdraw money; and (3) get to query the current balance.
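The account class itself is not shown here; the following minimal sketch assumes a single balance field and LONGINT amounts:
TYPE
  Account* = POINTER TO AccountDesc;
  AccountDesc* = RECORD
    balance: LONGINT (* current balance; illustrative *)
  END;

PROCEDURE (a: Account) add* (amount: LONGINT);
BEGIN
  a.balance := a.balance + amount
END add;

PROCEDURE (a: Account) sub* (amount: LONGINT);
BEGIN
  a.balance := a.balance - amount
END sub;

PROCEDURE (a: Account) get* (): LONGINT;
BEGIN
  RETURN a.balance
END get;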
In order to ensure correct handling of transactions, we add server-side synchronisation to the methods add and sub (see Figure 7.9). As they may not be executed in parallel, we use the same synchronisation semantic instance for both methods. Our two clients both have the same server-side semantic. However, the account holder decorates the get method with a logging filter in order to be able to determine the amount of money he had at a given point in time.
The bank that exports the account has to set the semantics accordingly. We assume that, apart from the above mentioned semantics, a simple default behaviour is used (synchronous remote execution).
VAR
c: Invocations.ClassInfo; (* holds account's meta information *)
si: Invocations.Invocation; (* our special server-side semantic *)
m: Invocations.Method; (* method descriptor *)
account: Account; (* account to be exported *)
BEGIN
...
si := SynchronizedInvocation(Invocations.DirectInvocation()); (* create server-side semantic *)
c := Invocations.GetClass(account); (* get default meta information *)
m := c.GetMethod("add");
m.SetCalleeInvocation(si); (* set semantic for method add *)
m := c.GetMethod("sub");
m.SetCalleeInvocation(si); (* set semantic for method sub *)
DObjects.Export(account, host, "HolderName", c, res); (* export with chosen semantics *)
The clients import the desired account object. The branch offices (bank) import the account directly without further decoration changes.
DObjects.Import(host, "HolderName", NIL, account, res); (* import in branch office *)
... account.get ... account.add ...
Decorating the Semantics
The account holder decorates the get method with a logging semantic in order to later review the changes. Therefore, he/she adds a logging filter to the client-side semantic. Clients are only able to add further filters to the semantics that were defined during the export operation. We attach the logging filter to the semantic of the method get. This filter is automatically combined with the actual semantic received from the server; this is done by our run-time system during the import operation.
VAR
c: Invocations.ClassInfo;
ci: Invocations.Invocation;
m: Invocations.Method;
dummy, account: Account;
BEGIN
...
ci := LoggingInvocation(NIL); (* additional client-side semantic *)
c := Invocations.GetClass(dummy); (* get meta information *)
m := c.GetMethod("get");
m.SetCallerInvocation(ci); (* add logging filter to method get *)
DObjects.Import(host, "HolderName", c, account, res) (* import with additional filters *)
... account.get ... account.add ...
Varying the Views
Account holder and branch offices use the same server-side semantic, i.e. the same "view" of the account. However, this is not mandatory. Assume that we want to add a third kind of client: the tax office. Of course, such clients should not be allowed to directly withdraw money. Therefore, we deny them access to the methods add and sub, i.e. we introduce another set of server-side semantics (see Figure 7.10).
This gives us a second view on the same account. Whenever someone uses the name "HolderNameTax" to import the account, she/he will only be able to invoke get. To achieve this, we need to export the account object a second time with different meta information. The existing code does not have to be changed.
VAR
c: Invocations.ClassInfo; (* holds meta information for the account *)
si: Invocations.Invocation; (* our special server-side invocation semantic *)
m: Invocations.Method; (* method descriptor *)
account: Account; (* account to be exported *)
BEGIN
...
si := DenyInvocation(Invocations.DirectInvocation()); (* create server-side semantic *)
c := Invocations.GetClass(account); (* get default meta information *)
m := c.GetMethod("add");
m.SetCalleeInvocation(si); (* set semantic for method add *)
m := c.GetMethod("sub");
m.SetCalleeInvocation(si); (* set semantic for method sub *)
DObjects.Export(account, host, "HolderNameTax", c, res); (* export with chosen semantics *)
It is important to note that the code needed above to generate a new view can be parameterised. Hence, one can imagine a library that adds new views to existing accounts without having to change any code.
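A possible shape for such a library procedure is sketched below. The type MethodName and the procedure itself are illustrative; the individual calls follow the listings above.
TYPE
  MethodName = ARRAY 32 OF CHAR;

(* exports an additional, restricted view of an existing account:
   every method listed in denied is blocked, all other methods keep their default semantics *)
PROCEDURE ExportView (account: Account; host: Network.Host; viewName: DObjects.Name;
    VAR denied: ARRAY OF MethodName; VAR res: INTEGER);
VAR
  c: Invocations.ClassInfo;
  si: Invocations.Invocation;
  m: Invocations.Method;
  i: INTEGER;
BEGIN
  si := DenyInvocation(Invocations.DirectInvocation()); (* server-side semantic denying the call *)
  c := Invocations.GetClass(account); (* default meta information *)
  i := 0;
  WHILE i < LEN(denied) DO
    m := c.GetMethod(denied[i]);
    m.SetCalleeInvocation(si); (* block this method in the new view *)
    INC(i)
  END;
  DObjects.Export(account, host, viewName, c, res) (* export under the view's name *)
END ExportView;
A call such as ExportView(account, host, "HolderNameTax", deniedMethods, res) then produces the tax-office view described above.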
7.5 Performance Measures
In this section we give some performance measurements that we took with our Oberon prototype. Although it has not been optimised for performance, our prototype enables us to draw interesting conclusions about the cost of our flexible approach compared to an ad hoc approach which always uses synchronous remote invocation.
Test Environment
For our test environment, we used a Pentium 200 MHz computer running Oberon on Windows NT Version 4.0 (Build 1381: Service Pack 3). To measure time, we made use of a special hardware register of the Intel architecture which counts the number of cpu cycles since the most recent reset. This allows for extremely accurate timing. To cope with differences introduced by the garbage collector, we ran all measurements three times in sequence with 100, 1,000 and 10,000 invocations. We also repeated all measurements ten times, omitting the fastest and the slowest results. To make our measurements independent of the installed network and its current load, we used the local TCP loop-back interface (127.0.0.1); in other words, client and server were running on the same machine.
Our marshalling mechanism is neither time nor memory optimised, i.e. it allocates a considerable amount of memory even if no data is marshalled (no parameters). This results in several runs of the garbage collector during the measurement. Unless noted otherwise, we included the time spent collecting garbage in the measured intervals.
Our measurements use different scenarios to compare our flexible approach with an ad hoc approach. This ad hoc approach is a modified version of our flexible approach. It is modified in two ways. First, all parts allowing for arbitrary semantics have been removed, i.e. the ad hoc version always uses synchronous remote invocation. Second, we optimised the ad hoc version wherever a simpler solution was possible by the introduction of a fixed invocation semantic. The main differences between the flexible and the ad hoc approach can be seen in Figure 7.11.
As one can see, on both the server and the client side, additional work is done in order to support arbitrary message semantics. The most expensive operation is the server-side look-up for the correct server-side semantic; this look-up is included in our measurements. However, this look-up could be avoided by supplying the invocation with further information (as discussed in Section 5.2). Another approach to decrease the time spent for the look-up would be to use a more efficient data structure, e.g. a hash table.
TYPE
Object* = POINTER TO ObjectDesc;
ObjectDesc* = RECORD
ctr: LONGINT
END;
PROCEDURE (o: Object) Next*;
BEGIN
END Next;
We used the above class definition as our test application. This definition declares a class Object with a method Next that has an empty body. Depending on the test, we add parameters and/or a return value to the interface of Next. The server simply allocates an instance o of this class and exports it on the default host under the name "TestObject".
NEW(o); o.ctr := 0;
DObjects.Export (o, Network.DefaultHost(), "TestObject", NIL, err);
In the standard test case we use no invocation meta information (NIL), i.e. we use synchronous remote invocation. For special tests, we used customised meta information as necessary.
DObjects.Import (Network.ThisHost("TestServer"), "TestObject", o, err);
o.Next;
(* start measurement *)
FOR i := 1 TO n DO
o.Next
END;
(* stop measurement *)
The client mainly consists of a loop that repeatedly calls the method Next. The loop repeats n times depending on the current test. Our implementation needs some additional time to build data structures whenever an object receives its first invocation. Therefore we call Next once before we start the actual measurement.
Measurements
On our test configuration, it takes about 990 microseconds to echo a TCP packet containing one byte of user data from user space to user space. This includes the overhead introduced by our generic network interface as well as the additional TCP layer introduced by the Oberon environment.
To determine how the invocation duration depends on the amount of passed parameters, we measured with three different parameter sizes. We performed the measurements with the following interfaces:
PROCEDURE (o: Object) Next* ;
PROCEDURE (o: Object) Next (i: LONGINT) : LONGINT;
PROCEDURE (o: Object) Next (VAR buf: ARRAY 1024 OF CHAR);
In every case, the same amount of data is transferred back and forth over the network (0, 4 or 1024 bytes), i.e. the size of the output values is equal to the size of the input values. The results shown below are in milliseconds per invocation (see Figure 7.12).
Next we tried to determine the additional overhead introduced by other invocation semantics. To determine this, we decorated the invocation of the method Next with a server-side and/or a client-side filter (see Figure 7.13). Both filters have an empty body. We took the first version of Next without parameters. These measurements were, of course, only done on the version with arbitrary invocation semantics.
Finally, we split up the invocation time to determine where exactly the time is spent (see Figure 7.14). These measurements were only done on the version with arbitrary invocation semantics. We split the time into five parts (transport, server-side semantic, client-side semantic, stub and skeleton) and compared these as percentages.
Discussion
These figures (see Figures 7.12, 7.13, 7.14) show the extremely small overhead introduced by our arbitrary message semantics. The parameter size does not influence the speed penalty of using arbitrary semantics (see Figure 7.12). An invocation with no parameters takes about two milliseconds, regardless of whether fixed or arbitrary semantics are used. Introducing parameters does not change this behaviour; only the overall performance degrades depending on the actual parameter size. An interesting behaviour can be seen when the parameter size is relatively large (1024): the performance is worse if we repeat the invocation more often (see Figure 7.13). This is probably a consequence of the garbage collector, which has to run more often. However, it is difficult to draw conclusions, as the timing behaviour is heavily influenced by the packet mechanism of the underlying TCP protocol.
The second measurement was made to show the performance loss by introducing invocation filters (see Figure 7.13). It shows that it is quite cheap to add another invocation filter. This was to be expected, as an empty filter introduces nothing but an additional method invocation. The measured differences are actually almost too small to get meaningful results.
Finally, we split the invocations into five parts: transport, server-side semantic, client-side semantic, stub code and skeleton code. We compared the different parts in terms of the percentage of the whole invocation time spent on executing them. The results show that the server and client semantics never use more than half a percent of the time spent for the invocation (see Figure 7.14). This percentage is extremely small. The main part of the invocation time is spent in transit from the caller to the callee and vice versa.
Optimisation efforts will find many things to improve. Network transport is definitely a candidate for optimisations. The marshalling process can also be further optimised. But, no matter how well one optimises these parts, the mechanism of arbitrary message semantics is never the real bottleneck for remote method invocations.