
This tutorial is under heavy construction!

Concurrent Programming with Actors

Concurrency is an integral part of the AmbientTalk programming language. Rather than relying on threads and locks to generate and manage concurrency, AmbientTalk embraces actors as a much more object-oriented approach to concurrency. Before diving into the details of concurrency in AmbientTalk, we briefly put the main differences between the actor model and the thread-based model into context.

Threads vs Actors

In traditional programming languages, the control flow of a concurrent program is divided over a number of threads. Each thread operates concurrently and control can switch from one thread to another non-deterministically. If two threads have access to the same data (objects), this non-determinism can cause erroneous behaviour (so-called race conditions). Therefore, thread-based programming languages introduce locks (in the form of monitors, semaphores, …) which enable the construction of so-called critical sections: pieces of program code that only one thread can execute at a time.
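To make the notion of a critical section concrete, here is a minimal sketch in Python (used purely for illustration; it is not AmbientTalk code). The names Counter and run_workers are hypothetical. The lock turns the read-modify-write sequence in increment into a critical section, so the final count is the same on every run.

```python
import threading


class Counter:
    """Shared data guarded by a lock (hypothetical example class)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Critical section: only one thread at a time may execute this
        # read-modify-write sequence.
        with self._lock:
            current = self.value
            self.value = current + 1


def run_workers(counter, n_threads=4, n_increments=10_000):
    """Run n_threads threads that each call increment() n_increments times."""

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers(Counter()))  # 40000
```

Without the lock, two threads could read the same value of the counter and both write back the same incremented result, losing an update.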

The advantages of the thread-based model are that the model itself is easy to understand, that it can be implemented efficiently, and that it can be used to create very fine-grained synchronization (e.g. multiple readers/one writer). The disadvantage is that the resulting program behaviour is very hard to understand, because of implicit context switches and the interleaved acquisition and release of locks, which may lead to deadlock.

The original actor model is based on a purely functional programming language. Over the years, and with the widespread acceptance of the object-oriented programming paradigm, actors have been merged with stateful objects into so-called active object models.

Generally speaking, an active object is an object that encapsulates its own thread of control. An active object also has a message queue or mailbox from which it processes incoming messages. Each message is processed sequentially. An active object responds to an incoming message by invoking the method corresponding to the message. The method is executed by the active object's own thread. Because of this sequential processing of incoming messages, race conditions cannot occur on the internal state of an active object. Objects communicate with active objects by sending them messages asynchronously: the messages are enqueued in the receiver's message queue, rather than being invoked immediately.
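The description above can be sketched in Python as follows. This is an illustrative model under stated assumptions, not how AmbientTalk implements actors: ActiveObject, Account, send, wait_until_idle and demo are all hypothetical names. The sketch shows the three ingredients named above: a mailbox, a private thread, and strictly sequential message processing.

```python
import queue
import threading


class ActiveObject:
    """Owns one thread and one mailbox; handles messages one at a time."""

    def __init__(self, behaviour):
        self._behaviour = behaviour    # plain object whose methods handle messages
        self._mailbox = queue.Queue()  # incoming messages wait here
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, selector, *args):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put((selector, args))

    def _run(self):
        # The active object's own thread: dequeue a message, invoke the
        # matching method, and only then look at the next message.
        while True:
            selector, args = self._mailbox.get()
            try:
                getattr(self._behaviour, selector)(*args)
            finally:
                self._mailbox.task_done()

    def wait_until_idle(self):
        """Helper for this demo only: block until the mailbox is drained."""
        self._mailbox.join()


class Account:
    """Internal state that is only ever touched by the owning thread."""

    def __init__(self):
        self.balance = 0
        self.log = []

    def deposit(self, amount):
        self.balance += amount
        self.log.append(amount)


def demo():
    account = Account()
    actor = ActiveObject(account)
    for amount in range(1, 101):
        actor.send("deposit", amount)  # returns before the deposit happens
    actor.wait_until_idle()
    return account
```

Because only the active object's thread ever runs deposit, no lock is needed around the balance update, and the messages are handled in the order in which they were enqueued.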

Actors and Far References

In AmbientTalk, concurrency is spawned by creating actors: each actor is an autonomous processor. AmbientTalk's actors are based on the vat model of the E programming language. In AmbientTalk, an actor consists of a message queue (to store incoming messages), a thread of control (to execute the incoming messages) and a number of regular objects that are said to be hosted by the actor.

When an actor is created, it hosts a single object which is said to be the actor's behaviour: it is the “public interface” to the actor. The object that created the new actor gets a reference to this behaviour object, such that it can start sending messages to the new actor. An actor can be created in AmbientTalk as follows:

>def a := actor: {
  def sayHello() {
    system.println("Hello World")
  };
};
>><far ref to:<object:1555668>>

As you can see, actors are created similarly to objects. The actor: method, defined in the global lexical scope, takes a closure as its sole argument and uses that closure to initialize the behaviour of the new actor. The creator of the actor immediately receives a so-called far reference to this behaviour object, and from that moment on the creating actor and the created actor run in parallel, each capable of processing incoming messages autonomously.

So what exactly is a far reference to an object? The terminology stems from the E language: it is an object reference that refers to an object hosted by another actor. The main difference between regular object references and far references is that regular references allow direct, synchronous access to an object, while far references disallow such access. This is enforced by the kind of messages that these references can carry, as will be explained below.

Asynchronous Message Sending

AmbientTalk, like E, lexically distinguishes between synchronous method invocation and asynchronous message sending. The former is expressed as o.m() while the latter is expressed as o<-m(). Regular object references can carry both kinds of invocations. Synchronous method invocation behaves as in any typical object-oriented language. When an asynchronous message is sent to a local object (“local” meaning “hosted by the same actor”), the message is enqueued in the actor's own message queue and the method invocation will be executed at a later point in time.

Far references, like the reference stored in the variable a above, only carry asynchronous message sends, and as such totally decouple objects hosted by different actors in time: objects can never be blocked waiting for an outstanding remote procedure call, they can only communicate by means of purely asynchronous message passing. This is a key property of AmbientTalk's concurrency model, and it is a crucial property in the context of distributed programming.

Hence, given the far reference a from the example above, the method sayHello can only be invoked as follows:

>a<-sayHello();
>>nil

The above code is simple enough to understand: the sayHello message is sent asynchronously to the object pointed to by a, by enqueueing it in the message queue of the actor that hosts that object. The message send itself immediately returns nil: asynchronous sends do not return a value by default.
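The ordering of events in an asynchronous send can be modelled with a small Python sketch. It is a deliberately simplified, single-threaded model: the Actor, FarRef, Greeter and demo names are hypothetical, and the receiving actor is stepped by hand with process_all instead of running in parallel as a real AmbientTalk actor would.

```python
from collections import deque


class Actor:
    """Single-threaded model of an actor: just its message queue."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, receiver, selector, args):
        self._queue.append((receiver, selector, args))

    def process_all(self):
        """Process queued messages one at a time until the queue is empty."""
        while self._queue:
            receiver, selector, args = self._queue.popleft()
            getattr(receiver, selector)(*args)


class FarRef:
    """Model of a far reference: it offers asynchronous sends only."""

    def __init__(self, host, target):
        self._host = host      # the actor hosting the target object
        self._target = target

    def send(self, selector, *args):
        self._host.enqueue(self._target, selector, args)
        return None            # mirrors the nil result of a<-sayHello()


trace = []


class Greeter:
    def say_hello(self):
        trace.append("Hello World")


def demo():
    trace.clear()
    host = Actor()
    a = FarRef(host, Greeter())
    result = a.send("say_hello")  # only enqueues the message
    trace.append("send returned")
    host.process_all()            # the method body runs here, later
    return result, list(trace)


if __name__ == "__main__":
    print(demo())  # (None, ['send returned', 'Hello World'])
```

The trace shows that the send returns before the method body runs, which is the decoupling in time described above.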

But what happens when the method to invoke asynchronously takes parameters? How does parameter passing work in the context of inter-actor message sending? The rules are simple enough:

  1. Objects and closures are always passed by reference
  2. Native data types like numbers, text, tables, … are always passed by copy

Generally speaking, any object that encapsulates a lexical scope is passed by reference, because passing such an object by copy would entail passing the entire lexical scope by copy - a costly operation. Objects without a lexical scope, such as methods, can be copied without having to recursively copy any scope.

When an object is passed by reference, we mean that the formal parameter of a method will be bound to a far reference to the original object. When it is passed by copy, the formal parameter will be bound to a local copy of the object. For example, consider the following calculator actor:

>def calculator := actor: {
  def add(x,y,customer) {
    customer<-result(x+y)
  };
};
>><far ref to:<object:11600335>>

The add method takes three parameters: two numbers to add, and a so-called customer object which is responsible for consuming the “return value” of the method. Here is how to invoke this method:

>calculator<-add(1,2,object: {
  def result(sum) {
    system.println("sum = " + sum);
  };
});
>>nil

Because of the parameter passing rules described above, the add method will receive copies of the numbers 1 and 2, will add them synchronously, and will send the result asynchronously to the customer object, which was passed by reference, i.e. customer is bound to a far reference. Eventually, the actor that sent the add message will itself receive a result message, and when this message is processed by the anonymous customer object, the result is printed:

sum = 3

The parameter passing semantics just described lead to a model where the only references that cross actor boundaries are far references. In combination with the message sending semantics described previously, this guarantees that asynchronous messages are the only type of messages that can cross actor boundaries, ensuring that concurrent (and as will be shown later, also distributed) communication is strictly asynchronous. In such a model, deadlocks cannot occur (an actor is never blocked) and race conditions within one single actor can never occur. These properties significantly reduce the complexity of concurrent programs.
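The calculator exchange and the two parameter passing rules can be modelled in Python as follows. This is a sketch under stated assumptions: Actor, Hosted, FarRef, pass_parameter and demo are hypothetical names, actors are stepped by hand rather than running in parallel, and native values are assumed not to contain hosted objects.

```python
import copy
from collections import deque


class Actor:
    """Single-threaded model of an actor: just its message queue."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, target, selector, args):
        self._queue.append((target, selector, args))

    def step_all(self):
        while self._queue:
            target, selector, args = self._queue.popleft()
            getattr(target, selector)(*args)


class Hosted:
    """Base class for objects hosted by an actor."""

    def __init__(self, owner):
        self.owner = owner


class FarRef:
    """Model of a far reference: asynchronous sends only."""

    def __init__(self, target):
        self._target = target

    def send(self, selector, *args):
        passed = tuple(pass_parameter(arg) for arg in args)
        self._target.owner.enqueue(self._target, selector, passed)


def pass_parameter(value):
    """Apply the two rules: objects by (far) reference, native data by copy."""
    if isinstance(value, FarRef):
        return value
    if isinstance(value, Hosted):
        return FarRef(value)
    return copy.deepcopy(value)  # numbers, text, tables (lists), ...


class Calculator(Hosted):
    def add(self, x, y, customer):
        customer.send("result", x + y)  # customer is a FarRef here


class Customer(Hosted):
    def __init__(self, owner):
        super().__init__(owner)
        self.received = []

    def result(self, total):
        self.received.append(total)


def demo():
    client_actor, calc_actor = Actor(), Actor()
    calculator = FarRef(Calculator(calc_actor))
    customer = Customer(client_actor)
    calculator.send("add", 1, 2, customer)
    calc_actor.step_all()    # the calculator handles add, replies asynchronously
    client_actor.step_all()  # the client actor handles the result message
    return customer.received
```

In this model the only way for the calculator to reach the customer is through FarRef.send, so every interaction that crosses the actor boundary is an enqueued message.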

Isolates

The parameter passing semantics defined above rule out any possibility for an object to be passed by copy. The reason for this semantics is that objects encapsulate a lexical scope, and passing an object by copy would require the entire lexical scope to be passed by copy as well.

To enable objects to be passed by copy between actors, a special type of object is introduced. These objects are called isolates because they are isolated from their lexical scope. Continuing our previous example, imagine we want our calculator to work with complex numbers, which are typically objects that one would want to pass by copy. We can define complex numbers as isolate objects as follows:

>def complexNumber := isolate: {
  def re; // assume cartesian coordinates
  def im;
  def init(re,im) {
    self.re := re;
    self.im := im;
  };
  def +(other) {
    self.new(re+other.re, im+other.im);
  };
};
>><object:15603573[<stripe:Isolate>]>

The isolate: primitive is actually syntactic sugar for the creation of an object that is automatically striped with the /.at.stripes.Isolate stripe. Any object that is striped with this stripe is treated as an isolate. If you are a Java programmer, you can best compare this behaviour to having to implement the java.io.Serializable interface to make a class's instances serializable.

An isolate differs from a regular object as follows:

  1. it has no access to its surrounding lexical scope; this means that an isolate only has access to its local fields and methods. An isolate does have access to the global lexical scope of its actor.
  2. it is parameter-passed by-copy rather than by-reference in inter-actor message sends. The copy of the isolate received by the remote actor can only access that actor's global lexical scope, no longer the global scope of its original host.
  3. external method definitions on isolates are disallowed. The reason for this is that external method definitions implicitly carry a lexical scope (the scope of their definition). Hence, if an isolate with external methods has to be copied, those scopes would have to be copied as well. Following the rule that objects encapsulating a lexical scope are pass-by-reference, we chose to disallow external methods on isolates.
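The by-copy rule for isolates can be illustrated with a small Python model. All names here (Isolate, FarRef, Complex, Account, pass_parameter, demo) are hypothetical, and the marker base class merely plays the role of the Isolate stripe. Python cannot enforce the restriction on lexical scope, so the sketch covers only the difference in parameter passing.

```python
import copy


class Isolate:
    """Marker base class standing in for the Isolate stripe in this sketch."""


class FarRef:
    """Placeholder for a far reference to an object in another actor."""

    def __init__(self, target):
        self.target = target


class Complex(Isolate):
    def __init__(self, re, im):
        self.re = re
        self.im = im

    def __add__(self, other):
        return Complex(self.re + other.re, self.im + other.im)


class Account:
    """An ordinary object that is not an isolate."""

    def __init__(self):
        self.balance = 0


def pass_parameter(value):
    """Decide how a value crosses an actor boundary in this model."""
    if isinstance(value, Isolate):
        return copy.deepcopy(value)  # isolates travel by copy
    if value is None or isinstance(value, (int, float, str, list, tuple)):
        return copy.deepcopy(value)  # native data travels by copy
    return FarRef(value)             # every other object travels by reference


def demo():
    original = Complex(1, 2)
    received = pass_parameter(original)
    received.re = 10                 # changes the receiver's copy only
    total = received + Complex(1, 1)
    ref = pass_parameter(Account())
    return original, received, total, ref
```

After demo runs, the original complex number still has re equal to 1, while the Account is wrapped in a FarRef instead of being copied.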

Futures

futures language construct

Actor Mirrors

explain: mirror factory, message creation, message sending, install

Nesting Actors

lexical scoping rules for nested actors

at/tutorial/actors.1175424619.txt.gz · Last modified: 2007/04/01 13:04 (external edit)