User Tools

Site Tools


at:tutorial:actors

This is an old revision of the document!


This tutorial is under heavy construction!

Concurrent Programming with Actors

Concurrency is an integral part of the AmbientTalk programming language. Rather than relying on threads and locks to generate and manage concurrency, AmbientTalk embraces actors as a much more object-oriented approach to concurrency. Before diving into the details of concurrency in AmbientTalk, we briefly put the main differences between the actor model and the thread-based model into context.

Threads vs Actors

In traditional programming languages, the control flow of a concurrent program is divided over a number of threads. Each thread operates concurrently and control can switch from one thread to another non-deterministically. If two threads have access to the same data (objects), they might cause erroneous behaviour (so-called race conditions) because of this non-determinacy. Therefore, thread-based programming languages introduce locks (in the form of monitors, semaphores, …) which enable the construction of so-called critical sections, which are pieces of program code in which only one thread can run sequentially at a time.

The advantages of the thread-based model are that the model itself is easy to understand, it is efficiently implementable and it can be used to create very fine-grained synchronization (e.g. multiple readers/one writer). The disadvantages are that the resulting program behaviour is very hard to understand because of implicit context switches, interleaved acquisition/release of locks which may lead to deadlock, etc.

The original actor model is based on a purely functional programming language. Over the years, and with the widespread acceptance of the object-oriented programming paradigm, actors have been merged with stateful objects into so-called active object models.

Generally speaking, an active object is an object that encapsulates its own thread of control. An active object also has a message queue or mailbox from which it processes incoming messages. Each message is processed sequentially. An active object responds to an incoming message by invoking the method corresponding to the message. The method is executed by the active object's own thread. Because of this sequential processing of incoming messages, race conditions cannot occur on the internal state of an active object. Objects communicate with active objects by sending them messages asynchronously: the messages are enqueued in the receiver's message queue, rather than being invoked immediately.

Actors and Far References

In AmbientTalk, concurrency is spawned by creating actors: each actor is an autonomous processor. AmbientTalk's actors are based on the vat model of the E programming language. In AmbientTalk, an actor consists of a message queue (to store incoming messages), a thread of control (to execute the incoming messages) and a number of regular objects that are said to be hosted by the actor.

When an actor is created, it hosts a single object which is said to be the actor's behaviour: it is the “public interface” to the actor. The object that created the new actor gets a reference to this behaviour object, such that it can start sending messages to the new actor. An actor can be created in AmbientTalk as follows:

>def a := actor: {
  def sayHello() {
    system.println("Hello World")
  };
};
>><far ref to:<object:1555668>>

As you can see, actors are created similar to objects. The actor: method, defined in the global lexical scope, takes a closure as its sole argument and uses that closure to initialize the behaviour of the new actor. The creator of the actor immediately receives a so-called far reference to this behaviour object, and from that moment on the creating actor and the created actor run in parallel, each capable of processing incoming messages autonomously.

So what exactly is a far reference to an object? The terminology stems from the E language: it is an object reference that refers to an object hosted by another actor. The main difference between regular object references and far references is that regular references allow direct, synchronous access to an object, while far references disallow such access. This is enforced by the kind of messages that these references can carry, as will be explained below.

Asynchronous Message Sending

AmbientTalk, like E, syntactically distinguishes between synchronous method invocation and asynchronous message sending. The former is expressed as o.m() while the latter is expressed as o←m(). Regular object references can carry both kinds of invocations. Synchronous method invocation behaves as in any typical object-oriented language. When an asynchronous message is sent to a local object (“local” meaning “hosted by the same actor”), the message is enqueued in the actor's own message queue and the method invocation will be executed at a later point in time.

Far references, like the reference stored in the variable a above, only carry asynchronous message sends, and as such totally decouple objects hosted by different actors in time: objects can never be blocked waiting for an outstanding remote procedure call, they can only communicate by means of purely asynchronous message passing. This is a key property of AmbientTalk's concurrency model, and it is a crucial property in the context of distributed programming.

Hence, given the example above, the method sayHello can only be invoked as follows given a far reference a:

>a<-sayHello();
>>nil

The above code is simple enough to understand: the sayHello message is asynchronously sent to the object pointed to by a by enqueueing it in a's message queue. The message send itself immediately returns nil: asynchronous sends do not return a value by default.

But what happens when the method to invoke asynchronously has parameters that need to be passed. How does parameter passing work in the context of inter-actor message sending? The rules are simple enough:

  1. Objects and closures are always passed by far reference
  2. Native data types like numbers, text, tables, … are always passed by copy

Generally speaking, any object that encapsulates a lexical scope is passed by reference, because passing such an object by copy would entail passing the entire lexical scope by copy - a costly operation. Objects without a lexical scope, such as methods, can be copied without having to recursively copy any scope.

When an object is passed by reference, we mean that the formal parameter of a method will be bound to a far reference to the original object. When it is passed by copy, the formal parameter will be bound to a local copy of the object. For example, consider the following calculator actor:

>def calculator := actor: {
  def add(x,y,customer) {
    customer<-result(x+y)
  };
};
>><far ref to:<object:11600335>>

The add method takes three parameters: two numbers to add, and a so-called customer object which is responsible for consuming the “return value” of the method. Here is how to invoke this method:

>calculator<-add(1,2,object: {
  def result(sum) {
    system.println("sum = " + sum);
  };
});
>>nil

Because of the parameter passing rules described above, the add method will receive copies of the numbers 1 and 2, will add them synchronously, and will send the result asynchronously to the customer object, which was passed by reference, i.e. customer is bound to a far reference. Eventually, the actor that sent the add message will itself receive a result message, and when this message is processed by the anonymous consumer object, the result is printed:

sum = 3
The parameter passing semantics just described lead to a model where the only references that cross actor boundaries are far references. In combination with the message sending semantics described previously, this guarantees that asynchronous messages are the only type of messages that can cross actor boundaries, ensuring that concurrent (and as will be shown later, also distributed) communication is strictly asynchronous. In such a model, deadlocks cannot occur (an actor is never blocked) and race conditions within one single actor can never occur. These properties significantly reduce the complexity of concurrent programs.

Isolates

The parameter passing semantics defined above rule out any possibility for an object to be passed by copy. The reason for this semantics is that objects encapsulate a lexical scope, and parameter passing an object by copy would require the entire lexical scope to be parameter-passed as well.

To enable objects to be passed by copy between actors, a special type of objects is introduced. These objects are called isolates because they are isolated from their lexical scope. Continuing our previous example, imagine we want our calculator to work with complex numbers, which are typically objects that one would want to pass by copy. We can define complex numbers as isolate objects as follows:

>def complexNumber := isolate: {
  def re; // assume cartesian coordinates
  def im;
  def init(re,im) {
    self.re := re;
    self.im := im;
  };
  def +(other) {
    self.new(re+other.re, im+other.im);
  };
};
>><object:15603573[<stripe:Isolate>]>

The isolate: primitive is actually syntactic sugar for the creation of an object that is automatically striped with the /.at.stripes.Isolate stripe. Any object that is striped with this stripe is treated as an isolate. If you are a Java programmer, you can best compare this behaviour to having to implement the java.io.Serializable interface to make a class's instances serializable.

An isolate differs from a regular object as follows:

  1. it has no access to its surrounding lexical scope; this means that an isolate only has access to its local fields and methods. An isolate does have access to the global lexical scope of its actor.
  2. it is parameter-passed by-copy rather than by-reference in inter-actor message sends. The copy of the isolate received by the remote actor can only access that actor's global lexical scope, no longer the global scope of its original host.
  3. external method definitions on isolates are disallowed. The reason for this is that external method definitions implicitly carry a lexical scope (the scope of their definition). Hence, if an isolate with external methods has to be copied, those scopes would have to be copied as well. Following the rule that objects encapsulating a lexical scope are pass-by-reference, we chose to disallow external methods on isolates.

Returning to the calculator example, the calculator can now add complex numbers locally and send (a copy of) the resulting complex number back to the customer:

>calculator<-add(
  complexNumber.new(1,1),
  complexNumber.new(2,2),
  object: {
    def result(sum) {
      system.println("sum=("+sum.re+","+sum.im+")");
    };
  });
>>nil
sum=(3,3)
A word of warning: isolates are objects that are copied freely between actors. As a result, they should be objects whose actual object identity is of little importance. Usually, the identity of by-copy objects is determined by the value of some of the object's fields. Therefore, it is good practice to override the == method on isolates to compare isolates based on their semantic identity, rather than on their object identity. For example, equality for complex numbers should be defined as:
def ==(other) {
  (re == other.re).and: { im == other.im }
}

It is important to note that an isolate has no access whatsoever to its encompassing scope. The following code results in an exception:

>def x := 1;
def adder := isolate: {
  def add(n) { x + n };
};
adder.add(3)
>>Undefined variable access: x
origin:
at adder.add(3)

Sometimes it is useful to initialize an isolate with the values of lexically visible variables. In that case, AmbientTalk allows the programmer to specify which lexical variables should be copied into the isolate itself, such that the isolate has its own, local copy of the variable. Lexical variables that need to be copied like this are specified as formal parameters to the closure passed to the isolate: primitive, as follows:

>def x := 1;
def adder := isolate: { |x|
  def add(n) { x + n };
};
adder.add(3)
>>4

Futures

As you may have noticed previously, asynchronous message sends do not return any value (that is, they return nil). Quite often, the developer is required to work around this lack of return values by means of e.g. explicit customer objects, as shown previously. This, however, leads to less expressive, more difficult to understand code, where the control flow quickly becomes implicit.

The Concept

The most well-known language feature to reconcile return values with asynchronous message sends is the notion of a future. Futures are objects that represent return values that may not yet have been computed. Once the asynchronously invoked method has completed, the future is replaced with the actual return value, and objects that referred to the future transparently refer to the return value.

Using futures, it is possible to re-implement the previous example of requesting our calculator actor to add two numbers as follows:

def sum := calculator<-add(1,2);

Enabling futures

Futures are a frequently recurring language feature in concurrent and distributed languages (for example, in ABCL, the actor-based concurrent language). They are also commonly known by the name of promises (this is how they are called in the E language and in Argus). In AmbientTalk, futures are not native to the language. However, because of AmbientTalk's reflective infrastructure, it is possible to build futures on top of the language. The system library shipped with AmbientTalk contains exactly this: a reflective implementation that adds futures to the language kernel. This implementation can be found in the file at/lang/futures.at.

To enable futures, it suffices to import the futures module and to enable it, as follows:

import /.at.lang.futures;
enableFutures(true);

The first statement imports the futures module into the current lexical scope. This enables you as a developer to use some additional language constructs exported by the futures module, as will be explained later. The second statement enables the futures behaviour, causing any asynchronous message send to return a future rather than nil. If false is passed to the call to enableFutures, only messages marked explicitly as a FutureMessage will return a future.

Working with Unresolved Futures

We have yet to describe what objects can do with futures that are unresolved, i.e. what can an object do with a future that does not know its value yet? In many multithreaded languages that introduce the future abstraction, the language dictates that accessing (performing an operation on) an unresolved futures causes the active thread to block; the thread has to wait for the value of the future to be available, before it can carry on. This makes futures a preferred means of synchronising two threads with more freedom than is possible using a simple remote procedure call.

Blocking a thread on a future can be a major source of deadlocks, like any form of blocking, of course. In the actor paradigm where communication between actors should remain strictly asynchronous, this behaviour is obviously unwanted. Furthermore, in a distributed programming context, it is highly undesirable to make one actor block on a future that may have to be resolved by an actor on another machine. It would make the application much more vulnerable because of latency or partial failure, resulting in unresponsive applications at best, and deadlocked applications at worst.

The solution proposed in the E language, and adopted in AmbientTalk, is to disallow direct access to a future by any object. Instead, objects may only send asynchronous messages to a future object. This enables the future to temporarily buffer such messages until its resolved value is known. The net effect of this solution is that futures actually can be “chained”, forming asynchronous pipelines of messages.

Should add example here

When a future eventually becomes resolved with a value, any messages that were accumulated by the future are forwarded asynchronously to the actual return value, such that it appears as if the original object had sent the messages to the actual return value in the first place.

AmbientTalk only allows one method to be synchronously invoked on a future, the == method. A word of warning though: equality on futures is defined as pointer equality, so a future will only be equal to itself. It does not compare the parameter object with its actual value, if it would be resolved.

Working with Resolved Futures

As explained above, it is always correct to use asynchronous message sends to communicate with a future. Sometimes, however, we may want to perform some operation on the return value other than message sending, for example, printing it to the screen. If you print the future directly, you get the following:

def sum := calculator<-add(1,2);
system.println(sum);
>> <unresolved future>

AmbientTalk prints the future to the screen. At a later point in time, printing the future again may result in the following:

>system.println(sum);
>> <resolved future:3>

This time, the future was printed when the return value was computed. But what if we simply want to inform the user of the actual value of sum? In such cases, you need to register an observer with the future, which will be asynchronously notified when the actual value of the future has been computed.

In AmbientTalk, this observer takes the form of a closure which will be applied asynchronously, taking as its only argument the actual value of the future. Registering the observer can be easily done by means of the when:becomes: function, exported by the futures module:

def sumFuture := calculator<-add(1,2);
when: sumFuture becomes: { |sum|
  system.println("The sum is " + sum);
};

The first argument to when:becomes: is the future to observe. The second argument is a closure that takes the actual return value as a formal parameter. If there is a possibility that the asynchronously invoked method can raise an exception, this exception can be caught asynchronously by means of the when:becomes:catch: variant:

def sumFuture := calculator<-add(1,2);
when: sumFuture becomes: { |sum|
  system.println("The sum is " + sum);
} catch: { |exc|
  system.println("Exception: " + exc.message);
};

Or, you can specify a stripe to only catch specific exceptions:

def divFuture := calculator<-divide(a,b);
when: divFuture becomes: { |div|
  system.println("The division is " + div);
} catch: DivisionByZero using: { |exc|
  system.println("Cannot divide "+a+" by zero!");
};

The when:* functions are a very easy mechanism to synchronise on the value of a future without actually making an actor block: remember that all the when:becomes: function does is register the closure with the future. After that, the actor simply continues processing the statement following when:becomes:. Also, even if the future is already resolved at the time the closure observer is registered, the closure is guaranteed to be applied asynchronously. This ensures that the code following a when:becomes: block is guaranteed to be executed before the registered closure itself:

when: sumFuture becomes: { |sum|
  system.println("... and here later.");
};
system.print("Always here first");
>>Always here first... and here later.

Futures and Striped Messages

Explain: o←m()@FutureMessage o←m()@OneWayMessage

Conditional Synchronisation with Futures

explain: explicit futures using makeFuture

Actor Mirrors

explain: mirror factory, message creation, message sending, install

Nesting Actors

lexical scoping rules for nested actors

at/tutorial/actors.1175960107.txt.gz · Last modified: (external edit)