On Scoping, Closures, Methods and Messages

This tutorial chapter goes into a bit more detail on the interplay between AmbientTalk’s functional aspects (e.g. block closures, higher-order functions and lexical scoping) and its object-oriented aspects (e.g. objects and delegation). It also describes how methods and messages can be manipulated as first-class objects in their own right.

Lexical Scope vs Object Scope

AmbientTalk distinguishes between two kinds of scopes:

  1. the lexical scope, which is the set of all variables that are lexically visible in the program text. In other words: all variables in an enclosing scope are part of the lexical scope of the enclosed (nested) scope.
  2. the object scope, which is delimited by a chain of delegating objects. When sending a message to an object, the object and its parent objects delimit the scope in which the message is looked up.

The rules for distinguishing which scope to use when resolving an identifier are straightforward:

  1. Unqualified access to a variable, e.g. x, is always resolved in the lexical scope.
  2. Qualified access to a variable, e.g. o.x, is always resolved in the receiver’s object scope.

These rules also hold for method invocation. The invocation f() is resolved lexically, i.e. f is looked up in the lexical scope. The invocation o.m() is resolved dynamically, i.e. m is looked up in o. The difference is significant: lexical variable access can be statically determined, while qualified access is subject to late binding (enabling object-oriented polymorphism). As a programmer, you must be aware of this fundamental difference in semantics.

The most important consequence of these rules is that one should think carefully about how an object accesses its own fields or methods. It can now do so in two ways. For example:

def o := object: {
  def x := 5;
  def getStatic() { x };
  def getDynamic() { self.x };
}

In the code snippet above, o defines two accessors for its field x. The getStatic accessor refers to x unqualified. As a result, x is looked up in the lexical scope and found in o. The getDynamic accessor accesses the field by means of a self-send. According to the rules outlined above, x is accessed in a qualified way, which means it is looked up in the object scope of o. Now consider the following code:

def o2 := extend: o with: {
  def x := 6;
}

This program behaves as follows:

>o.getStatic()
>> 5
>o.getDynamic()
>> 5
>o2.getStatic()
>> 5
>o2.getDynamic()
>> 6

As can be derived from the rules defined above, the unqualified access to x in getStatic is early bound: the value returned by getStatic is always the value of the lexically visible x variable, in this case the field of o. Qualified access, including self-sends like self.x, is resolved in the receiver’s object scope. Hence, when getDynamic is invoked on the o2 object, variable lookup starts in o2 and the overridden field’s value is returned.

For many object-oriented programmers, this distinction between performing m() and self.m() may seem confusing and even error-prone. After all, merely “forgetting” the qualified access prevents child objects from overriding the invocation of m.

The confusion stems from the fact that many OO languages, like Java, make no distinction between the two access forms. For example, in Java, if one writes m() and m is not lexically visible, it will be interpreted as if the programmer had written this.m(). This solution, however, brings its own set of fragility problems to the table. Gilad Bracha provides a good overview of these in his paper on the Interaction of Method Lookup and Scope with Inheritance and Nesting. In terms of the categorization proposed in that paper, AmbientTalk adopts Bracha’s 3rd option in dealing with the interaction between scoping and inheritance.

The distinction between x and self.x, although at first sight potentially confusing, offers many advantages. First, because AmbientTalk has true lexical scoping rules, we feel that any lexically visible variable should be made accessible to the programmer. For example, when nesting objects, the nested object should have access to the fields and methods of the outer object, next to its own fields and methods. It can access both scopes by choosing whether or not to qualify an identifier.

Second, the difference between early and late binding offers objects a very fine-grained control over what field and method accesses may be trapped and overridden by child objects. For example, in the code snippet above, o can rest assured that the access to x in getStatic cannot be overridden by child objects (of course, child objects can still override getStatic). Similarly, all points where late binding takes effect become explicit in the code as self-sends.
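
As a hedged illustration of the same control applied to methods rather than fields, the sketch below uses two hypothetical objects, base and derived, which are not part of the tutorial's running example. It assumes that method invocation follows exactly the lookup rules stated earlier.

```
def base := object: {
  def value() { 1 };
  // unqualified call: value is looked up lexically, so it denotes base's method
  def early() { value() };
  // self-send: value is looked up in the receiver's object scope
  def late() { self.value() };
};
def derived := extend: base with: {
  def value() { 2 };
};
> derived.early()
>> 1
> derived.late()
>> 2
```

Only the self-send in late is a point where derived can intervene. The unqualified call in early stays fixed to the definition that is lexically visible in base.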

The difference between unqualified and qualified slot access is also important to child objects. If a child object wants to invoke an inherited method of its parent object, it must invoke the method by means of a self-send. For example, consider AmbientTalk’s unit testing framework. To write a unit test, an object extends the UnitTest object and defines a number of test methods. The UnitTest object defines a number of methods to write assertions, e.g. assertEquals(o1,o2) which are accessible by the unit test. Consider the following code snippet:

def myTest := extend: UnitTest with: {
  def testSomething() {
    self.assertEquals(1+1,1*2);
  };
}

It is important that assertEquals is invoked by means of a self-send and not as assertEquals(1+1,1*2). The latter would result in an exception because assertEquals is not lexically visible: it is only visible within myTest's object scope.

Nesting Objects

AmbientTalk exploits its lexical scoping rules to the fullest extent by allowing as many program elements as possible to be nested. For example, it is possible to lexically nest objects within other objects, or even to nest functions within other functions. In this section, we describe how nested objects interact with the scoping rules presented above.

Facets

One of the most appealing use cases for nesting objects is that it allows a very secure kind of sharing between objects: all objects that are nested within another object share the same lexical scope. This form of sharing is much more secure than sharing data via delegation, because the set of objects sharing the scope is statically fixed. In the E language, such nested objects are called facets.

As an example, the following code snippet defines a cell object into which one may write a value or from which one may read a value:

def cell := object: {
  def contents := nil;
  def reader := object: {
    def read() { contents }
  };
  def writer := object: {
    def write(x) { contents := x }
  };
}

The advantage of having defined reader and writer as nested objects is that they may be handed over to parts of an application such that an application has e.g. read-only or write-only access to the cell. For example:

distrustedFunction(cell.reader);

The distrustedFunction only has a reference to the reader object, allowing it to only invoke the read method. Because reader is lexically nested within the cell, it has privileged access to the cell's fields, which it shares only with the writer object. Had reader and writer shared cell by means of delegation, the distrustedFunction would still have been able to access the contents field through that delegation link. Because reader is lexically nested within cell and the lexical scope is inaccessible to regular objects, the cell object remains entirely encapsulated w.r.t. distrustedFunction.
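
A minimal sketch of how the two facets might be used together, assuming the cell definition above has been evaluated. The body given to distrustedFunction here is purely hypothetical; the tutorial does not define it.

```
// hypothetical client: it only ever sees the reader facet
def distrustedFunction(r) { r.read() };
// trusted code keeps the writer facet for itself
cell.writer.write(42);
> distrustedFunction(cell.reader)
>> 42
```

Within distrustedFunction, contents and writer are not slots of r, so qualified access through r cannot reach them.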

Nesting and Delegation

Thanks to AmbientTalk’s lexical scoping rules, nested objects have full access to the fields and methods of lexically enclosing objects. Note that within a nested object, the self pseudo-variable does refer to the nested object itself, as expected. Also, it is perfectly legal for a lexically nested object to have its own dynamic parent, for example:

def myWindow := extend: Window with: {
  def title := "title";
  def myButton := extend: Button with: {
    // myButton can access lexical scope and
    // its own object scope
  }
}
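
To make the two kinds of access available to a nested object concrete, here is a small sketch with hypothetical objects outer and inner. The unqualified x is found in the lexical scope (the enclosing object), while self.y is found in the nested object's own object scope.

```
def outer := object: {
  def x := 1;
  def inner := object: {
    def y := 2;
    // x: lexical scope of outer; self.y: object scope of inner
    def sum() { x + self.y };
  };
};
> outer.inner.sum()
>> 3
```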

It can sometimes happen that a nested object wants to invoke a method dynamically (rather than statically) on one of its outer objects. This is possible by aliasing self in the outer object, e.g.:

extend: UnitTest with: {
  def testSomething() {
    def theTest := self;
    def inner := object: {
      def compare(a,b) {
        theTest.assertEquals(a,b);
      }
    };
    inner.compare(1+1,1*2);
  }
}

In the example above, writing self.assertEquals(a,b) would fail because self is then bound to inner, which does not delegate to UnitTest. Similarly, writing assertEquals(a,b) would fail because assertEquals is not lexically visible to inner: it is only visible in the object scope of theTest.

Methods vs Closures

As mentioned previously, AmbientTalk not only allows objects to be nested within other objects, but also allows functions to be nested within other functions. At this point, it becomes important to distinguish between closures and methods.

In AmbientTalk, a method is a function belonging to an object. Characteristically, a method never has a fixed value for self: this value may change depending on which object received the method invocation.

In AmbientTalk, a closure is a function that closes over all lexically visible variables, including the value of self. A closure is a stand-alone value that is not immediately tied to a particular object.

Methods and closures are defined by means of the same syntax. They are distinguishable only by the context in which they are evaluated. For example:

def o := object: {
  def x := 5;
  def meth(y) {
    def clo(z) { self.x + y + z }
  };
};
> o.meth(2)(1);
>> 8

In the above example, meth is a method of o because it is defined as a function directly within an object clause. clo, on the other hand, is a closure because it is nested within another method, so it does not belong to an object directly. This example also shows why methods do not close over self and closures do. A method can be thought of as a closure which is explicitly parameterized with an extra self variable, filled in by the interpreter upon method invocation. A closure is not parameterized with this extra variable: it simply inherits the value of self from its enclosing method.

Top-level Functions

Top-level functions are actually methods in AmbientTalk. This is because all top-level code is treated as the initialization code of that file’s module object. Hence, it is legal to use self in top-level functions.

Block closures

Block closures, as created by means of the syntax { |args| body }, always evaluate to closures and hence always capture the value of self.
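
As a hedged sketch of what capturing self means in practice, the hypothetical scaler object below passes a block to map: from inside one of its methods. Within the block, self still denotes the object that received scaleAll, not the table on which map: is invoked.

```
def scaler := object: {
  def factor := 10;
  def scaleAll(tbl) {
    // the block closes over self, so self.factor is scaler's field
    tbl.map: { |e| e * self.factor }
  };
};
> scaler.scaleAll([1,2,3])
>> [10,20,30]
```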

External Methods and Fields

It is possible to define methods and fields externally on an object. For example:

def calc := object: {
  def add(x,y) { x+y };
}; 
def offset := 0;
def calc.sub(x,y) { offset + self.add(x,-y) };

In this example sub is added to the already existing calc object. However, what happens to the lexical scope of sub? Is an external definition the same as if sub had been defined directly in the calc object or not? Also, if self is used in an external method, does it refer to calc or to the object performing the definition?

In AmbientTalk, external methods are a mixture of methods and closures. They capture the lexical scope at the time the method is defined, so any lexically free variables are looked up at the site of definition, preserving proper lexical scoping rules. However, they are not full closures because they do not capture the value of self. Rather, as is appropriate for a method, self is determined by the object that receives the method invocation.

There is one more difference between closures and external methods: next to the special treatment of self, super is also treated specially for external methods. Within an externally added method, super will refer to the parent object of the object to which the method is added, not the parent of the object that defined the external method. This means that, in effect, an external method captures only the lexical scope: self is bound to the receiver of the invocation and super to the parent of the object to which the method was added.

The rationale behind this decision is that self and super are inherently related to the object hierarchy of which a method is a part. Hence, even if a method is defined externally on an object, super should properly refer to the parent object of which the method is a part, rather than the lexically visible value of super.
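
Assuming the calc, offset and sub definitions above have been evaluated at top level, the following sketch shows both halves of this mixture at work. The object calc2 is a hypothetical child introduced only for illustration, and the results shown follow from the rules as described here rather than from a recorded session.

```
> calc.sub(5,3)
>> 2
// offset is a lexically captured variable, so a later assignment is visible to sub
offset := 10;
> calc.sub(5,3)
>> 12
// self is late bound: sub runs with self bound to calc2, so calc2's add is used
def calc2 := extend: calc with: {
  def add(x,y) { 0 };
};
> calc2.sub(5,3)
>> 10
```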

External methods, while powerful, introduce many subtle issues in the language. Therefore, this feature may disappear from later releases of AmbientTalk, and it is wise to steer clear of it. For an overview of the problems introduced by external field or method declarations, see Gilad Bracha’s blog post on monkey patching.

First-class Methods

In AmbientTalk, methods can be accessed as first-class entities. When methods are represented as first-class entities, they are represented as closures whose lexical scope corresponds to the object in which they have been defined. Their hidden self parameter is bound to the object from which the method was “selected”.

Methods become first-class when they are selected from existing objects. Because of the uniform access principle introduced previously, it is not possible to “select” a method m from an object o simply by writing o.m. Recall that this would instead execute the method with zero arguments. So, in order to gain access to a first-class representation of the method, AmbientTalk introduces the selection operator &:

def Math := object: {
  def square(x) { x*x };
};

def squareClo := Math.&square;
> squareClo(2)
>> 4

Slot selection automatically lifts a method slot into a closure. The closure can subsequently be passed around as a stand-alone value. When the method closure is invoked, the original method is run with self bound to the object from which the method was selected. Here’s another example:

def Adder := object: {
  def amount := 0;
  def init(amnt) {
    amount := amnt;
  };
  def addTo(x) {
    x + self.amount
  };
};
> [1,2,3].map: Adder.new(1).&addTo
>> [2,3,4]

The addTo method is selected as a closure from a new Adder object whose amount field is initialized to 1. Subsequently, that closure is applied to each element of a table, yielding a new table whose values are incremented by one.

The selection operator is also required when accessing lexically visible methods as first-class closures:

def o := object: {
  def m() { ... };
  def grabTheMethod() { &m };
}

Selection using & adheres to the uniform access principle in the sense that it is possible to select a field from an object, just like it is possible to select a method from an object. When selecting a field from an object, the resulting closure is an accessor for the field, i.e. a nullary closure that upon application returns the field’s value:

def o := object: {
  def x := 5;
};
def f := o.&x;
> f
>> <native closure:x>
> o.x := 6; f()
>> 6

In the same vein, one may select a mutator method for a field x by evaluating &x:=. A mutator is a unary closure that assigns its single argument to the field it encapsulates.
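
A hedged sketch of mutator selection follows. It assumes that the mutator can be selected from an object with the same o.& form used for accessors; the object p and the name setX are hypothetical.

```
def p := object: {
  def x := 5;
};
// select the mutator for field x as a unary closure
def setX := p.&x:=;
setX(7);
> p.x
>> 7
```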

First-class Messages

Next to first-class methods, AmbientTalk also supports first-class messages. Messages are naturally represented as objects. They encapsulate a receiver, a selector and a table of actual arguments. AmbientTalk is definitely not the first language to treat messages as first-class citizens this way. For example, Smalltalk equally treats messages as objects.

The main difference with other languages is that AmbientTalk provides an expressive syntax to define first-class messages and to enable the sending of a first-class message to an object. In AmbientTalk, a “receiverless” message send evaluates to a first-class message. For example:

def msg := .add(1,2);
>msg.selector
>><symbol:add>
>msg.arguments
>>[1,2]

This syntax is supported for all of AmbientTalk’s message sending operators. Hence, it is possible to write ^add(1,2) to express a first-class delegation message.

An AmbientTalk message supports the operation sendTo(receiver,sender) which makes the sender object send the message to the given receiver object. Hence, it is possible to write msg.sendTo(calc,self) to send the add message to a calculator object. This is similar to Smalltalk’s obj perform: msg operation. However, AmbientTalk also provides the following shorthand syntax:

calc <+ msg

The <+ operator sends a first-class message object to a receiver object, exactly as if the message had been hard-coded in the program text.
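
Putting the pieces together, the sketch below sends the same first-class message in both ways described above. The calc object is redefined here with a single add method so that the fragment stands on its own.

```
def calc := object: {
  def add(x,y) { x+y };
};
def msg := .add(1,2);
// shorthand: send the message object to calc
> calc <+ msg
>> 3
// explicit form, with self as the sender
> msg.sendTo(calc, self)
>> 3
```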

First-class messages are often useful to implement higher-order messaging, i.e. message sends that are parameterized with other message sends. For example, if map: were redefined as a higher-order method taking a message instead of a closure as a parameter, one could write [1,2,3].map: .+(1) rather than [1,2,3].map: { |e| e+1 }.
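
Without redefining map: itself, the idea can be approximated with a helper function. The name mapWith is hypothetical; the sketch only combines the existing map: with the <+ operator described above.

```
// hypothetical helper: sends a first-class message to every element of a table
def mapWith(tbl, msg) {
  tbl.map: { |e| e <+ msg }
};
> mapWith([1,2,3], .+(1))
>> [2,3,4]
```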

at/tutorial/multiparadigm.txt · Last modified: 2011/06/07 18:29 by tvcutsem
 
 
 