====== On Scoping, Closures, Methods and Messages ======
This tutorial chapter goes into a bit more detail on the interplay between AmbientTalk's functional aspects (e.g. block closures, higher-order functions and lexical scoping) and its object-oriented aspects (e.g. objects and delegation). It also describes how methods and messages can be manipulated as first-class objects in their own right.
===== Lexical Scope vs Object Scope =====
AmbientTalk distinguishes between two kinds of scopes:
- the **lexical scope**, which is the set of all variables that are lexically visible in the program text. In other words: all variables in an enclosing scope are part of the lexical scope of the enclosed (nested) scope.
- the **object scope**, which is delimited by a chain of delegating objects. When sending a message to an object, the object and its parent objects delimit the scope in which the message is looked up.
The rules for distinguishing which scope to use when resolving an identifier are straightforward:
- Unqualified access to a variable, e.g. ''x'', is **always** resolved in the lexical scope.
- Qualified access to a variable, e.g. ''o.x'', is **always** resolved in the receiver's object scope.
These rules also hold for method invocation: the invocation ''f()'' is resolved lexically: ''f'' is looked up in the lexical scope; the invocation ''o.m()'' is resolved dynamically, i.e. ''m'' is looked up in ''o''. The difference is significant: lexical variable access can be statically determined, while qualified access is subject to //late binding// (enabling object-oriented polymorphism). As a programmer, you must be aware of the fundamental difference in semantics.
The most important consequence of these rules is that one should think carefully about how an object accesses its own fields or methods. It can now do so in two ways. For example:
<code>
def o := object: {
  def x := 5;
  def getStatic() { x };
  def getDynamic() { self.x };
}
</code>
In the code snippet above, ''o'' defines two accessors for its field ''x''. The ''getStatic'' accessor refers to ''x'' unqualified. As a result, ''x'' is looked up in the lexical scope and found in ''o''. The ''getDynamic'' accessor accesses the field by means of a self-send. According to the rules outlined above, ''x'' is accessed in a qualified way, which means it is looked up in the //object scope// of ''o''. Now consider the following code:
<code>
def o2 := extend: o with: {
  def x := 6;
}
</code>
This program behaves as follows:
<code>
>o.getStatic()
>> 5
>o.getDynamic()
>> 5
>o2.getStatic()
>> 5
>o2.getDynamic()
>> 6
</code>
As can be derived from the rules defined above, the unqualified access to ''x'' in ''getStatic'' is early bound: the value returned by ''getStatic'' is always the value of the lexically visible ''x'' variable, in this case the field of ''o''. Qualified access, including self-sends like ''self.x'', is late bound: it is resolved in the receiver's object scope. Hence, when ''getDynamic'' is invoked on the ''o2'' object, variable lookup starts in ''o2'' and the overriding field's value is returned.
For many object-oriented programmers, this distinction between performing ''m()'' and ''self.m()'' may seem confusing and even error-prone. After all, merely "forgetting" the qualified access prevents child objects from overriding the invocation of ''m''.
The confusion stems from the fact that many OO languages -- like Java -- make no distinction between both access forms. For example, in Java, if one writes ''m()'' and ''m'' is not lexically visible, it will be interpreted as if the programmer had written ''this.m()''. This solution, however, brings its own set of fragility problems to the table. Gilad Bracha provides a good overview of these in his paper [[http://dyla2007.unibe.ch/?download=dyla07-Gilad.pdf|on the Interaction of Method Lookup and Scope with Inheritance and Nesting]]. In terms of the categorization proposed in that paper, AmbientTalk adopts Bracha's 3rd option in dealing with the interaction between scoping and inheritance.
The distinction between ''x'' and ''self.x'', although at first sight potentially confusing, offers many advantages. First, because AmbientTalk has true lexical scoping rules, we feel that any lexically visible variable should be made accessible to the programmer. For example, when nesting objects, the nested object should have access to the fields and methods of the outer object, next to its own fields and methods. It can choose between the two scopes by either qualifying an identifier or leaving it unqualified.
Second, the difference between early and late binding offers objects a very fine-grained control over what field and method accesses may be trapped and overridden by child objects. For example, in the code snippet above, ''o'' can rest assured that the access to ''x'' in ''getStatic'' cannot be overridden by child objects (of course, child objects can still override ''getStatic''). Similarly, all points where late binding takes effect become explicit in the code as self-sends.
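The same distinction applies to methods. The following is a minimal sketch using two hypothetical objects, ''parent'' and ''child'', which do not appear elsewhere in this chapter. The expected results in the comments follow from the scoping rules described above:

<code>
def parent := object: {
  def name() { "parent" };
  // unqualified call: name is resolved in the lexical scope of parent
  def early() { name() };
  // self-send: name is looked up in the receiver's object scope
  def late() { self.name() };
};
def child := extend: parent with: {
  def name() { "child" };
};
child.early();  // expected: "parent" (the override is not consulted)
child.late();   // expected: "child" (the override is found via late binding)
</code>

Only ''late'' offers child objects a hook for customizing behaviour; ''early'' is fixed at the point where ''parent'' is written.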
The difference between unqualified and qualified slot access is also important to child objects. If a child object wants to invoke an inherited method of its parent object, it must invoke the method by means of a self-send. For example, consider AmbientTalk's unit testing framework. To write a unit test, an object extends the ''UnitTest'' object and defines a number of test methods. The ''UnitTest'' object defines a number of methods to write assertions, e.g. ''assertEquals(o1,o2)'' which are accessible by the unit test. Consider the following code snippet:
<code>
def myTest := extend: UnitTest with: {
  def testSomething() {
    self.assertEquals(1+1,1*2);
  };
}
</code>
It is important that ''assertEquals'' is invoked by means of a self-send and not as ''assertEquals(1+1,1*2)''. The latter would result in an exception because ''assertEquals'' is not lexically visible: it is only visible within ''myTest'''s object scope.
===== Nesting Objects =====
AmbientTalk exploits its lexical scoping rules to the fullest extent by allowing as many program elements as possible to be nested. For example, it is possible to lexically nest objects within other objects, or even to nest functions within other functions. In this section, we describe how nested objects interact with the scoping rules presented above.
==== Facets ====
One of the most appealing use cases for nesting objects is that it allows a very secure kind of //sharing// between objects: all objects that are nested within another object share the same lexical scope. This form of sharing is much more secure than sharing data via delegation, because the set of objects sharing the scope is statically fixed. In the E language, such nested objects are called [[http://www.erights.org/elib/capability/ode/ode-objects.html|facets]].
As an example, the following code snippet defines a ''cell'' object into which one may write a value or from which one may read a value:
<code>
def cell := object: {
  def contents := nil;
  def reader := object: {
    def read() { contents }
  };
  def writer := object: {
    def write(x) { contents := x }
  };
}
</code>
The advantage of having defined ''reader'' and ''writer'' as nested objects is that they may be handed over to parts of an application such that an application has e.g. read-only or write-only access to the ''cell''. For example:
<code>
distrustedFunction(cell.reader);
</code>
The ''distrustedFunction'' only has a reference to the ''reader'' object, allowing it to only invoke the ''read'' method. Because ''reader'' is lexically nested within the ''cell'', it has privileged access to the ''cell'''s fields, which it shares //only// with the ''writer'' object. If ''reader'' and ''writer'' had shared ''cell'' by means of delegation, the ''distrustedFunction'' would still be able to access the ''contents'' field by means of delegation. Because ''reader'' is lexically nested within ''cell'' and the lexical scope is inaccessible to regular objects, the ''cell'' object remains entirely encapsulated w.r.t. ''distrustedFunction''.
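As a brief usage sketch, assuming the ''cell'' definition above has been evaluated, the two facets can be exercised as follows. The value ''42'' is arbitrary, and the comments describe the behaviour expected from the scoping rules rather than verified interpreter output:

<code>
// the writer facet updates the shared contents variable
cell.writer.write(42);
// the reader facet observes the same variable
cell.reader.read();   // expected: 42
// reader defines no write method and does not delegate to cell,
// so neither cell.reader.write(0) nor cell.reader.contents is expected to succeed
</code>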
==== Nesting and Delegation ====
Thanks to AmbientTalk's lexical scoping rules, nested objects have full access to the fields and methods of lexically enclosing objects. Note that within a nested object, the ''self'' pseudo-variable does refer to the nested object itself, as expected. Also, it is perfectly legal for a lexically nested object to have its own dynamic parent, for example:
<code>
def myWindow := extend: Window with: {
  def title := "title";
  def myButton := extend: Button with: {
    // myButton can access lexical scope and
    // its own object scope
  }
}
</code>
It can sometimes happen that a nested object wants to invoke a method dynamically (rather than statically) on one of its outer objects. This is possible by aliasing ''self'' in the outer object, e.g.:
<code>
extend: UnitTest with: {
  def testSomething() {
    def theTest := self;
    def inner := object: {
      def compare(a,b) {
        theTest.assertEquals(a,b);
      }
    };
    inner.compare(1+1,1*2);
  }
}
</code>
In the example above, writing ''self.assertEquals(a,b)'' would fail because ''self'' is then bound to ''inner'', which does not delegate to ''UnitTest''. Similarly, writing ''assertEquals(a,b)'' would fail because ''assertEquals'' is not lexically visible to ''inner'': it is only visible in the object scope of ''theTest''.
===== Methods vs Closures =====
As mentioned previously, AmbientTalk not only allows objects to be nested within other objects, but also allows functions to be nested within other functions. At this point, it becomes important to distinguish between //closures// and //methods//.
In AmbientTalk, a //method// is a function belonging to an object. Characteristically, a method never has a fixed value for ''self'': this value may change depending on which object received the method invocation.
In AmbientTalk, a //closure// is a function that closes over all lexically visible variables, //including// the value of ''self''. A closure is a stand-alone value that is not immediately tied to a particular object.
Methods and closures are defined by means of the same syntax. They are distinguishable only by the context in which they are evaluated. For example:
<code>
def o := object: {
  def x := 5;
  def meth(y) {
    def clo(z) { self.x + y + z }
  };
};
> o.meth(2)(1);
>> 8
</code>
In the above example, ''meth'' is a method of ''o'' because it is defined as a function directly within an object clause. ''clo'', on the other hand, is a closure because it is nested within another method, so it does not belong to an object directly. This example also shows why methods do not close over ''self'' and closures do. A method can be thought of as a closure which is explicitly parameterized with an extra ''self'' variable, filled in by the interpreter upon method invocation. A closure is not parameterized with this extra variable; it simply inherits the value of ''self'' from its enclosing method.
==== Top-level Functions ====
Top-level functions are actually methods in AmbientTalk. This is because all top-level code is treated as the initialization code of that file's module object. Hence, it is legal to use ''self'' in top-level functions.
==== Block closures ====
Block closures, as created by means of the syntax ''{ |args| body }'', always evaluate to closures and hence always capture the value of ''self''.
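To illustrate, here is a small sketch in which a block closure is created inside a method and invoked later. The objects ''base'' and ''derived'' and the method ''adder'' are hypothetical names chosen for this example; the expected results follow from the rule that a block captures the ''self'' of the method invocation in which it was created:

<code>
def base := object: {
  def x := 5;
  def adder() {
    // the block captures the current value of self
    def blk := { |y| self.x + y };
    blk
  };
};
def derived := extend: base with: {
  def x := 6;
};
def addBase := base.adder();       // self captured as base
def addDerived := derived.adder(); // self captured as derived
addBase(1);     // expected: 6
addDerived(1);  // expected: 7
</code>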
===== External Methods and Fields =====
It is possible to define methods and fields externally on an object. For example:
<code>
def calc := object: {
  def add(x,y) { x+y };
};
def offset := 0;
def calc.sub(x,y) { offset + self.add(x,-y) };
</code>
In this example ''sub'' is added to the already existing ''calc'' object. However, what happens to the lexical scope of ''sub''? Is an external definition the same as if ''sub'' had been defined directly in the ''calc'' object or not? Also, if ''self'' is used in an external method, does it refer to ''calc'' or to the object performing the definition?
In AmbientTalk, external methods are a mixture of methods and closures: they capture the lexical scope at the time the method is defined (so any lexically free variables are looked up at the site of definition, preserving proper lexical scoping rules). However, they are not full closures because they do not capture the value of ''self''. Rather, as is appropriate for a method, ''self'' is determined by the object that received the method invocation.
There is one more difference between closures and external methods: next to the special treatment of ''self'', ''super'' is also treated specially for external methods. Within an externally added method, ''super'' will refer to the parent object of the object to which the method is added, not the parent of the object that defined the external method. This means that, in effect, an external method captures only the lexical scope, and not the bindings for ''self'' and ''super''.
The rationale behind this decision is that ''self'' and ''super'' are inherently related to the object hierarchy of which a method is a part. Hence, even if a method is defined externally on an object, ''super'' should properly refer to the parent object of which the method is a part, rather than the lexically visible value of ''super''.
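The following sketch works through these rules on a variant of the ''calc'' example. The value of ''offset'' and the child object ''loudCalc'' are assumptions introduced purely for illustration; the expected results are derived from the semantics described above:

<code>
def calc := object: {
  def add(x,y) { x+y };
};
def offset := 10;
// offset is captured lexically at the definition site;
// self is bound anew on every invocation
def calc.sub(x,y) { offset + self.add(x,-y) };
calc.sub(5,3);      // expected: 12, i.e. 10 + (5 + -3)

// hypothetical child that overrides add
def loudCalc := extend: calc with: {
  def add(x,y) { 100 };
};
// sub is found in calc via delegation, but self is loudCalc,
// so the self-send reaches the overriding add
loudCalc.sub(5,3);  // expected: 110, i.e. 10 + 100
</code>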
External methods, while powerful, introduce many subtle issues in the language. Therefore, this feature may disappear from later releases of AmbientTalk, and it is wise to steer clear of it. For an overview of the problems introduced by external field or method declarations, see Gilad Bracha's blog post on [[http://gbracha.blogspot.com/2008/03/monkey-patching.html|monkey patching]].
===== First-class Methods =====
In AmbientTalk, methods can be accessed as first-class entities. When methods are represented as first-class entities, they are represented as closures whose lexical scope corresponds to the object in which they have been defined. Their hidden ''self'' parameter is bound to the object from which the method was "selected".
Methods become first-class when they are selected from existing objects. Because of the [[at:tutorial:objects#uniform_access|uniform access principle]] introduced previously, it is not possible to "select" a method ''m'' from an object ''o'' simply by writing ''o.m''. Recall that this would instead //execute// the method with zero arguments. So, in order to gain access to a first-class representation of the method, AmbientTalk introduces the //selection operator// ''&'':
<code>
def Math := object: {
  def square(x) { x*x };
};
def squareClo := Math.&square;
> squareClo(2)
>> 4
</code>
Slot selection automatically lifts a method slot into a closure. The closure can subsequently be passed around as a stand-alone value. When the method closure is invoked, the original method is run with ''self'' bound to the object from which the method was selected. Here's another example:
<code>
def Adder := object: {
  def amount := 0;
  def init(amnt) {
    amount := amnt;
  };
  def addTo(x) {
    x + self.amount
  };
};
> [1,2,3].map: Adder.new(1).&addTo
>> [2,3,4]
</code>
The ''addTo'' method is selected as a closure from a new ''Adder'' object whose ''amount'' field is initialized to 1. Subsequently, that closure is applied to each element of a table, yielding a new table whose values are incremented by one.
The selection operator is also required when accessing lexically visible methods as first-class closures:
<code>
def o := object: {
  def m() { ... };
  def grabTheMethod() { &m };
}
</code>
Selection using ''&'' adheres to the uniform access principle in the sense that it is possible to select a field from an object, just like it is possible to select a method from an object. When selecting a field from an object, the resulting closure is an //accessor// for the field, i.e. a nullary closure that upon application returns the field's value:
<code>
def o := object: {
  def x := 5;
};
def f := o.&x;
> f
>>
> o.x := 6; f()
>> 6
</code>
In the same vein, one may select a mutator method for a field ''x'' by evaluating ''&x:=''. A mutator is a unary closure that assigns its single argument to the field it encapsulates.
===== First-class Messages =====
Next to first-class methods, AmbientTalk also supports first-class messages. Messages are naturally represented as objects. They encapsulate a receiver, a selector and a table of actual arguments. AmbientTalk is definitely not the first language to treat messages as first-class citizens this way. For example, Smalltalk also treats messages as objects.
The main difference with other languages is that AmbientTalk provides an expressive syntax to define first-class messages and to enable the sending of a first-class message to an object. In AmbientTalk, a "receiverless" message send evaluates to a first-class message. For example:
<code>
def msg := .add(1,2);
>msg.selector
>>
>msg.arguments
>>[1,2]
</code>
This syntax is supported for all of AmbientTalk's message sending operators. Hence, it is possible to write ''^add(1,2)'' to express a first-class delegation message.
An AmbientTalk message supports the operation ''sendTo(receiver,sender)'' which makes the ''sender'' object send the message to the given receiver object. Hence, it is possible to write ''msg.sendTo(calc,self)'' to send the ''add'' message to a calculator object. This is similar to Smalltalk's ''obj perform: msg'' operation. However, AmbientTalk also provides the following shorthand syntax:
<code>
calc <+ msg
</code>
The ''<+'' operator sends a first-class message object to a receiver object, exactly as if the message had been hard-coded in the program text.
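The three ways of sending the same message can be compared side by side. This is a minimal sketch: the ''calc'' object is redefined here only to keep the example self-contained, and the expected results are what the description above implies rather than recorded interpreter output:

<code>
// hypothetical calculator object, defined here only for illustration
def calc := object: {
  def add(x,y) { x+y };
};
def msg := .add(1,2);
calc.add(1,2);          // ordinary message send, expected: 3
msg.sendTo(calc, self); // explicit send of the first-class message, expected: 3
calc <+ msg;            // shorthand for the same send, expected: 3
</code>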
First-class messages are often useful to implement [[http://www.metaobject.com/papers/Higher_Order_Messaging_OOPSLA_2005.pdf|higher-order messaging]], i.e. message sends that are parameterized with other message sends. For example, if ''map:'' were redefined as a higher-order method taking a message instead of a closure as a parameter, one could write ''[1,2,3].map: .+(1)'' rather than ''[1,2,3].map: { |e| e+1 }''.
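Without redefining ''map:'', a similar effect can be approximated with a small helper. The sketch below assumes a hypothetical function ''mapMsg'', which is not part of AmbientTalk's standard library, and simply forwards the message to every element using ''<+'':

<code>
// mapMsg is a hypothetical helper, introduced only for this example
def mapMsg(tbl, msg) {
  // send the first-class message to each element in turn
  tbl.map: { |e| e <+ msg }
};
def inc := .+(1);
mapMsg([1,2,3], inc);  // expected: [2,3,4]
</code>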