Template Terms

Template terms are a unique feature of SOUL designed to facilitate detecting software patterns. Without this feature, a pattern's characteristics can only be specified through a logic query that quantifies over a reified program representation. As illustrated by the motivating example, such queries tend to become convoluted and require expert knowledge to understand, especially when the pattern's implementation variants need to be recalled. Template terms enable exemplifying a pattern's characteristics through a code excerpt, thus hiding the details of the program representation and its reification:

if jtStatement(?st){ ?x = (?type) ?e; }

The above query contains a template term for a Java statement. It consists of a functor jtStatement, a single argument ?st and a code excerpt delimited by braces. The functor of the template term identifies the grammar rule adhered to by the code excerpt. (The prefix jt of the functor discerns template terms for Java statements from those for Smalltalk statements, which start with the st prefix.) This grammar describes the concrete syntax of Java, extended with logic variables and a minimum of non-native syntax. The above excerpt exemplifies an expression statement (i.e., a statement that wraps an expression). The expression assigns, to a left-hand side ?x, the result of a cast to a type ?type of an expression ?e. Within a code excerpt, logic variables stand for productions that originate from a non-terminal in the Java grammar. They indicate explicit points of variation among pattern instances.

Template terms can be used anywhere a regular logic term is allowed. Used as a condition in a logic query or rule, a template term succeeds if there is an AST node from the program under investigation that matches the code excerpt. Matching AST nodes exhibit the characteristics exemplified by the source code excerpt of the template term. Backtracking over the term (e.g., when more solutions to a query are requested) successively unifies each matching node with the argument of the term. Variables within the excerpt get bound as well. For a base program with statements "a=(Integer)b;" and "y=(Temp)getTemp();", solutions to the above query therefore comprise one with variable bindings [?st→a=(Integer)b;, ?x→a, ?type→Integer, ?e→b] and one with variable bindings [?st→y=(Temp)getTemp();, ?x→y, ?type→Temp, ?e→getTemp()].
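
To make the two solutions concrete, the following is a minimal sketch of a base program that contains both statements. Only the two cast assignments come from the text; the class Temp, the class CastExamples, the method bodies and the literal values are assumptions added to make the excerpt compile.

```java
// Hypothetical base program. Only the two cast assignments are taken from
// the text; every other name and value is an illustrative assumption.
class Temp {
    final int degrees;

    Temp(int degrees) {
        this.degrees = degrees;
    }
}

class CastExamples {
    // Returns Object so that the caller needs a cast, as in the text.
    static Object getTemp() {
        return new Temp(21);
    }

    static int run() {
        Object b = Integer.valueOf(42);
        Integer a;
        Temp y;

        a = (Integer) b;       // solution 1: ?x -> a, ?type -> Integer, ?e -> b
        y = (Temp) getTemp();  // solution 2: ?x -> y, ?type -> Temp, ?e -> getTemp()

        // No cast assignment here, so this line is not among the two solutions.
        return a + y.degrees;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The declarations are kept separate from the assignments on purpose: the template exemplifies an expression statement, and the text does not say whether a declaration with a cast initializer would also match.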

Motivating Example Revisited

Consider the coding convention from the motivating example again. It requires Component subclasses to define an acceptVisitor method that prints a message before double dispatching to its parameter. The template term in the following query exemplifies the characteristics of a class that complies with this convention:

if jtClassDeclaration(?class){
  class !Composite extends* Component {
    ?modList ?type acceptVisitor(?t ?v) {
      System.out.println(?string);
      ?v.?visitMethod(this);
    }
  }
}

The above query is, in contrast to the original one, concise and descriptive. Its template term closely resembles the prototypical implementation of a complying class. Apart from logic variables, only a negation operator (i.e., !Composite to exclude classes named Composite) and a reflexive transitive closure operator (i.e., extends* to include classes that extend Component indirectly) had to be added.
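
For comparison, a prototypical complying class might look like the sketch below. Component and acceptVisitor are named in the text; Visitor, PlainLeaf, CountingVisitor, visitPlainLeaf and the printed message are hypothetical names chosen for illustration, not classes from the motivating example.

```java
// Hypothetical scaffolding: only Component, acceptVisitor and the shape of
// the method body come from the template term. All other names are assumed.
interface Visitor {
    void visitPlainLeaf(PlainLeaf leaf);
}

abstract class Component {
    public abstract void acceptVisitor(Visitor v);
}

class PlainLeaf extends Component {
    public void acceptVisitor(Visitor v) {                  // ?modList ?type acceptVisitor(?t ?v)
        System.out.println("PlainLeaf accepts a visitor");  // System.out.println(?string);
        v.visitPlainLeaf(this);                             // ?v.?visitMethod(this);
    }
}

// Small visitor used only to observe the double dispatch.
class CountingVisitor implements Visitor {
    int visits = 0;

    public void visitPlainLeaf(PlainLeaf leaf) {
        visits++;
    }
}

class PlainLeafDemo {
    public static void main(String[] args) {
        CountingVisitor visitor = new CountingVisitor();
        new PlainLeaf().acceptVisitor(visitor);
        System.out.println("visits: " + visitor.visits);
    }
}
```

The comments on the right pair each line of the method with the line of the template term it corresponds to.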

The following screenshot lists the classes from the motivating example that match the template term. They comply with the coding convention, although they differ from the prototypical complying class: 


 

In class SuperLogLeaf, for instance, the message is printed through a super invocation. To detect such implementation variants, template terms are matched according to multiple strategies that vary in leniency. The most lenient one, for instance, only requires matches for the template term to exhibit all exemplified control flow characteristics. There should be a path (i.e., existentially quantified) through the control flow graph of method acceptVisitor on which the logging instruction and the double dispatching occur in the exemplified order.
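
The sketch below shows one way such a variant could be written. Only the class name SuperLogLeaf and the fact that the message is printed through a super invocation come from the text. The intermediate class LoggingComponent, its helper logAccept, the visitor types and the messages are assumptions; the actual class may well invoke a different inherited method.

```java
// Hypothetical reconstruction of a variant that logs through a super call.
interface Visitor {
    void visitSuperLogLeaf(SuperLogLeaf leaf);
}

abstract class Component {
    public abstract void acceptVisitor(Visitor v);
}

// Assumed intermediate class that owns the logging behaviour.
abstract class LoggingComponent extends Component {
    protected void logAccept() {
        System.out.println("accepting a visitor");
    }
}

class SuperLogLeaf extends LoggingComponent {
    public void acceptVisitor(Visitor v) {
        super.logAccept();          // the println is reached only via the super invocation
        v.visitSuperLogLeaf(this);  // the double dispatch follows the logging on this path
    }
}

// Visitor that prints, so the order of both steps is visible on stdout.
class PrintingVisitor implements Visitor {
    public void visitSuperLogLeaf(SuperLogLeaf leaf) {
        System.out.println("visiting SuperLogLeaf");
    }
}

class SuperLogLeafDemo {
    public static void main(String[] args) {
        new SuperLogLeaf().acceptVisitor(new PrintingVisitor());
    }
}
```

The body of acceptVisitor contains no println statement, so a purely syntactic comparison with the template would fail. Following the call into the superclass yields a path on which logging precedes the double dispatch.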

The detected instances are ranked by the extent to which they exhibit the exemplified characteristics, which facilitates their assessment. Class MustAliasLeaf is, for instance, ranked higher than class MayAliasLeaf. In MustAliasLeaf, the double dispatching occurs in all executions of the program. In MayAliasLeaf, in contrast, the required double dispatching only occurs when the user inputs an even number. SOUL ranks its results through a quantified variant of resolution that is based on fuzzy logic theory.

Both MustAliasLeaf and MayAliasLeaf were only detected because SOUL employs a domain-specific unification procedure under which multiple occurrences of the same variable express a data flow characteristic: the parameter of method acceptVisitor and the receiver of message ?visitMethod should evaluate to the same object at run-time. The unification procedure consults static analyses to determine whether this is actually the case.
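
The difference between the two classes can be sketched as follows. This is a hypothetical reconstruction: the class names and the even-number condition come from the text, while the method bodies, the visitor types, the fallback visitor and the userInput field (standing in for a number read from the user) are assumptions.

```java
// Hypothetical reconstruction; only the class names MustAliasLeaf and
// MayAliasLeaf and the even-number condition are taken from the text.
interface Visitor {
    void visitMustAliasLeaf(MustAliasLeaf leaf);
    void visitMayAliasLeaf(MayAliasLeaf leaf);
}

abstract class Component {
    public abstract void acceptVisitor(Visitor v);
}

class MustAliasLeaf extends Component {
    public void acceptVisitor(Visitor v) {
        System.out.println("MustAliasLeaf accepts a visitor");
        Visitor alias = v;               // alias and v denote the same object on every run
        alias.visitMustAliasLeaf(this);
    }
}

class MayAliasLeaf extends Component {
    private final int userInput;         // stands in for a number entered by the user
    private final Visitor fallback;      // assumed alternative receiver

    MayAliasLeaf(int userInput, Visitor fallback) {
        this.userInput = userInput;
        this.fallback = fallback;
    }

    public void acceptVisitor(Visitor v) {
        System.out.println("MayAliasLeaf accepts a visitor");
        // The receiver aliases the parameter only for even input.
        Visitor receiver = (userInput % 2 == 0) ? v : fallback;
        receiver.visitMayAliasLeaf(this);
    }
}

class CountingVisitor implements Visitor {
    int visits = 0;

    public void visitMustAliasLeaf(MustAliasLeaf leaf) { visits++; }

    public void visitMayAliasLeaf(MayAliasLeaf leaf) { visits++; }
}

class AliasDemo {
    public static void main(String[] args) {
        CountingVisitor parameter = new CountingVisitor();
        CountingVisitor fallback = new CountingVisitor();
        new MustAliasLeaf().acceptVisitor(parameter);
        new MayAliasLeaf(2, fallback).acceptVisitor(parameter);
        new MayAliasLeaf(3, fallback).acceptVisitor(parameter);
        System.out.println("parameter visits: " + parameter.visits);
        System.out.println("fallback visits: " + fallback.visits);
    }
}
```

In neither class is the receiver written as v, so matching both occurrences of ?v by name would miss them. Whether the receiver and the parameter denote the same object is a question about run-time values, which is why the unification procedure relies on static analyses.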