JIT Rete

Printer-friendly version
Info
Prerequisites: 

Prior knowledge of production systems and/or the RETE-algorithm are a plus.

Prior knowledge of meta-programming (such as the meta-object protocol) is a plus.

Context: 

The traditional thin-client model is replaced by a model where clients continuously cooperate via the cloud, in order to give the user a rich experience. Well-known examples of such cloud applications are Gmail and Google Docs. The client-side of these cloud applications often runs in the browser while the server-side is deployed in the cloud.

Some collaborative cloud applications serve a massive number of participants who generate huge amounts of data. A simple example of such an application is Twitter. Every time a user sends a tweet to the cloud, the cloud immediately processes this information and notifies an (often small) interested subset of its users (those who are subscribed to the one who tweets) of this new tweet. Twitter is however a dull example because users manually indicate in which users and hash tags they are interested. A more involved example is a dating service that allows its users to upload a profile and (arbitrarily complicated) predicates to constrain potential new contacts. Based on these constraints, the cloud decides which user accounts have a match, and should hence exchange information such as location and contact information with other specific users. These constraints and profiles can, however, continuously change (e.g. a mobile client of this application might always send its location to the server) and this new information needs to be processed in near real-time by the cloud in order to find possible new matches.

This is the type of  cloud-applications we envision in this thesis. To handle such a continuous stream of information, production rule systems are often used.  A well-known implementation of production rule systems is the RETE algorithm [1][2]. This algorithm was designed to detect patterns in large amounts of data. RETE performs well under these circumstances because it incrementally keeps track of intermediate (pattern) matches. The RETE-algorithm, however, was developed in the eighties. Since then, the landscape of how computers are used has changed drastically and the amount of data that sits on a device and that is sent over a network (to the cloud) has increased dramatically. Over the past few decades, numerous optimizations for RETE have been developed to cope with the increasing amount of data that needs to be processed [3][4]. Certain optimizations/implementations expose better performance characteristics when a significant amount data is being removed but expose worse performance characteristics when data is being added to the system. Other optimizations/implementations of the RETE-algorithm perform better under yet other circumstances. 

Goal & Research Activities: 

In this thesis,  the student is expected to create a Just-In-Time RETE; a version of RETE that changes its inner working (at runtime) based on the actual circumstances.  For instance, if it can be expected that a lot of deletions will occur in the near future, the production system should adopt the behavior specified by the TREAT-algorithm (a variation on RETE that that performs well for deletions, at the cost of slightly higher insertion-costs). Alternatively, if a lot of updates to existing data can be expected, the production system should change its evaluation strategy to yet another optimization.

This thesis builds on earlier work, which entails that the student will be given an implementation of the RETE-algorithm and some optimizations. The student can therefor focus on defining transformation strategies in order to switch from one optimization to another based on actual circumstances.

To investigate this, the Master student will have to (at least):

-       Implement a use case such as the dating application.

-       Delve into related work of topics including the RETE algorithm, rule-based reasoning and production systems.

-    Get acquianted with an implementation of the RETE-algorithm.

-    Modify the algorithm in such a way that it can change from one optimization to the other at run-time.

-       Validate his/her work by comparing the performance of his/her implementation of JIT RETE against the version of RETE that does not change its evaluation strategy at run-time.

References: 

[1] http://en.wikipedia.org/wiki/Rete_algorithm

[2] C. Forgy, “Rete: A fast algorithm for the many pattern/many object pattern match problem,” Artificial Intelligences, vol. 19, no. 1, pp. 17– 37, 1982.

[3] Miranker, Daniel P. TREAT: A Better Match Algorithm for AI Production Systems; Long Version. University of Texas at Austin, 1987.

[4] Wright, Ian, and James A. R. Marshall. "The execution kernel of RC++: RETE*, a faster RETE with TREAT as a special case." Int. J. Intell. Games & Simulation 2.1 (2003): 36-48.