Taking Just-in-Time Data Structures To New Heights

Printer-friendly version

You should have an affinity for programming language implementation and data structures.


Today's software is so large and complex, it's hard for a developer to choose one particular data structure for the lifetime of the application, and often one data structure is not optimal in terms of performance.  Because applications have different phases -- building up data structures and populating them with data versus searching the data structure and accessing elements, for example -- we believe that the programmer should be relieved of the burden of choosing just one data representation.  The building up phase could benefit from having data laid out in a linked list data structure, whereas accessing elements sequentially from a list performs better if the data is represented by an array.   We propose a concept called Just-in-Time Data Structures (JIT DS) that lets the programmer choose one universal list data structure, while the runtime dynamically changes the underlying representation to give the best performance in terms of memory accesses and execution time.

A PhD student has introduced JIT DS and built a language called JITds to implement them.  A JIT data structure class combines a number of other classes - the potential data representations - and holds the pointer to the current (only one) representation at a particular point in time.  In fact the JIT DS class is a sub-class of all of its different representations, as any of the representations' methods can be called on a JIT DS class instance.  The JIT DS class also defines transition functions, which define how to transition from one representation to another, and swap functions, which define when to transition to another data representation.

Goal & Research Activities: 

The goal of this master's project would be to implement a better version of JITds (the language to implement JIT data structures) based on two existing, but limited, language implementations. A PhD student implemented JITds in both Java and C, but the Java version (a transpiler from JIT DS to Java) suffered from the escaping pointers problem, and the C version was a small prototype.  You would analyze and understand the existing implementations and implement a new fully-fledged JITds in a language of your choice.  Then we could go on to explore new ideas with this implementation, including finding more real-world applications that could take advantage of JIT data structures and doing performance experiments with them.  Or, we could explore various language aspects such as whether the swap statements that trigger a data representation change should be part of the inheritance hierarchy or not.  Another possibility is to incorporate some dynamic analysis into these swap rules in order to avoid excessive overhead of checking whether to swap data structures at runtime.  There are a lot of possibilities to work with this topic, and the student and promoter(s) would come to an agreement about what sounds the most interesting to pursue.  

To get started with this master's thesis, you should read background research on JIT data structures, namely Mattias De Wael's ONWARD paper: http://soft.vub.ac.be/~madewael/jitds/dewael-onward-pp.pdf  and his dissertation (to be provided).