Transactional Tasks: Artifacts

Author
Janwillem Swalens (Software Languages Lab, Vrije Universiteit Brussel)
Version
2016-05-04
Latest version at
http://soft.vub.ac.be/~jswalens/ecoop-2016-artifact

Abstract

Transactional tasks are a mechanism to introduce parallelism in a transaction. We extended Clojure with support for transactional tasks. In our fork of Clojure, futures can be created in a transaction, with access to the encapsulating transaction's state.

These artifacts accompany the paper "Transactional Tasks: Parallelism in Software Transactions". Paper abstract:

Many programming languages, such as Clojure, Scala, and Haskell, support different concurrency models. In practice these models are often combined, however the semantics of the combinations are not always well-defined. In this paper, we study the combination of futures and Software Transactional Memory. Currently, futures created within a transaction cannot access the transactional state safely, violating the serializability of the transactions and leading to unexpected behavior.

We define transactional tasks: a construct that allows futures to be created in transactions. Transactional tasks allow the parallelism in a transaction to be exploited, while providing safe access to the state of their encapsulating transaction. We show that transactional tasks have several useful properties: they are coordinated, they maintain serializability, and they do not introduce non-determinism. As such, transactional tasks combine futures and Software Transactional Memory, allowing the potential parallelism of a program to be fully exploited, while preserving the properties of the separate models where possible.

As part of the paper, we created three artifacts, described on this page:

  1. ClojureTxTk: a fork of Clojure with support for transactional tasks.
  2. LabyrinthTxTk and BayesTxTk: two applications from the STAMP benchmark suite, extended to use transactional tasks, written in ClojureTxTk.
  3. TxTkRedex: a machine-executable implementation of the operational semantics in PLT Redex.

Package overview

This package contains the following files, discussed on this page:

ClojureTxTk

ClojureTxTk is a fork of Clojure 1.6.0, extending Clojure with support for transactional tasks. You can get:

  1. The JAR (version 2.3, paper version), to run ClojureTxTk programs
  2. The source code (version 2.3, paper version), to compile the JAR manually
  3. The source code on GitHub (branch transactional-futures), to compile the JAR manually with the latest version

The implementation of transactional tasks is discussed in Section 6 of the paper. Some simplifications were made in the paper. The classes Transaction and TxTask in Listing 5 of the paper correspond to the files LockingTransaction.java and TransactionalFuture.java in src/jvm/clojure/lang.

How to build (optional)

You can either download the compiled JAR directly above (recommended), or download the source code (clojure-1.6.0-txtk-2.3-src.zip) and build the JAR yourself, using the following command:

$ mvn package

This requires Apache Maven. On the first run, Maven will download all necessary dependencies. This process should end in BUILD SUCCESS, generating a JAR that is stored in target/clojure-1.6.0.jar. This JAR is equivalent to the file clojure-1.6.0-txtk-2.3.jar included in this package. The command might take several minutes to complete, as it runs an extensive test suite, comprised of the test suite included in Clojure extended with additional tests for transactional tasks.

How to run

To start the REPL of ClojureTxTk, run:

$ java -cp clojure-1.6.0-txtk-2.3.jar clojure.main
Clojure 1.6.0
user=>

Where clojure-1.6.0-txtk-2.3.jar refers to the file downloaded or generated in the previous step.

To run a ClojureTxTk program, for instance the file hello-world.clj, use the command:

$ java -cp clojure-1.6.0-txtk-2.3.jar clojure.main hello-world.clj
Hello World

A simple example program using transactional tasks is included in hello-transactional-tasks.clj:

(def a (ref 0))

(dosync
  (let [f (future (ref-set a 1))
        g (future (ref-set a 2))]
    (println "a starts 0:" (deref a))
    @f
    (println "a is now 1:" (deref a))
    @g
    (println "a is now 2:" (deref a))))

(println "a is 2 in the end:" (deref a))

Note: the artifact uses Clojure's syntax (e.g. future, @, dosync), while in the paper we use a language-independent syntax (e.g. fork, join, atomic). These syntactical differences are discussed in Appendix A.1 of the paper.

This program will always produce the same output, due to the in-transaction determinacy of transactional tasks:

$ java -cp clojure-1.6.0-txtk-2.3.jar clojure.main hello-transactional-tasks.clj
a starts 0: 0
a is now 1: 1
a is now 2: 2
a is 2 in the end: 2

For comparison, we have also included the original Clojure 1.6.0, downloaded from the Clojure website, in clojure-1.6.0-original.jar. Running the example program with the original Clojure leads to an error:

$ java -cp clojure-1.6.0-original.jar clojure.main hello-transactional-tasks.clj
a starts 0: 0
Exception in thread "main" java.util.concurrent.ExecutionException:
java.lang.IllegalStateException: No transaction running
[...]

By modifying the program to encapsulate each ref-set in another transaction ((future …) to (future (dosync …))), the error is avoided, but the end result may be 1 or 2. Furthermore, the futures may be executed multiple times. This modification does not influence the result when using ClojureTxTk: it is always 2.

Note: print statements in a transaction might be re-executed, as these are not undone when the transaction is rolled back.

Below we will run the Labyrinth and Bayes applications.

LabyrinthTxTk and BayesTxTk

We ported two applications from the STAMP benchmark suite, Labyrinth and Bayes, from C++ to Clojure. Next, we added transactional tasks where they were suitable to increase the potential parallelism of the applications.

  1. Original STAMP benchmark suite in C++
  2. LabyrinthTxTk: source code (without transactional tasks, v2.5), uberjar (without transactional tasks, v2.5), source code (with transactional tasks, v3.2), uberjar (with transactional tasks, v3.2)
  3. BayesTxTk: source code (v9), uberjar (v9)
  4. Scripts and input files used to run benchmarks (branches labyrinth, bayes, and bayes-bar)
There are two ways to run the applications:

We have generated the uberjar for you, so that installing Leiningen is not needed. We still explain the how to do so manually below, for the interested reader.

How to compile the applications to an uberjar (optional)

  1. Install Leiningen by downloading the lein script on their website and putting in your $PATH.
  2. In the projects' source code directory, run lein uberjar. This generates a file target/uberjar/XXX-standalone.jar. This file is equivalent to the corresponding JAR in the stamp/ folder of this package.

How to run LabyrinthTxTk

There are two versions of the LabyrinthTxTk application:

These commands run the main function of the application, with on the class path ClojureTxTk and the application. If you generated the jars yourself, you might need to modify the paths.

These commands run the applications with the default parameters. You can pass the option --help to see which parameters can be varied. Most importantly, these are:

To generate the graphs in the paper (Figure 8 and 9), we ran:

java -cp clojure-1.6.0-txtk-2.3.jar:stamp/labyrinth-llcc-2.5.jar labyrinth.main -i stamp/inputs/random-x64-y64-z3-n32.txt -t 4
java -cp clojure-1.6.0-txtk-2.3.jar:stamp/labyrinth-pbcc-3.2.jar labyrinth.main -i stamp/inputs/random-x64-y64-z3-n32.txt -t 2 -a 8

Both t and a varied from 1 to 64 in powers of two.

How to run BayesTxTk

You can run the BayesTxTk application using:

$ java -cp clojure-1.6.0-txtk-2.3.jar:stamp/bayes-9.jar bayes.main

The option --help prints a list of possible options (and executes the application with the default parameters afterwards). The most important options are:

To generate the graphs in the paper (Figure 12), we ran:

java -cp clojure-1.6.0-txtk-2.3.jar:stamp/bayes-9.jar bayes.main -n 5 -r 512 -v 48 -t 4
java -cp clojure-1.6.0-txtk-2.3.jar:stamp/bayes-9.jar bayes.main -n 5 -r 512 -v 48 -t 4 --variations alternatives-parallel

varying t from 1 to 128.

TxTkRedex

We created a machine-executable implementation the operational semantics described in the paper, using PLT Redex.

  1. PLT Redex implementation

The different files define several "languages", corresponding to those described in the paper:

  1. Lb: the "base" language (based on Clojure) (file clj-base.rkt)
  2. Lf: the base language extended with support for futures (as in Clojure) (file clj-futures.rkt)
  3. Lt: Lf extended with support for transactions (as in Clojure) (file clj-tx.rkt)
  4. Ltf: Lt extended with support for transactional futures (not in Clojure) (file tx-futs.rkt)

Furthermore, each file contains some test cases illustrating how different example programs reduce. You can execute these files using DrRacket. For instance, tx-futs.rkt contains the following test case (named example-tx-futs-3):

(let [(r_0 (ref 0))
      (f_0 (fork
             (atomic
               (let [(x (fork (ref-set r_0 (+ (deref r_0) 1))))
                     (y (fork (ref-set r_0 (+ (deref r_0) 2))))]
                 (do
                   (join x)         ; => r_0 = original + 1
                   (join y)         ; => r_0 = original + 2
                   (deref r_0)))))) ; => returns original + 2
      (f_1 (fork
             (atomic
               (let [(x (fork (ref-set r_0 (+ (deref r_0) 3))))
                     (y (fork (ref-set r_0 (+ (deref r_0) 4))))]
                 (do
                   (join x)          ; => r_0 = original + 3
                   (join y)          ; => r_0 = original + 4
                   (deref r_0))))))] ; => returns original + 4
  (+ (join f_0) (join f_1)))

There are two possible outcomes of this program:

(term [[(f_0 8) (f_new1 6) (f_new 2)]
       [(r_new 6) (r_new 5) (r_new 2) (r_new 1) (r_new 0)]])
(term [[(f_0 10) (f_new1 4) (f_new 6)]
       [(r_new 6) (r_new 5) (r_new 4) (r_new 3) (r_new 0)]])

These terms contains two mappings. First, the list of futures and their end result, e.g. for the first outcome the root future f_0 results in 8, the first future f_new1 in 6 and the second future f_new in 2. Second, it contains the transactional heap, mapping transactional variables to their value. The transactional heap contains a history, e.g. for the first outcome r_new was initialized to 0, and held the values 1, 2, 5, and 6 over the execution of the program.

The two outcomes correspond to the two possible serializations of the given program: the first corresponds to transaction 1 executing first (end result is 8), the second to transaction 2 executing first (end result is 10).

Furthermore, PLT Redex can visualize these results using:

(traces ->tf example-tx-futs-3)

This is shown in the image in the image below (a larger version is in the file redex/example-tx-futs-3.pdf). You can see how different interleavings lead to different intermediate states, but they collapse into two final states, corresponding to the two possible serializations. Transactions (atomic blocks) are evaluated in a single step, illustrating the atomicity of a transaction.