Transactional tasks are a mechanism to introduce parallelism in a transaction. We extended Clojure with support for transactional tasks. In our fork of Clojure, futures can be created in a transaction, with access to the encapsulating transaction's state.
These artifacts accompany the paper "Transactional Tasks: Parallelism in Software Transactions". Paper abstract:
Many programming languages, such as Clojure, Scala, and Haskell, support different concurrency models. In practice these models are often combined, however the semantics of the combinations are not always well-defined. In this paper, we study the combination of futures and Software Transactional Memory. Currently, futures created within a transaction cannot access the transactional state safely, violating the serializability of the transactions and leading to unexpected behavior.
We define transactional tasks: a construct that allows futures to be created in transactions. Transactional tasks allow the parallelism in a transaction to be exploited, while providing safe access to the state of their encapsulating transaction. We show that transactional tasks have several useful properties: they are coordinated, they maintain serializability, and they do not introduce non-determinism. As such, transactional tasks combine futures and Software Transactional Memory, allowing the potential parallelism of a program to be fully exploited, while preserving the properties of the separate models where possible.
As part of the paper, we created three artifacts, described on this page:
This package contains the following files, discussed on this page:
ClojureTxTk is a fork of Clojure 1.6.0, extending Clojure with support for transactional tasks. You can get:
The implementation of transactional tasks is discussed in Section 6 of the paper. Some simplifications were made in the paper. The classes Transaction and TxTask in Listing 5 of the paper correspond to the files LockingTransaction.java and TransactionalFuture.java in src/jvm/clojure/lang.
You can either download the compiled JAR directly above (recommended), or download the source code (clojure-1.6.0-txtk-2.3-src.zip) and build the JAR yourself, using the following command:
$ mvn package
This requires Apache Maven. On the first run, Maven will download all necessary dependencies. This process should end in BUILD SUCCESS, generating a JAR that is stored in target/clojure-1.6.0.jar. This JAR is equivalent to the file clojure-1.6.0-txtk-2.3.jar included in this package. The command might take several minutes to complete, as it runs an extensive test suite, comprised of the test suite included in Clojure extended with additional tests for transactional tasks.
To start the REPL of ClojureTxTk, run:
$ java -cp clojure-1.6.0-txtk-2.3.jar clojure.main Clojure 1.6.0 user=>
Where clojure-1.6.0-txtk-2.3.jar refers to the file downloaded or generated in the previous step.
To run a ClojureTxTk program, for instance the file hello-world.clj, use the command:
$ java -cp clojure-1.6.0-txtk-2.3.jar clojure.main hello-world.clj Hello World
A simple example program using transactional tasks is included in hello-transactional-tasks.clj:
(def a (ref 0)) (dosync (let [f (future (ref-set a 1)) g (future (ref-set a 2))] (println "a starts 0:" (deref a)) @f (println "a is now 1:" (deref a)) @g (println "a is now 2:" (deref a)))) (println "a is 2 in the end:" (deref a))
Note: the artifact uses Clojure's syntax (e.g.
dosync), while in the paper we use a language-independent syntax (e.g.
atomic). These syntactical differences are discussed in Appendix A.1 of the paper.
This program will always produce the same output, due to the in-transaction determinacy of transactional tasks:
$ java -cp clojure-1.6.0-txtk-2.3.jar clojure.main hello-transactional-tasks.clj a starts 0: 0 a is now 1: 1 a is now 2: 2 a is 2 in the end: 2
For comparison, we have also included the original Clojure 1.6.0, downloaded from the Clojure website, in clojure-1.6.0-original.jar. Running the example program with the original Clojure leads to an error:
$ java -cp clojure-1.6.0-original.jar clojure.main hello-transactional-tasks.clj a starts 0: 0 Exception in thread "main" java.util.concurrent.ExecutionException: java.lang.IllegalStateException: No transaction running [...]
By modifying the program to encapsulate each
ref-set in another transaction (
(future …) to
(future (dosync …))), the error is avoided, but the end result may be 1 or 2. Furthermore, the futures may be executed multiple times. This modification does not influence the result when using ClojureTxTk: it is always 2.
Note: print statements in a transaction might be re-executed, as these are not undone when the transaction is rolled back.
Below we will run the Labyrinth and Bayes applications.
We ported two applications from the STAMP benchmark suite, Labyrinth and Bayes, from C++ to Clojure. Next, we added transactional tasks where they were suitable to increase the potential parallelism of the applications.
lein runcommand provided by Leiningen. This means you need to install Leiningen. It will automatically install all other dependencies of the applications on your machine.
lein uberjar, and then run that using Java. An uberjar contains the application and all of its dependencies. You need to install Leiningen to generate the uberjar, but you don't need it to run the application afterwards.
We have generated the uberjar for you, so that installing Leiningen is not needed. We still explain the how to do so manually below, for the interested reader.
lein uberjar. This generates a file target/uberjar/XXX-standalone.jar. This file is equivalent to the corresponding JAR in the stamp/ folder of this package.
There are two versions of the LabyrinthTxTk application:
$ java -cp clojure-1.6.0-txtk-2.3.jar:stamp/labyrinth-llcc-2.5.jar labyrinth.main -i stamp/inputs/random-x64-y64-z3-n32.txt
$ java -cp clojure-1.6.0-txtk-2.3.jar:stamp/labyrinth-pbcc-3.2.jar labyrinth.main -i stamp/inputs/random-x64-y64-z3-n32.txt
These commands run the main function of the application, with on the class path ClojureTxTk and the application. If you generated the jars yourself, you might need to modify the paths.
These commands run the applications with the default parameters. You can pass the option --help to see which parameters can be varied. Most importantly, these are:
To generate the graphs in the paper (Figure 8 and 9), we ran:
java -cp clojure-1.6.0-txtk-2.3.jar:stamp/labyrinth-llcc-2.5.jar labyrinth.main -i stamp/inputs/random-x64-y64-z3-n32.txt -t 4 java -cp clojure-1.6.0-txtk-2.3.jar:stamp/labyrinth-pbcc-3.2.jar labyrinth.main -i stamp/inputs/random-x64-y64-z3-n32.txt -t 2 -a 8
Both t and a varied from 1 to 64 in powers of two.
You can run the BayesTxTk application using:
$ java -cp clojure-1.6.0-txtk-2.3.jar:stamp/bayes-9.jar bayes.main
The option --help prints a list of possible options (and executes the application with the default parameters afterwards). The most important options are:
To generate the graphs in the paper (Figure 12), we ran:
java -cp clojure-1.6.0-txtk-2.3.jar:stamp/bayes-9.jar bayes.main -n 5 -r 512 -v 48 -t 4 java -cp clojure-1.6.0-txtk-2.3.jar:stamp/bayes-9.jar bayes.main -n 5 -r 512 -v 48 -t 4 --variations alternatives-parallel
varying t from 1 to 128.
We created a machine-executable implementation the operational semantics described in the paper, using PLT Redex.
The different files define several "languages", corresponding to those described in the paper:
Furthermore, each file contains some test cases illustrating how different example programs reduce. You can execute these files using DrRacket. For instance, tx-futs.rkt contains the following test case (named example-tx-futs-3):
(let [(r_0 (ref 0)) (f_0 (fork (atomic (let [(x (fork (ref-set r_0 (+ (deref r_0) 1)))) (y (fork (ref-set r_0 (+ (deref r_0) 2))))] (do (join x) ; => r_0 = original + 1 (join y) ; => r_0 = original + 2 (deref r_0)))))) ; => returns original + 2 (f_1 (fork (atomic (let [(x (fork (ref-set r_0 (+ (deref r_0) 3)))) (y (fork (ref-set r_0 (+ (deref r_0) 4))))] (do (join x) ; => r_0 = original + 3 (join y) ; => r_0 = original + 4 (deref r_0))))))] ; => returns original + 4 (+ (join f_0) (join f_1)))
There are two possible outcomes of this program:
(term [[(f_0 8) (f_new1 6) (f_new 2)] [(r_new 6) (r_new 5) (r_new 2) (r_new 1) (r_new 0)]])
(term [[(f_0 10) (f_new1 4) (f_new 6)] [(r_new 6) (r_new 5) (r_new 4) (r_new 3) (r_new 0)]])
These terms contains two mappings. First, the list of futures and their end result, e.g. for the first outcome the root future
f_0 results in 8, the first future
f_new1 in 6 and the second future
f_new in 2. Second, it contains the transactional heap, mapping transactional variables to their value. The transactional heap contains a history, e.g. for the first outcome
r_new was initialized to 0, and held the values 1, 2, 5, and 6 over the execution of the program.
The two outcomes correspond to the two possible serializations of the given program: the first corresponds to transaction 1 executing first (end result is 8), the second to transaction 2 executing first (end result is 10).
Furthermore, PLT Redex can visualize these results using:
(traces ->tf example-tx-futs-3)
This is shown in the image in the image below (a larger version is in the file redex/example-tx-futs-3.pdf). You can see how different interleavings lead to different intermediate states, but they collapse into two final states, corresponding to the two possible serializations. Transactions (
atomic blocks) are evaluated in a single step, illustrating the atomicity of a transaction.