This artifact accompanies the paper “Multiverse Debugging: Non-deterministic Debugging for Non-deterministic Programs”. In the paper we argue that parallelism has become an integral part of modern software ranging from large-scale server code to responsive web applications and networked embedded systems. While a wide range of high-level concurrency abstractions are available for developers, understanding and debugging parallel programs remains challenging. The main reason why parallel programs are so difficult to debug is due to their non-determinism. The state of a parallel program at any given moment in time can alter to one of many possible successor states. As a consequence, it is very difficult to reason about their behavior and to reproduce bugs as they may only manifest in rare execution traces.
In the paper, we propose multiverse debugging, a novel debugging technique for parallel programs which combines online breakpoint-based debugging with state exploration from static techniques. The key idea of multiverse debugging is that non-deterministic programs require non-deterministic debugging. Contrary to current state-of-the-art debuggers, which only execute the program in one execution path (i.e. one universe), a multiverse debugger can observe all possible universes. A multiverse debugger is itself a non-deterministic program which is able to explore all possible states of a parallel program while leveraging breakpoints and stepping commands of online debuggers to interactively search for the root cause of a bug. This means that regular breakpoints become multiverse breakpoints which are potentially triggered multiple times in different universes. As such, a multiverse debugger ensures that if a bug is in the program, it will be observed during the debugging session.
The main contributions of the paper are:
The first contribution is a conceptual contribution and thus it is not backed by the artifact. This artifact contains:
This artifact’s main focus is on explaining Voyager, a proof-of-concept multiverse debugger for AmbientTalk programs. The goal of this tool is to give developers a first impression of what it would feel like to interactively debug a non-deterministic program with a multiverse debugger.
In this artifact, we provide:
VoyagerECOOP2019.ova
).AmbientTalk-Debugger
” stored in the virtual machine at /home/osboxes/demo/DebuggerPaperExample.zip
.The artifact is available for download as VoyagerECOOP2019.ova
at
https://drops.dagstuhl.de/opus/volltexte/2019/10781/
The ova
-file has been tested in Virtual Box versions 5.2.22 and 6.0.4. The virtual machine contains a Debian 4.9.144-3.1 x86_64 GNU/Linux and uses Firefox 60 internally to render the results. The back end within the VM uses Racket v7.2, node 10 and ArangoDB 3.4.4.
The artifact is available under the MIT License.
a11ad9a590566bf6f5af3393aecd80f8
3.4 GiB
Voyager is a web application which executes the operational semantics of a multiverse debugger written in PLT-Redex. Starting the web service, launches the web accessible user interface and an ArangoDB database (in which Voyager stores the program state graph for later querying).
As explained in the paper, in order to build a multiverse debugger, we provide a two-step recipe which entails to define (1) the operational semantics of a non-deterministic base language and then (2) a debugger configuration in terms of the base language semantics. In this artifact, we employ the operational semantics of the AmbientTalk language and its debugger semantics, both of which are implemented in PLT-Redex. As such, target programs are written in PLT-Redex syntax and can be loaded in Voyager for debugging. Voyager asks PLT-Redex to reduce the program according to the debugger semantics, which results in the reduction graph for the target program. All states in this graph are stored in a graph database (i.e ArangoDB) for easy manipulation and exploration of the reduction graph.
Before we detail the functionality of our proof-of-concept multiverse debugger, we provide installation instructions to access the artifact using Virtual Box. Docker users can find dedicated instructions in [@sec:dockerInstall].
We make use of Virtual Box to make this software artifact accessible. In order to install Voyager in Virtual Box, download the Virtualization Format archive (3.1 GiB) here and follow the instructions below.
VoyagerECOOP2019.ova
file.http://localhost:3000/
. If Firefox states that the site is not reachable, click Try again
.Note: A regular Virtual Box user does not need to input any password to reach the end of the installation phase. If you get logged out, you can log in with “demo” as password and user “demo”.
As mentioned, the Voyager user interface is accessible through a modern web browser such as the latest version of Firefox. The default address for the Voyager web service is http://localhost:3000/
.
A screenshot of the Voyager debugger when started is shown [@fig:voyager-overview].
Voyager’s UI consists of two regions: a status and control region on the left-hand side of the web page, and a canvas on the right-hand side one. We will now focus on explaining all the UI elements of Voyager and their basic functionality. The next section shows in detail a debugging session in Voyager using a concrete Use Case.
When the Voyager debugger is started, the user can perform one of five tasks: (1) select a previously loaded program to debug, (2) upload a new program, (3) execute a query over the target program being debugged, (4) upload a new version of the execution semantics of the debugger and (5) adjust the properties of the canvas rendering the execution graph.
In order to quickly test the functionality of the debugger the user can select the program AT - double - MsgReceiverBreakpoint
from the drop-down menu. When clicking on this example program, the canvas region will show a graph.
This graph represents the possible execution states of a program that has a message receiver breakpoint. Specifically it corresponds to the code we will later detail in the use case section.
In order to get a more complete picture of the execution graph the user can zoom out, zoom in, and drag the graph. To zoom in or out, scroll while the mouse is in the canvas. The screenshot below shows the Voyager debugger when the example program has been loaded and the user has zoomed out to get an overview of the graph. The blue nodes in the graph denotes states of the target program and the pink nodes are states in which the debugger is paused.
The graph is automatically styled by moving the nodes in a direction from left to right. Over time the force on these nodes is decreased until they reach a steady state. As shown in the screenshot above, the Voyager also features a menu with 4 buttons in the top right corner of the canvas to aid the developer move and explore the execution graph of the target program.
Recall that the pink nodes in the graph are states in which the debugger is paused. In order to resume execution the programmer can click on the node to reveal a circular menu. With this menu the programmer can execute a number of actions (not all are applicable for each node): (1) run the program, (2) pause the program, (3) step-message-receiver, (4) step-next-turn, (5) pin or unpin the node, and (6) get more information about the node. The icons for those actions in the circular menu are shown below.
\
A node may have a gray border, this indicates that it has not yet been expanded. To expand a node you can click on it, a similar menu to the one above will appear. Click expand button to notify Voyager that it should continue reducing from that node. If multiple nodes need to be expanded a shortcut can be used: Select a node that is an ancestor of the nodes you want to expand and double-click on it while holding shift. This will expand all nodes below the clicked node. Nodes that are being expanded are shown in orange.
Recall that programs to be debugged in Voyager must be written in PLT-Redex. New programs can be added by using the “Create new Program” button . Clicking this button shows a text editor in which a PLT-Redex program term can be typed. Alternatively, there is a “Browse…” button in the top right corner which allows a file containing a PLT-Redex program to be uploaded.
As previously mentioned, the Voyager debugger stores the state graph in a graph database which we can use to query the state of the multiverse graph. The left pane of the Voyager UI has a button to create new queries . After clicking that button, the user is presented with the following view.
The language to query the database is the ArangoDB Query Language (AQL). The breakpoints and stepping operations combined with the query facilities of Voyager provide an interactive experience of browsing the multiverse graph of a program to find the root cause of bugs. [@Fig:voyager-query] shows the UI with an AQL query to find the shortest path to all stuck terms. We will further describe it in [@sec:path-to-all-stuck-terms].
def makeMath() {
actor:{
def result := 0;
def double(x){result := x+x};
def getResult(){result};
}
};
def makeClient1(math){
actor:{ |math|
def start(){
math<-double(12);
when: math<-getResult()@FutureMessage becomes: {|res|
system.println(res);
}}}
};
def makeClient2(math){
actor: { |math|
def start(){ math<-double(33) }}
};
def math := makeMath();
def client1 := makeClient1(math);
def client2 := makeClient2(math);
client1<-start();
client2<-start();
: AmbientTalk program containing a bad message interleaving bug {#lst:at-prog}
We now show a debugging session in Voyager for the AmbientTalk program depicted in @lst:at-prog. The program shows an interaction between 3 actors: a math
actor (created in line 20), and two client actors, client1
(created on line 21) and client2
(line 22).
The math
actor understands two messages:
double
message, which doubles its argument and stores the result for further operations,getResult
message, which returns the result of a number of operations.After creating the three actors, the program sends a start
message to
both client actors. As a result, client1
sends a ` double(12) message followed by a
getResult one. Concurrently,
client2 sends a
double(33)` message to the math actor as well.
Despite being a simple program, it contains a bad message interleaving bug, which is common for actor-based programs. It is possible that client1
gets the result of doubling 33
instead of doubling 12
. The table below lists all possible interleavings with the possible results client 1 gets. In particular, the program has 3 possible interleavings ( shown in 3 columns), and each row shows the sender actor & message sent at a particular time to the math actor. Time progresses from top to bottom of the table (i.e. row i shows the message sent at time i, and row i + 1 shows a message sent at time i+1).
Faulty Interleaving | Correct Interleaving | Correct Interleaving |
---|---|---|
client 1 - double(12) |
client 1 - double(12) |
client 2 - double(33) |
client 2 - double(33) |
client 1 - getResult() → 24 |
client 1 - double(12) |
client 1 - getResult() → 66 |
client 2 - double(33) |
client 1 - getResult() → 24 |
We will now explore the program with Voyager. First, we will execute the program in Voyager without any breakpoint. Recall that Voyager expects a PLT-Redex term in the debugger semantics as input. As such, we first need to convert the AmbientTalk code from the previous section to the equivalent PLT-Redex program. The transformed program is shown in [@Lst:input]. This can also be found in the AT_double_NoBreakpoint.txt
file stored in the demo
folder in the artifact.
(
() ; Pending breakpoints
() ; Checked breakpoints
run ; Debugger state
()
() ; Commands (user interaction)
((client1 run)) ; Actor map
((actor ; Base language term
client1
()
()
(let (math (actor
(field result 0)
(method double x (set! (this $ result) (+ x x)))
(method result p (this $ result))))
in
(let (client2 (actor
(method start math (send math double (33) c2-double-to-math))))
in
(let (a (send client2 start (math) c1-start-to-c2)) in
(let (b (send math double (12) c1-double-to-math)) in
(let (x_f x_r) future in
(let (x_l (
let (some-var 5) in
(object (method apply x ((x_r $ resolve-mu) x))))
) in
(let (var (send (let (x_f1 x_r1) future in
(let (var (send math result (0 x_r1) c1-result-to-math)) in x_f1))
register-mu (x_l) c1-result-to-math)) in x_f)))))))
))
)
: AmbientTalk program in PLT-Redex in debugger semantics {#lst:input}
To load and execute this program in Voyager, take the following steps:
AT_double_NoBreakpoint.txt
by using the “Browse…” button in the top right, and browsing the file at the demo
folder.Alternatively, instead of loading and executing the program manually, you can just select the AT - double - NoBreakpoint
program from the Select program drop-down menu on the left-hand side.
The screenshot in [@Fig:voyager-filled] shows the UI after copy-pasting the program. In this case, the end user copy pasted the code and named the program “A failing program”.
Once the execution is complete, a graph representing the program execution is shown. As stated before, you can zoom in and out by scrolling while your mouse is on the canvas. Placing the mouse on a node of the graph will show extra information about that node in the right-hand pane. A red node represents a state that can not be reduced any further. Placing the mouse over all red nodes, you can see that two of them have future:x_id = 24
and one has future:x_id = 66
. The latter represents the faulty execution path where the result incorrectly becomes 66. The expected state of the UI after this step is depicted in [@fig:voyager-result].
Let us now query the execution graph to inspect the execution states and try to understand how we got an incorrect result. To query the graph, click the new query button , and paste a query. [@Lst:aqlStuck] shows a query that filters all paths except the shortest path from the start node to nodes that cannot be reduced any further. This give us a more simplified view on the multiverse.
// make an array containing all stuck nodes
LET stuckNodes = (FOR n in @@nodes FILTER n._stuck == true RETURN n)
// find the path to each of them and join the results
LET path = FLATTEN(
FOR target IN stuckNodes
FOR n,e IN OUTBOUND SHORTEST_PATH
@start TO target._id GRAPH @graph
RETURN {n,e})
// convert to the needed format
RETURN {
edges: (FOR d in path FILTER d.e != null RETURN DISTINCT d.e),
nodes: (FOR d in path FILTER d.n != null RETURN DISTINCT d.n)
}
: AQL query to find the shortest path to all stuck terms {#lst:aqlStuck}
The used query language is pure AQL using Bind parameters (@start
, @graph
and @@nodes
). Let us have a closer look to what the query is actually doing:
First, it finds all stuck nodes, by querying the database for nodes that have been marked as such. The FOR
operation returns an array of the values that have been returned by RETURN
in its body. The _stuck
is the same _stuck
as in the table that is shown on the right side when placing the mouse over a node.
SHORTEST_PATH
. This primitive gives us an array of the nodes and edges on the shortest path between two nodes in a graph. Because we use SHORTEST_PATH
for each of the stuck nodes, we end up with an array of arrays which is then flattened. The path
variable now holds an array of nodes and edges on the shortest path from the start node to a stuck node.The expected result of querying with [@lst:aqlStuck] is shown in [@fig:filter]. Hovering over the nodes around the forks shows us the first moment that the paths diverge. Note also that the screenshot was taken when the user had the mouse on the faulty red node (as the left-hand side panel shows the future:x_id = 66
).
We can add a condition to the query described in the previous section to only match nodes that have value “66”, i.e. the wrong result. The altered query is shown in [@lst:aql66].
// find all stuck nodes that have value 66
LET stuckNodes = (FOR n in @@nodes
FILTER n["future:x_if"] == "66" && n._stuck
RETURN n)
// find path to each of them and join the results
LET path = FLATTEN(
FOR target IN stuckNodes
FOR v,e IN OUTBOUND SHORTEST_PATH
@start TO target._id GRAPH @graph
RETURN {v,e})
// convert to needed format
RETURN {
edges: (FOR d in path FILTER d.e != null RETURN DISTINCT d.e),
nodes: (FOR d in path FILTER d.v != null RETURN DISTINCT d.v)
}
: AQL query to find the shortest path to all stuck terms with value 66 {#lst:aql66}
With this new query we only see a single wrong execution, one that results in 66. The expected result of the adapted query is shown in [@fig:allFaulty]
We now look at how breakpoints behave in the multiverse debugger. We will add a breakpoint that pauses execution when the message sent by client1
arrives in the mailbox of the math
actor1. This corresponds to a message receiver breakpoint, represented as (Msg-Receiver c1-double-to-math)
, where c1-double-to-math
is the id of the message in [@lst:with-breakpoint]. We add this breakpoint to the list of pending breakpoints. The new program is shown in [@lst:with-breakpoint]. It is constructed by applying following change to line 2 of [@lst:input].
Replace
() ; Pending breakpoints
with
((Msg-Receiver c1-double-to-math)) ; Pending breakpoints
(
((Msg-Receiver c1-double-to-math)) ; Pending breakpoints
() ; Checked breakpoints
run ; Debugger state
()
() ; Commands (user interaction)
((client1 run)) ; Actor map
((actor ; Base language term
client1
()
()
(let (math (actor
(field result 0)
(method double x (set! (this $ result) (+ x x)))
(method result p (this $ result))))
in
(let (client2 (actor
(method start math (send math double (33) c2-double-to-math))))
in
(let (a (send client2 start (math) c1-start-to-c2)) in
(let (b (send math double (12) c1-double-to-math)) in
(let (x_f x_r) future in
(let (x_l (
let (some-var 5) in
(object (method apply x ((x_r $ resolve-mu) x))))
) in
(let (var (send (let (x_f1 x_r1) future in
(let (var (send math result (0 x_r1) c1-result-to-math)) in x_f1))
register-mu (x_l) c1-result-to-math)) in x_f)))))))
))
)
: AmbientTalk program in PLT-Redex in debugger semantics with a message receiver breakpoint {#lst:with-breakpoint}
The new program comes preloaded as “AT - double - MsgReceiverBreakpoint
” and can be selected from the programs drop-down on the left. To alternatively start the program with a breakpoint yourself, take the same steps as in [@sec:explore-voyager]:
The expected result of executing this program is shown in [@fig:voyager-pre-expand]. We see that there are pink nodes, these nodes represent paused states of the debugger. Clicking on them reveals a menu with possible actions:
Step-next-Turn
, halting just before processing the next message (by the math actor) ;Step-Msg-Receiver
(not implemented);Clicking on the resume button of one of the paused (pink) nodes, calculates and shows how the program resumes from that point. In [@fig:voyager-post-expand] the result of such a resumption is shown. Notice that the dashed lines are used to indicate user interaction. When exploring this view you may encounter nodes that are not yet expanded. Those have a gray border. You can expand them by double clicking on them while holding shift, or by selecting in their menu.
The step-next-turn button of paused (pink) nodes will stop execution of the math actor just before processing a message. Note that when it is clicked when no next message can arrive, the execution will halt just before processing the never-arriving message, this shows up in Voyager as a red node.
So far, we have shown our proof of concept of a multiverse debugger for AmbientTalk programs called Voyager. As mentioned before, Voyager uses as input a PLT-Redex program implemented in the AmbientTalk operational semantics and gives as output the reduction graph corresponding to all possible universes of the program. We focused on how developers can interactively explore non-deterministic execution paths corresponding to the evaluation of an AmbientTalk program. To this end, Voyager offers features from classic breakpoint-based debuggers like breakpoints and stepping commands complemented with queries over the whole execution graph to help developers interactively debug the multiverse.
This proof of concept debugger builds on top of an operational semantics that uses as base language the AmbientTalk calculus. The Voyager debugging semantics has allowed us to reason about how to halt and resume actors using multiverse breakpoints together with stepping commands. All the semantics files are included in the software artifact itself at /home/osboxes/demo/DebuggerPaperExample.zip
which contains:
- AmbientTalk-ReductionRules.rkt
- AmbientTalk-Syntax.rkt
- AmbientTalk-Util.rkt
- Debugger-ReductionGraphs.rkt
- Debugger-ReductionRules.rkt
- Debugger-Syntax.rkt
- Debugger-Util.rkt
- PLTGraphRedex.rkt
AmbientTalk-XX.rkt
correspond to the original semantics of the language (as described in reference [12] of the paper). This is not part of the contributions of this artifact.Debugger-XX.rkt
correspond to the Voyager debugging semantics on top of the AmbientTalk ones (which is described in section 5 of the paper).PLTGraphRedex.rkt
provides the glue between the PLT-Redex semantics and Voyager’s front end.The file Debugger-Syntax.rkt
contains the syntax definition corresponding to the debugger. The debugger syntax is implemented as an extension of the AmbientTalk language semantics.
A debugger state D
consists of seven elements, ((BP ...) (BP ...) action H (C ...) actormap k))
. Note that this is different from the paper semantics where we only have six elements (as we do not mention the history H in the paper):
(BP ...)
and (BP ...)
are respectively the list of pending breakpoints and the list of already checked breakpoints.action
for representing the state run
and pause
. When the debugger is in the run
state it verifies whether there is an applicable breakpoint. When a breakpoint hits, the debugger transitions itself to the pause
state and halts execution.H
, this is no longer used in the semantics and will be removed in the future.C
.actormap
.K
is the state of the actor configuration being debugged, i.e. the state of the AmbientTalk program.The file Debugger-ReductionRules.rkt
contains the reduction rules of the multiverse debugger for AmbientTalk. As explained in the paper, the reduction rules can be separated in five groups:
CEL-Step-Global
, CEL-Step-Local
and CEL-Step-Local-step-1
correspond to the two rules CEL-Step-Global
and CEL-Step-Local
from the paper’s section 5.3.1. In the paper, we have a more general update
meta function which makes the semantics more elegant.Save-Message-Receiver-Breakpoint(Save-MRB)
and Trigger-Message-Receiver-Breakpoint (Trigger-MRB)
. They loosely correspond with (Save-MRB)
and (Trigger-MRB)
from section 5.3.2 in the paper.Not-Applicable Save-MRB
,Not-Applicable Trigger-MRB
, in order for the semantics to proceed in case a breakpoint is not applicable. These rules correspond to the compressed rule Not-Applicable-Breakpoint[trigger-msb,save-mrb,trigger-mrb]
from section 5.3.3 in the paper. The bookkeeping rule Add-New-Actor
corresponds with the rule with the same name in the paper.Trigger-Step-Next-Turn (Trigger-SNT-from-MRB)
and Prepare-Step-Next-Turn (Prepare-SNT-from-MRB)
PLT-Redex semantics correspond with the rules of the same name in section 5.3.4 of the paper.Resume
corresponds with the rule (Resume-Execution)
in section 5.3.5 in the paper. The rule (Pause-Execution)
from section 5.3.5 is currently not modeled in the PLT-Redex semantics.Finally, the file Debugger-Util.rkt
contains meta-functions used by the reduction rules of the multiverse debugger.
\appendix
Users who have docker installed can start the application directly with docker. The following commands will download and run the voyager image.
docker pull beardhatcode/voyager:ecoop2019;
docker run --rm -p 3000:3000 beardhatcode/voyager:ecoop2019;
Once running, point a Firefox web browser at http://localhost:3000/. Note: The docker installation does not contain the debugging semantics zip, nor does it contain the demo files. However, the examples of [@lst:input;@lst:with-breakpoint] come preloaded.
This message is sent on line 20 of [@lst:input;@lst:with-breakpoint]. ↩