Abstract. Verifying fine-grained optimistic concurrent programs remains an open problem. Modern program logics provide abstraction mechanisms and compositional reasoning principles to deal with the inherent complexity. However, their use is mostly confined to pencil-and-paper or mechanized proofs. We devise a new separation logic geared towards automation, which existing logics lack. While local reasoning is known to be crucial for automation, we are the first to show how to retain this locality for (i) reasoning about inductive properties without the need for ghost code, and (ii) reasoning about computation histories in hindsight. We implemented our new logic in a tool and used it to automatically verify challenging concurrent search structures that require inductive properties and hindsight reasoning, such as the Harris set.
git clone --branch OOPSLA22 https://github.com/Wolff09/plankton.git plankton \
&& cd plankton \
&& docker build . -t plankton \
&& docker run -it plankton
See Compilation and Installation
python3 benchmark.py
plankton is an experimental verifier for lock-free data structures. It verifies linearizability of input programs by constructing a Hoare-style proof in a novel program logic. Crucially, this program logic supports (i) reasoning about inductive properties and (ii) hindsight reasoning. Both techniques are performed in lock step with the program, avoiding the need for ghost code. To localize the proofs, the program logic incorporates (iii) flow reasoning.
plankton takes as input the program under scrutiny and a candidate node invariant. It then fully automatically generates a proof within our novel program logic, establishing that the given program is linearizable and adheres to the given invariant.
A discussion on how to reproduce our experiments follows. To that end, we walk through the installation, compilation, and benchmarking. At the end, we briefly link the concepts and techniques from the paper to our implementation.
This section describes how to set up your system to repeat the benchmarks from the paper.
The source code can be downloaded from GitHub:
GitHub release with tag OOPSLA22
Alternatively, you may wish to clone/download a potentially different version from GitHub: GitHub repository
To compile this project you need CMake (>3.21), GCC/G++ (>10.0), a Java JRE (>1.6), and Z3 (v4.8.7). Compilation with Clang (>12.0) is also possible. To run the benchmarking script, you need Python 3 (>3.5).
For reproducing the results, it is paramount to use the correct Z3 version: v4.8.7.
Versions >4.8.7 generally fail to run `z3::solver::consequences()`.
If `z3::solver::consequences()` fails, plankton issues a warning and uses a backup procedure relying on `z3::solver::check()`.
The backup procedure tremendously decreases performance, hence increases the runtime.
If you compile Z3 yourself, make sure to use GCC/G++ and not Clang. We recommend compiling with GCC/G++ version 7.
To build the project, do the following:
cd path/to/plankton
mkdir build
cd build
CXX=/path/to/CXX-compiler cmake .. -DCMAKE_INSTALL_PREFIX=/path/to/installation
CXX=/path/to/CXX-compiler make
make install
This creates a folder `/path/to/installation/plankton` at the specified location.
This folder contains all binaries, scripts, and examples needed to reproduce the evaluation from the paper.
We recommend using `-DCMAKE_INSTALL_PREFIX` to specify a location where to install the tool.
Omitting this flag will install the tool to a system location like `/usr/local/`.
You can customize the folder that is appended to `/path/to/installation` by setting `-DINSTALL_FOLDER` appropriately in the cmake invocation.
You can omit the compiler specification (`CXX=/path/to/CXX-compiler`) to use your system's default compiler.
There are some known issues you might run into.
The build downloads and compiles AntLR; however, your system might be missing some dependencies for it to build successfully.
On Linux (tested with Ubuntu >18.04), make sure you have `uuid`, `uuid-dev`, and `pkg-config` installed.
During its build, AntLR fetches the library `utfcpp` from GitHub.
In some (mainly academic) network setups this may fail and time out.
The problem can be solved by forcing `git` to use `https` like so:
git config --global url."https://".insteadOf git://
clang and libc++: When compiling with `clang`, you might run into problems compiling and linking AntLR.
To solve this (tested with Ubuntu >18.04), install `libc++-dev` and `libc++abi-dev`.
Then, create a compiler wrapper `stdclang` like so:
sudo touch /usr/bin/stdclang
sudo chmod a+x /usr/bin/stdclang
Edit the created file `/usr/bin/stdclang` to contain this:
#!/bin/bash
clang++ -stdlib=libc++ "$@"
With this, use `stdclang` as the `CXX` compiler in the above build commands; it properly sets up `clang++` to use `libc++`.
The repository comes with a `Dockerfile` that creates a Docker container with the necessary dependencies and an installation.
To use it, do something like this:
cd path/to/plankton
docker build . -t plankton
docker run -it plankton
The Docker container contains two versions of plankton: a release version at `/artifact/plankton/` and a debug version, with additional output during proof generation, at `/artifact/plankton-debug/`.
We describe how to repeat the benchmarks from the paper for verifying lock-free set data structures.
We evaluated our tool on the following implementations:
To run the benchmarks go to the installation folder and execute the benchmark script like so:
cd path/to/installation
python3 benchmark.py
Make sure to use `python3`: our test script uses the timeout feature of the `subprocess` library, which is available only in version 3.5 and newer.
It may happen that the `benchmark.py` script reports failure although running plankton directly in the shell results in successful verification.
The issue is due to Z3 spontaneously misbehaving when used with Python's `subprocess` library.
The `benchmark.py` script runs each benchmark 5 times, with a timeout of 6 hours per run.
To modify the repetition count or the timeout, pass the appropriate command line arguments as shown below.
You may also abort `benchmark.py` (`ctrl+c`), which stops the benchmarking and produces the average of the benchmarks covered so far (the benchmarks are run in a round-robin fashion so that preliminary results are more likely to cover the entire benchmark suite).
python3 benchmark.py [<number_of_repetitions>] [<timeout_in_seconds>]
The benchmark script will output a table with the following columns:
We repeated our evaluation on an Apple M1 Pro with 16GB of RAM; successful, failed, and timed-out tasks are marked accordingly in the table below.
Note that runtimes on x86/amd64 hardware will be slower than the Apple Silicon runtimes reported here.
However, we expect traditional hardware to produce the same tendencies and, most importantly, the same verification verdict.
| Program (on Apple M1) | Iter | Eff | Cand | Com | Fut | Hist | Join | Inter | Linearizability |
|---|---|---|---|---|---|---|---|---|---|
| Fine-Grained set | 2 | 5 | 2 | 11% | 15% | 43% | 15% | 8% | 46s |
| Lazy set | 2 | 6 | 2 | 10% | 13% | 54% | 11% | 5% | 77s |
| FEMRS tree (no maintenance) | 2 | 5 | 2 | 19% | 0% | 49% | 1% | 9% | 130s |
| Vechev&Yahav's 2CAS set | 2 | 3 | 1 | 14% | 0% | 33% | 31% | 9% | 125s |
| Vechev&Yahav's CAS set | 2 | 4 | 1 | 15% | 7% | 39% | 23% | 6% | 54s |
| ORVYY's set | 2 | 3 | 0 | 17% | 0% | 40% | 26% | 6% | 47s |
| Michael's set | 2 | 4 | 2 | 11% | 29% | 30% | 15% | 6% | 306s |
| Michael's set (wait-free) | 2 | 4 | 2 | 11% | 28% | 30% | 15% | 6% | 246s |
| Harris' set | 2 | 4 | 2 | 7% | 8% | 19% | 32% | 4% | 1378s |
| Harris' set (wait-free) | 2 | 4 | 2 | 8% | 10% | 17% | 34% | 3% | 1066s |
We also stress tested plankton by running versions of our benchmark programs with purposely inserted bugs. As expected, verification fails in all those cases. Below is a list of inserted bugs. If applicable, we inserted the bug into all the benchmark programs. Note that plankton does not implement error explanation techniques. Hence, the reason for verification failure reported by plankton may neither reflect nor hint at the actual bug.
- Omitted `NULL` check during traversals.
The stress test suite can be repeated with the `stresstest.py` script, which is similar in usage to the `benchmark.py` script.
python3 stresstest.py [<number_of_repetitions>] [<timeout_in_seconds>]
Found bugs are indicated as such in the script's output; if verification succeeds for a buggy test case, a failure is printed instead.
Instead of using the benchmark script, you can directly interact with plankton.
You invoke plankton like so:
./plankton [-g] [--no-spurious] <path_to_program>
The flags and their arguments are listed in the following table (excerpt).
| Flag / Argument | Optional | Default | Description |
|---|---|---|---|
| `<path_to_program>` | no | — | Input program file to analyze. For examples, see Benchmark Programs. |
| `--gist` | yes | `false` | Outputs a machine-readable (read: easy to parse) string at the end of execution, reporting the overall runtime and the linearizability verdict. |
| `--no-spurious` | yes | `false` | Instructs the internal rewrite engine to prevent desugared CAS commands from failing spuriously. Note that the C++ standard explicitly allows CAS to fail spuriously. |
| `--loopNoPostJoin` | yes | `false` | Instructs proof construction to skip the join over the post-annotations of loops. This increases the precision of the proof. |
Besides the above, `--version` prints the tool's version, and `-h` / `--help` print the full usage description.
Note that the `--loopNoPostJoin` flag is required for the FEMRS tree to be verified successfully.
The remaining benchmarks do not require the increased precision.
The `benchmark.py` script passes this flag automatically.
plankton supports a simple C-style input language. Below you find an example program that showcases the required components and the most common constructs. (Note that the example may not be linearizable.) For a list of all supported constructs, refer to the ANTLR parser definition in the repository.
The following table links the techniques presented in the paper to the implementation.
| Paper | Implementation | Description |
|---|---|---|
| Programs, Logics | `include/programs/ast.hpp`<br>`include/logics/ast.hpp` | Abstract syntax tree for programs and the logics. |
| Flows, Invariants | `include/engine/config.hpp` | Configuration object that specifies the flow and the node invariants. Configuration objects are taken (parsed) from the input program. Currently, the flow is restricted to keyset flows. |
| Program Simplification | `src/parser/*` | The parser supports a richer input language that is transformed into the one defined in `include/programs/ast.hpp`. |
| Commands | `src/engine/solver/post_*.cpp` | Implementation of Rule (COM-SEM). There are specialized implementations for assignments to variables (`post_assign_var.cpp`), assignments to memory cells (`post_assign_write.cpp`), assumptions (`post_assume.cpp`), and allocation (`post_malloc.cpp`). |
| Loops | `src/engine/proof/stmt.cpp` | Implementation of Rule (LOOP). |
| Flow Graphs | `include/engine/flowgraph.hpp`<br>`src/engine/flowgraph/make.hpp` | Definition and generation of flow graphs. Flow graphs are generated wrt. a memory assignment by taking a fixed, finite number of reachable nodes starting from the nodes participating in the assignment. |
| Histories | `src/engine/solver/past.cpp` | Implementation of history introduction and reasoning. Note that this implements the *how*; to see *when* this is performed, refer to `src/proof/common.hpp`. |
| Futures | `src/engine/solver/future.cpp` | Implementation of future introduction and reasoning. Note that this implements the *how*; to see *when* this is performed, refer to `src/proof/common.hpp`. |
| Interference | `src/engine/solver/stability.cpp` | Applies interference (i.e., effects on the shared heap) such that annotations become stable (i.e., interference-free). Note that this implements the *how*; to see *when* this is performed, refer to `src/proof/common.hpp`. |
| Join | `src/engine/solver/join.cpp` | Implementation of the join operation, which deals with now, past, and future predicates. Resource-like entities (variable bindings, points-to predicates, proof obligation resources) are matched manually; the remaining entities are handled semantically by encoding them into SMT. |
| Entailment | `src/engine/solver/implication.cpp` | Entailment operation, which deals with now, past, and future predicates. Resource-like entities (variable bindings, points-to predicates, proof obligation resources), futures, and pasts are matched syntactically; only the now stack is handled semantically by encoding it into SMT. |
| Encoding | `include/engine/encoding.hpp`<br>`src/engine/encoding/*` | SMT encoding of now predicates. The encoding does not handle resource-like entities (variable bindings, points-to predicates, proof obligation resources). |
| Fixed Point | `src/engine/proof/api.cpp`<br>`src/engine/proof/common.cpp` | Implements the fixed point (`src/engine/proof/api.cpp`) saturating the interference set by repeatedly generating proofs for the API functions of the input program and extracting new interferences from the generated proofs. The generated proof is guaranteed to be interference-free: interferences are applied to the generated annotation unless a command is a right-mover (`src/engine/proof/common.cpp`). Past and future reasoning is applied whenever interference is applied, or as a pre-processing step of the join operation (`src/engine/proof/common.cpp`). |
| Linearizability | `src/engine/proof/api.cpp`<br>`src/engine/solver/post_assign_write.cpp`<br>`src/engine/solver/fut.cpp`<br>`src/engine/encoding/spec.cpp` | Linearizability is checked for every proof generated for an API function (`src/engine/proof/api.cpp`). The check is performed directly for impure memory assignments (`src/engine/solver/post_assign_write.cpp`) or using past (hindsight) reasoning (`src/engine/solver/fut.cpp`) on top of a function's proof (`src/engine/proof/api.cpp`). Whether the specification is met is checked by encoding the keyset theorem in SMT (`src/engine/encoding/spec.cpp`). |