
A Concurrent Program Logic with a Future and History

Roland Meyer, TU Braunschweig, Germany
Thomas Wies, New York University, USA
Sebastian Wolff, New York University, USA

Accepted for OOPSLA'22.

Abstract. Verifying fine-grained optimistic concurrent programs remains an open problem. Modern program logics provide abstraction mechanisms and compositional reasoning principles to deal with the inherent complexity. However, their use is mostly confined to pencil-and-paper or mechanized proofs. We devise a new separation logic geared towards the automation that existing logics lack. While local reasoning is known to be crucial for automation, we are the first to show how to retain this locality for (i) reasoning about inductive properties without the need for ghost code, and (ii) reasoning about computation histories in hindsight. We implemented our new logic in a tool and used it to automatically verify challenging concurrent search structures that require inductive properties and hindsight reasoning, such as the Harris set.

Quick Start
  1. Install Docker.
  2. Build and run plankton in a Docker container:
    git clone --branch OOPSLA22 https://github.com/Wolff09/plankton.git plankton \
      && cd plankton \
      && docker build . -t plankton \
      && docker run -it plankton
    See Compilation and Installation
  3. Run the benchmarks in the Docker container: python3 benchmark.py
    See Benchmarking
  4. Presto!

Tool Overview

plankton is an experimental verifier for lock-free data structures. It verifies linearizability of input programs by constructing a Hoare-style proof in a novel program logic. Crucially, this program logic supports (i) reasoning about inductive properties and (ii) hindsight reasoning. Both techniques are performed in lockstep with the program, avoiding the need for ghost code. To localize the proofs, the program logic incorporates (iii) flow reasoning.

plankton takes as input the program under scrutiny and a candidate node invariant. It then fully automatically generates a proof within our novel program logic, establishing that the given program is linearizable and adheres to the given invariant.

The remainder of this document discusses how to reproduce our experiments: we walk through installation, compilation, and benchmarking. Finally, we briefly link the concepts and techniques from the paper to our implementation.


Compilation and Installation

This section describes how to set up your system to repeat the benchmarks from the paper.

Download

The source code can be downloaded from GitHub: see the release tagged OOPSLA22.

Alternatively, you may clone or download a (potentially newer) version from the GitHub repository.

System Requirements

To compile this project you need CMake (>3.21), GCC/G++ (>10.0), a Java JRE (>1.6), and Z3 (v4.8.7). Compilation with Clang (>12.0) is also possible. To run the benchmarking script, you need Python 3 (>3.5).
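
To quickly check your toolchain against these requirements, you can query the installed versions (a simple sanity check; note that plankton uses Z3 as a library, so the z3 binary may not be present or may be named differently on your system):

        cmake --version      # expect > 3.21
        g++ --version        # expect > 10.0
        java -version        # expect JRE > 1.6
        z3 --version         # expect exactly 4.8.7 (see below)
        python3 --version    # expect > 3.5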

Z3 Version

For reproducing the results, it is paramount to use the correct Z3 version: v4.8.7. Versions newer than 4.8.7 generally fail to run z3::solver::consequences(). If z3::solver::consequences() fails, plankton issues a warning and falls back to a backup procedure relying on z3::solver::check(). The backup procedure tremendously decreases performance and hence increases the runtime.

If you compile Z3 yourself, make sure to use GCC/G++ and not Clang. We recommend compiling with GCC/G++ version 7.
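
As a sketch, building Z3 v4.8.7 from source with GCC/G++ might look as follows (the compiler names gcc-7/g++-7 and the install prefix are assumptions; adapt them to your system):

        # fetch exactly version 4.8.7
        git clone --branch z3-4.8.7 --depth 1 https://github.com/Z3Prover/z3.git
        cd z3
        # configure the build with GCC/G++ (version 7 recommended); compiler names are placeholders
        CC=gcc-7 CXX=g++-7 python scripts/mk_make.py --prefix=/usr/local
        cd build
        make
        sudo make install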

Compilation

To build the project, do the following:


        cd path/to/plankton
        mkdir build
        cd build
        CXX=/path/to/CXX-compiler cmake .. -DCMAKE_INSTALL_PREFIX=/path/to/installation
        CXX=/path/to/CXX-compiler make
        make install
			

This creates a folder /path/to/installation/plankton at the specified location. This folder contains all binaries, scripts, and examples needed to reproduce the evaluation from the paper.

We recommend using -DCMAKE_INSTALL_PREFIX to specify where to install the tool. Omitting this flag installs the tool to a system location such as /usr/local/. You can customize the folder name appended to /path/to/installation by setting -DINSTALL_FOLDER in the cmake invocation.

You can omit the CXX=/path/to/CXX-compiler prefix to use your system's default compiler.
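
Putting it together, a concrete invocation might look as follows (a sketch; the compiler g++-10 and the install prefix $HOME/tools are placeholders, not requirements):

        cd path/to/plankton
        mkdir build && cd build
        CXX=g++-10 cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/tools -DINSTALL_FOLDER=plankton
        CXX=g++-10 make
        make install    # installs to $HOME/tools/plankton/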

Troubleshooting

There are some known issues you might run into.

Issue: missing external dependencies
The project includes the code for ANTLR; however, your system might be missing some dependencies needed to build it successfully. On Linux (tested with Ubuntu >18.04), make sure you have uuid, uuid-dev, and pkg-config installed.
Issue: timeout while cloning from GitHub
In some versions, the build process of ANTLR fetches the library utfcpp from GitHub. In some (mainly academic) network setups, this may fail and time out. The problem can be solved by forcing git to use https like so:

        git config --global url."https://".insteadOf git://
			
Issue: clang and libc++
When building with clang, you might run into problems compiling and linking ANTLR. To solve this (tested with Ubuntu >18.04), install libc++-dev and libc++abi-dev. Then, create a compiler wrapper stdclang like so:

        sudo touch /usr/bin/stdclang
        sudo chmod a+x /usr/bin/stdclang
			
Edit the created file /usr/bin/stdclang to contain this:

        #!/bin/bash
        # forward all arguments to clang++ and force libc++ as the standard library
        clang++ -stdlib=libc++ "$@"
			
With this, use stdclang as the CXX compiler in the above build commands; it properly sets up clang++ to use libc++.
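
For example, the generic build commands from above then become:

        CXX=/usr/bin/stdclang cmake .. -DCMAKE_INSTALL_PREFIX=/path/to/installation
        CXX=/usr/bin/stdclang make
        make install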

Docker

The repository comes with a Dockerfile that creates a Docker container with the necessary dependencies and an installation of plankton. To use it, run something like this:


        cd path/to/plankton
        docker build . -t plankton
        docker run -it plankton
			

The Docker container contains two versions of plankton, a release version at /artifact/plankton/ and a debug version with additional output during proof generation at /artifact/plankton-debug/.
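
For example, to run the benchmarks against the debug build (a sketch; it assumes that, as described under Benchmarking, each installation folder ships the benchmark script):

        docker run -it plankton
        # inside the container:
        cd /artifact/plankton-debug    # or /artifact/plankton for the release build
        python3 benchmark.py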


Benchmarking

We describe how to repeat the benchmarks from the paper for verifying lock-free set data structures.

Benchmark Programs

We evaluated our tool on the lock-free set implementations listed in the Expected Benchmark Results table below; see the References section for the corresponding papers.

Running Benchmarks

To run the benchmarks go to the installation folder and execute the benchmark script like so:


        cd path/to/installation
        python3 benchmark.py
			

Make sure to use python3: our test script uses the timeout feature of the subprocess library, which is available only in Python 3.5 and newer.

There is a known issue where the benchmark.py script reports failure although running plankton directly in the shell results in successful verification. The issue is due to Z3 spontaneously misbehaving when used with Python's subprocess library.

The benchmark.py script runs each benchmark 5 times, with a timeout of 6 hours per run. To modify the repetition count or the timeout, pass the appropriate command line arguments as shown below. You may also abort benchmark.py (Ctrl+C); this stops the benchmarking and produces the average of the benchmarks covered so far. (The benchmarks are run in a round-robin fashion so that preliminary results are likely to cover the entire benchmark suite.)


        python3 benchmark.py [<number_of_repetitions>] [<timeout_in_seconds>]
			

The benchmark script will output a table with the same columns as the expected results shown below.
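
As a concrete example of the invocation above, the following runs each benchmark 3 times with a timeout of one hour (3600 seconds) per run:

        python3 benchmark.py 3 3600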

Expected Benchmark Results

We repeated our evaluation on an Apple M1 Pro with 16GB of RAM. In the table below, each task is marked as successful, failed, or timed out.

Runtimes on traditional x86/amd64 hardware will be slower than the Apple Silicon runtimes reported here. However, we expect traditional hardware to produce the same tendencies and, most importantly, the same verification verdicts.
Program (on Apple M1)          Iter  Eff  Cand  Com  Fut  Hist  Join  Inter  Linearizability
Fine-Grained set                  2    5     2  11%  15%   43%   15%     8%              46s
Lazy set                          2    6     2  10%  13%   54%   11%     5%              77s
FEMRS tree (no maintenance)       2    5     2  19%   0%   49%    1%     9%             130s
Vechev&Yahav's 2CAS set           2    3     1  14%   0%   33%   31%     9%             125s
Vechev&Yahav's CAS set            2    4     1  15%   7%   39%   23%     6%              54s
ORVYY's set                       2    3     0  17%   0%   40%   26%     6%              47s
Michael's set                     2    4     2  11%  29%   30%   15%     6%             306s
Michael's set (wait-free)         2    4     2  11%  28%   30%   15%     6%             246s
Harris' set                       2    4     2   7%   8%   19%   32%     4%            1378s
Harris' set (wait-free)           2    4     2   8%  10%   17%   34%     3%            1066s

Stress Tests

We also stress tested plankton by running versions of our benchmark programs with purposely inserted bugs; where applicable, a bug was inserted into all of the benchmark programs. As expected, verification fails in all these cases. Note that plankton does not implement error-explanation techniques; hence, the reason for a verification failure reported by plankton may neither reflect nor hint at the actual bug.

The stress test suite can be repeated with the stresstest.py script, which is similar in usage to the benchmark.py script.


        python3 stresstest.py [<number_of_repetitions>] [<timeout_in_seconds>]
			

Found bugs are indicated in the script's output; if verification unexpectedly succeeds for a buggy test case, this is reported as well.
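
For example, to run the stress tests once with a timeout of 30 minutes (1800 seconds) per run:

        python3 stresstest.py 1 1800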


The plankton Tool

Instead of using the benchmark script, you can directly interact with plankton.

Basic Usage

You invoke plankton like so:


        ./plankton [-g] [--no-spurious] [--loopNoPostJoin] <path_to_program>
			

The flags and their arguments are described below (excerpt).

<path_to_program>  (required)
    Input program file to analyze. For examples, see Benchmark Programs.

--gist, -g  (optional; default: false)
    Outputs a machine-readable (read: easy-to-parse) string at the end of execution, reporting the overall runtime and the linearizability verdict.

--no-spurious  (optional; default: false)
    Instructs the internal rewrite engine to prevent desugared CAS commands from failing spuriously. Note that the C++ standard explicitly allows CAS to fail spuriously.

--loopNoPostJoin  (optional; default: false)
    Instructs proof construction to omit the join over the post-annotations of loops. This increases the precision of the proof.

Besides the above, --version prints the tool's version, and -h / --help print the full usage description.

Note that the --loopNoPostJoin flag is required for the FEMRS tree to be verified successfully. The remaining benchmarks do not require the increased precision. The benchmark.py script passes this flag automatically.
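
For example, a direct invocation might look as follows (a sketch; the program path is a placeholder for one of the example programs shipped with the installation, see Benchmark Programs):

        # path/to/program is a placeholder, not an actual file in the repository
        ./plankton --gist --loopNoPostJoin path/to/program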

Input Programs

plankton supports a simple C-style input language. The example programs shipped with the installation showcase the required components and the most common constructs. For a list of all supported constructs, refer to the ANTLR parser definition in the repository.


Notes on the Implementation

The following overview links the techniques presented in the paper to their implementation.

Programs, Logics
    Files: include/programs/ast.hpp, include/logics/ast.hpp
    Abstract syntax trees for programs and the logic.

Flows, Invariants
    Files: include/engine/config.hpp
    Configuration object that specifies the flow and the node invariants. Configuration objects are parsed from the input program. Currently, the flow is restricted to keyset flows.

Program Simplification
    Files: src/parser/*
    The parser supports a richer input language that is transformed into the one defined in include/programs/ast.hpp.

Commands
    Files: src/engine/solver/post_*.cpp
    Implementation of Rule (COM-SEM). There are specialized implementations for assignments to variables (post_assign_var.cpp), assignments to memory cells (post_assign_write.cpp), assumptions (post_assume.cpp), and allocation (post_malloc.cpp).

Loops
    Files: src/engine/proof/stmt.cpp
    Implementation of Rule (LOOP).

Flow Graphs
    Files: include/engine/flowgraph.hpp, src/engine/flowgraph/make.hpp
    Definition and generation of flow graphs. Flow graphs are generated wrt. a memory assignment by taking a fixed, finite number of reachable nodes, starting from the nodes participating in the assignment.

Histories
    Files: src/engine/solver/past.cpp
    Implementation of history introduction and reasoning. Note that this implements the how; to see when this is performed, refer to src/engine/proof/common.cpp.

Futures
    Files: src/engine/solver/future.cpp
    Implementation of future introduction and reasoning. Note that this implements the how; to see when this is performed, refer to src/engine/proof/common.cpp.

Interference
    Files: src/engine/solver/stability.cpp
    Applies interference (i.e., effects on the shared heap) such that annotations become stable (i.e., interference-free). Note that this implements the how; to see when this is performed, refer to src/engine/proof/common.cpp.

Join
    Files: src/engine/solver/join.cpp
    Implementation of the join operation, which deals with now, past, and future predicates. Resource-like entities (variable bindings, points-to predicates, proof obligation resources) are matched manually; the remaining entities are handled semantically by encoding them into SMT.

Entailment
    Files: src/engine/solver/implication.cpp
    Entailment operation, which deals with now, past, and future predicates. Resource-like entities (variable bindings, points-to predicates, proof obligation resources), futures, and pasts are matched syntactically; only the now stack is handled semantically by encoding it into SMT.

Encoding
    Files: include/engine/encoding.hpp, src/engine/encoding/*
    SMT encoding of now predicates. The encoding does not handle resource-like entities (variable bindings, points-to predicates, proof obligation resources).

Fixed Point
    Files: src/engine/proof/api.cpp, src/engine/proof/common.cpp
    Implements the fixed point (src/engine/proof/api.cpp) that saturates the interference set by repeatedly generating proofs for the API functions of the input program and extracting new interferences from the generated proofs. The generated proof is guaranteed to be interference-free: interferences are applied to the generated annotations unless a command is a right-mover (src/engine/proof/common.cpp). Past and future reasoning is applied whenever interference is applied, or as a pre-processing step of the join operation (src/engine/proof/common.cpp).

Linearizability
    Files: src/engine/proof/api.cpp, src/engine/solver/post_assign_write.cpp, src/engine/solver/ful.cpp, src/engine/encoding/spec.cpp
    Linearizability is checked for every proof generated for an API function (src/engine/proof/api.cpp). The check is performed directly for impure memory assignments (src/engine/solver/post_assign_write.cpp) or using past (hindsight) reasoning (src/engine/solver/ful.cpp) on top of a function's proof (src/engine/proof/api.cpp). Whether the specification is met is checked by encoding the keyset theorem in SMT (src/engine/encoding/spec.cpp).

References

FeldmanEMRS18 Yotam M. Y. Feldman, Constantin Enea, Adam Morrison, Noam Rinetzky, and Sharon Shoham. 2018. Order out of Chaos: Proving Linearizability Using Local Views. In DISC (LIPIcs, Vol. 121). Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.DISC.2018.23
Harris01 Timothy L. Harris. 2001. A Pragmatic Implementation of Non-blocking Linked-Lists. In DISC (LNCS, Vol. 2180). Springer. https://doi.org/10.1007/3-540-45414-4_21
HellerHLMSS05 Steve Heller, Maurice Herlihy, Victor Luchangco, Mark Moir, William N. Scherer III, and Nir Shavit. 2005. A Lazy Concurrent List-Based Set Algorithm. In OPODIS (LNCS, Vol. 3974). Springer. https://doi.org/10.1007/11795490_3
Michael02a Maged M. Michael. 2002. High Performance Dynamic Lock-Free Hash Tables and List-Based Sets. In SPAA. ACM. https://doi.org/10.1145/564870.564881
OHearnRVYY10 Peter W. O’Hearn, Noam Rinetzky, Martin T. Vechev, Eran Yahav, and Greta Yorsh. 2010. Verifying linearizability with hindsight. In PODC. ACM. https://doi.org/10.1145/1835698.1835722
HerlihyS08 Maurice Herlihy and Nir Shavit. 2008. The art of multiprocessor programming (Chapter 9.5). Morgan Kaufmann. https://isbnsearch.org/isbn/9780123705914
VechevY08 Martin T. Vechev and Eran Yahav. 2008. Deriving linearizable fine-grained concurrent objects. In PLDI. ACM. https://doi.org/10.1145/1375581.1375598