
A Concurrent Program Logic with a Future and History

Roland Meyer, TU Braunschweig, Germany
Thomas Wies, New York University, USA
Sebastian Wolff, New York University, USA

Accepted for OOPSLA'22.

Abstract. Verifying fine-grained optimistic concurrent programs remains an open problem. Modern program logics provide abstraction mechanisms and compositional reasoning principles to deal with the inherent complexity. However, their use is mostly confined to pencil-and-paper or mechanized proofs. We devise a new separation logic geared towards the automation that existing logics lack. While local reasoning is known to be crucial for automation, we are the first to show how to retain this locality for (i) reasoning about inductive properties without the need for ghost code, and (ii) reasoning about computation histories in hindsight. We implemented our new logic in a tool and used it to automatically verify challenging concurrent search structures that require inductive properties and hindsight reasoning, such as the Harris set.

Quick Start
  1. Install Docker.
  2. Build and run plankton in a Docker container:
    git clone --branch OOPSLA22 https://github.com/Wolff09/plankton.git plankton \
      && cd plankton \
      && docker build . -t plankton \
      && docker run -it plankton
    See Compilation and Installation
  3. Run the benchmarks in the Docker container: python3 benchmark.py
    See Benchmarking
  4. Presto!

Tool Overview

plankton is an experimental verifier for lock-free data structures. It verifies linearizability of input programs by constructing a Hoare-style proof in a novel program logic. Crucially, this program logic supports (i) reasoning about inductive properties and (ii) hindsight reasoning. Both techniques are performed in lockstep with the program, avoiding the need for ghost code. To localize the proofs, the program logic incorporates (iii) flow reasoning.

plankton takes as input the program under scrutiny and a candidate node invariant. It then fully automatically generates a proof within our novel program logic, establishing that the given program is linearizable and adheres to the given invariant.

The remainder of this document discusses how to reproduce our experiments: we walk through installation, compilation, and benchmarking. Finally, we briefly link the concepts and techniques from the paper to our implementation.


Compilation and Installation

This section describes how to set up your system to repeat the benchmarks from the paper.

Download

The source code can be downloaded from GitHub: see the release tagged OOPSLA22.

Alternatively, you may clone or download a (potentially newer) version from the GitHub repository.

System Requirements

To compile this project you need CMake (>3.21), GCC/G++ (>10.0), a Java JRE (>1.6), and Z3 (v4.8.7). Compilation with Clang (>12.0) is also possible. To run the benchmarking script, you need Python 3 (>3.5).
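
To quickly check your toolchain against these requirements, you can query the installed versions (a simple sanity check; note that plankton uses Z3 as a library, so the z3 binary may not be present or may be named differently on your system):

        cmake --version      # expect > 3.21
        g++ --version        # expect > 10.0
        java -version        # expect JRE > 1.6
        z3 --version         # expect exactly 4.8.7 (see below)
        python3 --version    # expect > 3.5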

Z3 Version

For reproducing the results, it is paramount to use the correct Z3 version: v4.8.7. Versions newer than 4.8.7 generally fail to run z3::solver::consequences(). If z3::solver::consequences() fails, plankton issues a warning and falls back to a backup procedure relying on z3::solver::check(). The backup procedure tremendously decreases performance and hence increases the runtime.

If you compile Z3 yourself, make sure to use GCC/G++ and not Clang. We recommend compiling with GCC/G++ version 7.
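
As a sketch, building Z3 v4.8.7 from source with GCC/G++ might look as follows (the compiler names gcc-7/g++-7 and the install prefix are assumptions; adapt them to your system):

        # fetch exactly version 4.8.7
        git clone --branch z3-4.8.7 --depth 1 https://github.com/Z3Prover/z3.git
        cd z3
        # configure the build with GCC/G++ (version 7 recommended); compiler names are placeholders
        CC=gcc-7 CXX=g++-7 python scripts/mk_make.py --prefix=/usr/local
        cd build
        make
        sudo make install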

Compilation

To build the project, do the following:


        cd path/to/plankton
        mkdir build
        cd build
        CXX=/path/to/CXX-compiler cmake .. -DCMAKE_INSTALL_PREFIX=/path/to/installation
        CXX=/path/to/CXX-compiler make
        make install
			

This creates a folder /path/to/installation/plankton at the specified location. This folder contains all binaries, scripts, and examples needed to reproduce the evaluation from the paper.

We recommend using -DCMAKE_INSTALL_PREFIX to specify where to install the tool. Omitting this flag installs the tool to a system location such as /usr/local/. You can customize the folder name appended to /path/to/installation by setting -DINSTALL_FOLDER in the cmake invocation.

You can omit the CXX=/path/to/CXX-compiler prefix to use your system's default compiler.
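
Putting it together, a concrete invocation might look as follows (a sketch; the compiler g++-10 and the install prefix $HOME/tools are placeholders, not requirements):

        cd path/to/plankton
        mkdir build && cd build
        CXX=g++-10 cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/tools -DINSTALL_FOLDER=plankton
        CXX=g++-10 make
        make install    # installs to $HOME/tools/plankton/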

Troubleshooting

There are some known issues you might run into.

Issue: missing external dependencies
The project includes the code for ANTLR; however, your system might be missing some dependencies needed to build it successfully. On Linux (tested with Ubuntu >18.04), make sure you have uuid, uuid-dev, and pkg-config installed.
Issue: timeout while cloning from GitHub
In some versions, the build process of ANTLR fetches the library utfcpp from GitHub. In some (mainly academic) network setups, this may fail and time out. The problem can be solved by forcing git to use https like so:

        git config --global url."https://".insteadOf git://
			
Issue: clang and libc++
When building with clang, you might run into problems compiling and linking ANTLR. To solve this (tested with Ubuntu >18.04), install libc++-dev and libc++abi-dev. Then, create a compiler wrapper stdclang like so:

        sudo touch /usr/bin/stdclang
        sudo chmod a+x /usr/bin/stdclang
			
Edit the created file /usr/bin/stdclang to contain this:

        #!/bin/bash
        # forward all arguments to clang++ and force libc++ as the standard library
        clang++ -stdlib=libc++ "$@"
			
With this, use stdclang as the CXX compiler in the above build commands; it properly sets up clang++ to use libc++.
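
For example, the generic build commands from above then become:

        CXX=/usr/bin/stdclang cmake .. -DCMAKE_INSTALL_PREFIX=/path/to/installation
        CXX=/usr/bin/stdclang make
        make install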

Docker

The repository comes with a Dockerfile that creates a Docker container with the necessary dependencies and an installation of plankton. To use it, run something like this:


        cd path/to/plankton
        docker build . -t plankton
        docker run -it plankton
			

The Docker container contains two versions of plankton, a release version at /artifact/plankton/ and a debug version with additional output during proof generation at /artifact/plankton-debug/.
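
For example, to run the benchmarks against the debug build (a sketch; it assumes that, as described under Benchmarking, each installation folder ships the benchmark script):

        docker run -it plankton
        # inside the container:
        cd /artifact/plankton-debug    # or /artifact/plankton for the release build
        python3 benchmark.py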


Benchmarking

We describe how to repeat the benchmarks from the paper for verifying lock-free set data structures.

Benchmark Programs

We evaluated our tool on the lock-free set implementations listed in the Expected Benchmark Results table below; see the References section for the corresponding papers.

Running Benchmarks

To run the benchmarks go to the installation folder and execute the benchmark script like so:


        cd path/to/installation
        python3 benchmark.py
			

Make sure to use python3: our test script uses the timeout feature of the subprocess library, which is available only in Python 3.5 and newer.

There is a known issue where the benchmark.py script reports failure although running plankton directly in the shell results in successful verification. The issue is due to Z3 spontaneously misbehaving when used with Python's subprocess library.

The benchmark.py script runs each benchmark 5 times, with a timeout of 6 hours per run. To modify the repetition count or the timeout, pass the appropriate command line arguments as shown below. You may also abort benchmark.py (Ctrl+C); this stops the benchmarking and produces the average of the benchmarks covered so far. (The benchmarks are run in a round-robin fashion so that preliminary results are likely to cover the entire benchmark suite.)


        python3 benchmark.py [<number_of_repetitions>] [<timeout_in_seconds>]
			

The benchmark script will output a table with the same columns as the expected results shown below.
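
As a concrete example of the invocation above, the following runs each benchmark 3 times with a timeout of one hour (3600 seconds) per run:

        python3 benchmark.py 3 3600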

Expected Benchmark Results

We repeated our evaluation on an Apple M1 Pro with 16GB of RAM. In the table below, each task is marked as successful, failed, or timed out.

Runtimes on traditional x86/amd64 hardware will be slower than the Apple Silicon runtimes reported here. However, we expect traditional hardware to produce the same tendencies and, most importantly, the same verification verdicts.
Program (on Apple M1)          Iter  Eff  Cand  Com  Fut  Hist  Join  Inter  Linearizability
Fine-Grained set                  2    5     2  11%  15%   43%   15%     8%              46s
Lazy set                          2    6     2  10%  13%   54%   11%     5%              77s
FEMRS tree (no maintenance)       2    5     2  19%   0%   49%    1%     9%             130s
Vechev&Yahav's 2CAS set           2    3     1  14%   0%   33%   31%     9%             125s
Vechev&Yahav's CAS set            2    4     1  15%   7%   39%   23%     6%              54s
ORVYY's set                       2    3     0  17%   0%   40%   26%     6%              47s
Michael's set                     2    4     2  11%  29%   30%   15%     6%             306s
Michael's set (wait-free)         2    4     2  11%  28%   30%   15%     6%             246s
Harris' set                       2    4     2   7%   8%   19%   32%     4%            1378s
Harris' set (wait-free)           2    4     2   8%  10%   17%   34%     3%            1066s

Stress Tests

We also stress tested plankton by running versions of our benchmark programs with purposely inserted bugs; where applicable, a bug was inserted into all of the benchmark programs. As expected, verification fails in all these cases. Note that plankton does not implement error-explanation techniques; hence, the reason for a verification failure reported by plankton may neither reflect nor hint at the actual bug.

The stress test suite can be repeated with the stresstest.py script, which is similar in usage to the benchmark.py script.


        python3 stresstest.py [<number_of_repetitions>] [<timeout_in_seconds>]
			

Found bugs are indicated in the script's output; if verification unexpectedly succeeds for a buggy test case, this is reported as well.
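
For example, to run the stress tests once with a timeout of 30 minutes (1800 seconds) per run:

        python3 stresstest.py 1 1800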


The plankton Tool

Instead of using the benchmark script, you can directly interact with plankton.

Basic Usage

You invoke plankton like so:


        ./plankton [-g] [--no-spurious] [--loopNoPostJoin] <path_to_program>
			

The flags and their arguments are described below (excerpt).

<path_to_program>  (required)
    Input program file to analyze. For examples, see Benchmark Programs.

--gist, -g  (optional; default: false)
    Outputs a machine-readable (read: easy-to-parse) string at the end of execution, reporting the overall runtime and the linearizability verdict.

--no-spurious  (optional; default: false)
    Instructs the internal rewrite engine to prevent desugared CAS commands from failing spuriously. Note that the C++ standard explicitly allows CAS to fail spuriously.

--loopNoPostJoin  (optional; default: false)
    Instructs proof construction to omit the join over the post-annotations of loops. This increases the precision of the proof.

Besides the above, --version prints the tool's version, and -h / --help print the full usage description.

Note that the --loopNoPostJoin flag is required for the FEMRS tree to be verified successfully. The remaining benchmarks do not require the increased precision. The benchmark.py script passes this flag automatically.
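
For example, a direct invocation might look as follows (a sketch; the program path is a placeholder for one of the example programs shipped with the installation, see Benchmark Programs):

        # path/to/program is a placeholder, not an actual file in the repository
        ./plankton --gist --loopNoPostJoin path/to/program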

Input Programs

plankton supports a simple C-style input language. The example programs shipped with the installation showcase the required components and the most common constructs. For a list of all supported constructs, refer to the ANTLR parser definition in the repository.


Notes on the Implementation

The following overview links the techniques presented in the paper to their implementation.

Programs, Logics
    Files: include/programs/ast.hpp, include/logics/ast.hpp
    Abstract syntax trees for programs and the logic.

Flows, Invariants
    Files: include/engine/config.hpp
    Configuration object that specifies the flow and the node invariants. Configuration objects are parsed from the input program. Currently, the flow is restricted to keyset flows.

Program Simplification
    Files: src/parser/*
    The parser supports a richer input language that is transformed into the one defined in include/programs/ast.hpp.

Commands
    Files: src/engine/solver/post_*.cpp
    Implementation of Rule (COM-SEM). There are specialized implementations for assignments to variables (post_assign_var.cpp), assignments to memory cells (post_assign_write.cpp), assumptions (post_assume.cpp), and allocation (post_malloc.cpp).

Loops
    Files: src/engine/proof/stmt.cpp
    Implementation of Rule (LOOP).

Flow Graphs
    Files: include/engine/flowgraph.hpp, src/engine/flowgraph/make.hpp
    Definition and generation of flow graphs. Flow graphs are generated wrt. a memory assignment by taking a fixed, finite number of reachable nodes, starting from the nodes participating in the assignment.

Histories
    Files: src/engine/solver/past.cpp
    Implementation of history introduction and reasoning. Note that this implements the how; to see when this is performed, refer to src/engine/proof/common.cpp.

Futures
    Files: src/engine/solver/future.cpp
    Implementation of future introduction and reasoning. Note that this implements the how; to see when this is performed, refer to src/engine/proof/common.cpp.

Interference
    Files: src/engine/solver/stability.cpp
    Applies interference (i.e., effects on the shared heap) such that annotations become stable (i.e., interference-free). Note that this implements the how; to see when this is performed, refer to src/engine/proof/common.cpp.

Join
    Files: src/engine/solver/join.cpp
    Implementation of the join operation, which deals with now, past, and future predicates. Resource-like entities (variable bindings, points-to predicates, proof obligation resources) are matched manually; the remaining entities are handled semantically by encoding them into SMT.

Entailment
    Files: src/engine/solver/implication.cpp
    Entailment operation, which deals with now, past, and future predicates. Resource-like entities (variable bindings, points-to predicates, proof obligation resources), futures, and pasts are matched syntactically; only the now stack is handled semantically by encoding it into SMT.

Encoding
    Files: include/engine/encoding.hpp, src/engine/encoding/*
    SMT encoding of now predicates. The encoding does not handle resource-like entities (variable bindings, points-to predicates, proof obligation resources).

Fixed Point
    Files: src/engine/proof/api.cpp, src/engine/proof/common.cpp
    Implements the fixed point (src/engine/proof/api.cpp) that saturates the interference set by repeatedly generating proofs for the API functions of the input program and extracting new interferences from the generated proofs. The generated proof is guaranteed to be interference-free: interferences are applied to the generated annotations unless a command is a right-mover (src/engine/proof/common.cpp). Past and future reasoning is applied whenever interference is applied, or as a pre-processing step of the join operation (src/engine/proof/common.cpp).

Linearizability
    Files: src/engine/proof/api.cpp, src/engine/solver/post_assign_write.cpp, src/engine/solver/ful.cpp, src/engine/encoding/spec.cpp
    Linearizability is checked for every proof generated for an API function (src/engine/proof/api.cpp). The check is performed directly for impure memory assignments (src/engine/solver/post_assign_write.cpp) or using past (hindsight) reasoning (src/engine/solver/ful.cpp) on top of a function's proof (src/engine/proof/api.cpp). Whether the specification is met is checked by encoding the keyset theorem in SMT (src/engine/encoding/spec.cpp).

References

FeldmanEMRS18 Yotam M. Y. Feldman, Constantin Enea, Adam Morrison, Noam Rinetzky, and Sharon Shoham. 2018. Order out of Chaos: Proving Linearizability Using Local Views. In DISC (LIPIcs, Vol. 121). Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.DISC.2018.23
Harris01 Timothy L. Harris. 2001. A Pragmatic Implementation of Non-blocking Linked-Lists. In DISC (LNCS, Vol. 2180). Springer. https://doi.org/10.1007/3-540-45414-4_21
HellerHLMSS05 Steve Heller, Maurice Herlihy, Victor Luchangco, Mark Moir, William N. Scherer III, and Nir Shavit. 2005. A Lazy Concurrent List-Based Set Algorithm. In OPODIS (LNCS, Vol. 3974). Springer. https://doi.org/10.1007/11795490_3
Michael02a Maged M. Michael. 2002. High Performance Dynamic Lock-Free Hash Tables and List-Based Sets. In SPAA. ACM. https://doi.org/10.1145/564870.564881
OHearnRVYY10 Peter W. O’Hearn, Noam Rinetzky, Martin T. Vechev, Eran Yahav, and Greta Yorsh. 2010. Verifying linearizability with hindsight. In PODC. ACM. https://doi.org/10.1145/1835698.1835722
HerlihyS08 Maurice Herlihy and Nir Shavit. 2008. The art of multiprocessor programming (Chapter 9.5). Morgan Kaufmann. https://isbnsearch.org/isbn/9780123705914
VechevY08 Martin T. Vechev and Eran Yahav. 2008. Deriving linearizable fine-grained concurrent objects. In PLDI. ACM. https://doi.org/10.1145/1375581.1375598