Abstract.
Verification of concurrent data structures is one of the most challenging tasks in software verification.
The topic has received considerable attention over the course of the last decade.
Nevertheless, human-driven techniques remain cumbersome and notoriously difficult while automated approaches suffer from limited applicability.
The main obstacle for automation is the complexity of concurrent data structures.
This is particularly true in the absence of garbage collection.
The intricacy of lock-free memory management paired with the complexity of concurrent data structures makes automated verification prohibitive.
In this work we present a method for verifying concurrent data structures and their memory management separately.
We suggest two simpler verification tasks that
imply the correctness of the data structure.
The first task establishes an over-approximation of the reclamation behavior of the memory management.
The second task exploits this over-approximation to verify the data structure without the need to consider the implementation of the memory management itself.
To make the resulting verification tasks tractable for automated techniques, we establish a second result.
We show that a verification tool needs to consider only executions where a single memory location is reused.
We implemented our approach and were able to verify linearizability of Michael&Scott's queue and the DGLM queue for both hazard pointers and epoch-based reclamation.
To the best of our knowledge, we are the first to verify such implementations fully automatically.
CCS Concepts: • Theory of computation → Data structures design and analysis; Program verification; Shared memory algorithms; Program specifications; Program analysis;
Additional Key Words and Phrases: static analysis, lock-free data structures, verification, linearizability, safe memory reclamation, memory management
ACM Reference Format:
Roland Meyer and Sebastian Wolff. 2019. Decoupling Lock-Free Data Structures from Memory Reclamation for Static Analysis. Proc. ACM Program. Lang. 3, POPL, Article 58 (January 2019), 31 pages. https://doi.org/10.1145/3290371
Our paper presents an analysis for verifying lock-free data structures that use safe memory reclamation (SMR) techniques. We follow a compositional approach which factors out the verification of the SMR implementation. Consequently, we have two distinct tools for:
The SMR specification is given in form of an observer automata (hard coded into our tools). For data structures, we verify liniearizability (we assume that programs are annotated with linearization points).
We submitted our tool to the POPL'19 artifact evaluation where it received the Artifact Evaluated -- Reusable badge.
In the this page we discuss how to reproduce our experiments. Therefore, we walk through the installation, compilation, and benchmarking. In the end, we briefly link the concepts and techniques from the paper to our implementation.
Our tools do not come with a front end (aka parser). To analyse programs not included in our benchmarks, one has to compile an appropriate binary first. If desired, consider the provided benchmark implementations. For example, consider:
TMRexp-POPL19/test/HP/MichaelScottQueue.cpp
, orTMRexp-POPL19/test/HP/Stack.cpp
.TMRexp-POPL19/src/prog.hpp
, lines 655f).
This section describes how to setup your system to repeat the benchmarks from the paper.
~/Desktop/pw.txt
).
The artifact is located at ~/popl19artifact/
.
The folder ~/popl19artifact/Binaries
contains executables corresponding to the benchmarks from the paper.
If you choose this option you can skip the remainder of this section.
The source code can be downloaded from GitHub here:
For a unified description of the installation/compilation for both tools, we assume that unzipping yields a folder named TMRexp-POPL19
.
To compile this project you need CMake, a C++17 compatible compiler, and Boost multi_array. Compilation was tested with Apple LLVM v.9.1.0, Apple LLVM v.10.0.0, Clang v.5.0, and Clang v.6.0.
To build the project, simply type make in the root directory of the downloaded sources.
cd path/to/TMRexp-POPL19
make
This creates a folder build/bin
containing all executables.
Alternatively, you can do the following (for example, to specify the compiler to be used):
mkdir build
cd build
cmake .. -DCMAKE_CXX_COMPILER=/path/to/CXX-compiler
make
Again, folder build/bin
contains all executables.
We describe how to repeat the benchmarks for verifying lock-free data structures relative to an SMR implementation from the paper.
~/popl19artifact/Binaries/<Program>_<SMR>
to run the benchmark for Program
using SMR
.
When compiling on your own, execute TMRexp-POPL19/build/bin/<Program>_<SMR>
(for the DGLM Queue see the discussion below).
We evaluated our tool on a Coarse Stack, a Coarse Queue, Treiber's Stack, Michael&Scott's Queue, and the DGLM queue using no SMR (N
), No Reclamation/Garbage Collection (G
), Hazard Pointers (H
), and Epoch-Based Reclamation (E
).
C-style pseudo code is given in the following.
We mark lines that are required only for a specific SMR by the aforementioned letters N
, G
, H
, and E
.
To run the benchmarks got to the build/bin/
folder and execute the corresponding binary.
The naming scheme is <DS>_<SMR>
where <DS>
is the name of the benchmark program and <SMR>
is the reclamation scheme.
For example, to benchmark the running example from the paper, Michael&Scott's Queue using Hazard Pointers (HP), you have to do the following:
cd path/to/TMRexp-POPL19/build/bin
./MichaelScottQueue_HP
As stated in the paper, the implemented shape abstraction requires hints to verify the DGLM Queue.
To enable/disable hinting you have to set the corresponding flag in src/config.hpp
and recompile the tool.
More precisely, you have to change line 22 from
#define DGLM_HINT false
to
#define DGLM_HINT true
Beware: using DGLM_HINT=true
with any benchmark other than the DGLM Queue exhibits undefined behavior!
For users of the VM image: the folder ~/popl19artifact/Binaries/
already contains the properly compiled binaries.
The binaries where compiled from ~/popl19artifact/chkds/
and ~/popl19artifact/chkds-DGLM/
for all benchmarks except the DGLM Queue and the DGLM Queue, respectively.
As reported in the paper, running the benchmarks on an Intel Xeon X5650@2.67GHz with Ubuntu 16.04 and using Clang version 5.0 we expect the following running times and verification results.
Program | SMR | Time Verif. | States | ABAs | Time ABA | Linearizable |
---|---|---|---|---|---|---|
Coarse Stack | GC | 0.44s | 300 | 0 | 0s | yes |
None | 0.5s | 300 | 0 | 0s | yes | |
Coarse Queue | GC | 1.7s | 300 | 0 | 0s | yes |
None | 1.7s | 300 | 0 | 0s | yes | |
Treiber's Stack | GC | 3.2s | 806 | 0 | 0s | yes |
EBR | 16s | 1822 | 0 | 0s | yes | |
HP | 19s | 2606 | 186 | 0.06s | yes | |
Opt. Treiber's Stack | HP | 0.8s | - | - | - | no* |
Michael&Scott's Queue | GC | 414s | 2202 | 0 | 0s | yes |
EBR | 2630s | 7613 | 0 | 0s | yes | |
HP | 7075s | 19028 | 536 | 0.9s | yes | |
DGLM Queue | GC | 714s | 9934 | 0 | 0s | yes** |
EBR | 3754s | 27132 | 0 | 0s | yes** | |
HP | 7010s | 41753 | 2824 | 26s | yes** |
* Pointer Race due to an ABA in push
makes the verification fail.
** Imprecision in the heap abstraction required hinting that Head
cannot overtake Tail
by more than one node.
We describe how to repeat the benchmarks for verifying SMR implementations from the paper.
~/popl19artifact/Binaries/<SMR>
to run the benchmark for the SMR implementation SMR
.
When compiling on your own, execute TMRexp-POPL19/build/bin/<SMR>
.
We evaluated our tool on an Hazard Pointer (HP) and an Epoch-based Reclamation (EBR) implementation both of which support dynamic joining/parting of threads. C-style pseudo code is given in the following.
To run the benchmarks got to the build/bin/
folder and execute the corresponding binary.
The naming scheme is <SMR>_<Info>
where <SMR>
is the (abbreviated) name of the benchmark program and <Info>
is a description of the implementation (we have various implementations for each class).
For example, to benchmark the running example from the paper, the Hazard Pointer implementation, you have to do the following:
cd path/to/TMRexp-POPL19/build/bin
./HPimpl
As reported in the paper, running the benchmarks on an Intel Xeon X5650@2.67GHz with Ubuntu 16.04 and using Clang version 5.0 we expect the following running times and verification results.
Program | Observer | Time | States | Correct |
---|---|---|---|---|
Hazard Pointers | OBase x OHP | 1.5s | 5437 | yes |
Epoch-based Reclamation | OBase x OEBR | 11.2s | 11528 | yes |
When executing a binary from above, our tool prints the following:
For an example, the output of chsmr HPimpl
has the following form:
post image... [#enc] done! [#enc=0.9k, #step=0k, #steptotal=0k]
interference... [32 buckets][#bucketsize][198][129][130][129][4][59][59][59][105][105][70][70][4][59][59][59][59][59][17][17][105][105][59][59][101][37][70][70][17][17][37][67] done! [#enc=2.2k, #step=85k, #steptotal=40k]
post image... [#enc][2k][3k][4k][4k] done! [#enc=5.0k, #step=4k, #steptotal=5k]
interference... [32 buckets][#bucketsize][274][253][255][253][79][156][156][156][223][223][157][159][79][156][156][156][154][154][93][95][223][223][154][154][217][107][157][157][93][93][105][153] done! [#enc=5.2k, #step=172k, #steptotal=96k]
post image... [#enc] done! [#enc=5.4k, #step=0k, #steptotal=5k]
interference... [32 buckets][#bucketsize][274][253][255][253][156][156][156][156][223][223][159][159][156][156][156][156][154][154][95][95][223][223][154][154][217][135][157][157][93][93][133][153] done! [#enc=5.4k, #step=173k, #steptotal=151k]
post image... [#enc] done! [#enc=5.4k, #step=0k, #steptotal=5k]
interference... [32 buckets][#bucketsize][274][253][255][253][156][156][156][156][223][223][159][159][156][156][156][156][154][154][95][95][223][223][154][154][217][135][157][157][93][93][133][153] done! [#enc=5.4k, #step=173k, #steptotal=207k]
The output tells you that sequential steps (post image...
) and interference steps (interference...
) are executed in rounds.
Sequential steps print the current size of the fixed point in regular intervals.
For example, in the third line we have [#enc][2k]
meaning that there are 2000 views at the time of the output.
Interference steps print the number of buckets.
The explored views are stored in buckets to reduce the search space of interference steps.
Views from different buckets are guaranteed to not match, i.e., are guaranteed to not require interference.
In the second line, for example, we have [32 buckets][#bucketsize][274]
.
This states that there are 32 different buckets and that the first bucket contains 274 views.
The remaining output is straight forward.
The following tables relate (informal) descriptions of the implemented analysis from the paper to the implementation.
Paper | Implementation | Description |
---|---|---|
Views | src/cfg.hpp |
Views are implemented by the class Cfg .
They carry, among other things, a shape abstraction of the memory, validity information, and observer states.
|
Thread-modular fixed point | src/fixpoint.cpp |
The fixed point computation maintains a collection of views the post image of which may not have been explored.
The post image of these views is computed and added to the fixed point.
The implementation proceeds in rounds.
In each round sequential steps are performed exhaustively first and then interference steps are applied.
(There is an alternative fixed point computation using a work list approach. It can be activated by setting the WORKLIST_INTERFERENCE to true in the configuration file.)
|
Shape abstraction | src/shape.hpp |
Class Shape implements a container for a shape abstraction of a specific memory layout.
More details on the shape abstraction can be found in Abdulla et al. [2013,2017].
|
src/post/*.cpp |
Contains post-image computation for the shape abstraction for every supported statement type. | |
Interference | src/fixp/interference.cpp |
Implements interference steps. Such a step considers all pairs of previously explored views, combines them into a two-thread view, computes a post image for one of the threads, and projects the result to the other thread. The resulting views are added to the fixed point. |
Observers | src/observer.hpp |
Class Observer implements observer automata as presented in the paper.
|
test/ObserverFactory.hpp |
Creates observer instances along the lines of the paper. |
Paper | Implementation | Description |
---|---|---|
Views | src/cfg.hpp |
Besides the above, also stores freedness information of the single reusable cell. |
Elision support | test/ObserverFactory.hpp |
The implementation relies on Lemma 6.1, thus does not check if the observer supports elision (Definition 5.14). |
Environment free |
src/post/mem.cpp |
Following the semantics for environment frees, rule (Free), the implementation checks which frees can be executed by checking whether the observer reaches a final state upon consuming a free event. |
Reallocation of a single address | src/post/mem.cpp |
To identify the single address available for reuse, shapes carry a special ghost pointer to the reusable cell. Moreover, views carry a flag stating whether or not the cell is freed. If so, an allocation can yield a pointer to that cell. Otherwise, the allocation can only yield fresh cells. |
Validity | src/cfg.hpp |
Views (class Cfg ) carry validity information about pointers.
|
src/post/*.cpp |
The post image operator updates the validity according to the definition in the paper. | |
Pointer race check | src/post.cpp |
The post image operators check whether an unsafe access or a racy SMR call is performed.
The latter relies on Lemma 6.2 and deems racy calls to retire that use an invalid pointer.
|
Harmful ABA check | src/chkaware.cpp |
Checks for every view where an invalid pointer can be compared in an assert for harmful ABAs.
The implementation follows the description in the paper.
|
Paper | Implementation | Description |
---|---|---|
Views | src/cfg.hpp |
Besides the above, also stores data (set) abstractions. |
Retain records | src/shape.hpp |
Stores the myRecord pointer of a non-view thread.
|
Last retire |
test/ObserverFactory.hpp |
Introduces redundant observer locations to judge whether or not the observed thread performed the last retire call.
|
src/fixp/interference.cpp |
Prunes interference among views for identical threads. | |
Data (set) abstraction | src/post/*.cpp |
Contains post-image computation for the data abstraction for every supported statement type. |