Concurrency Analysis
Thread-aware analyses for concurrent programs.
Headers: include/Concurrency
Implementation: lib/Concurrency
Overview
The concurrency analysis module provides the shared analysis layer used by Lotus concurrency checking. It centers on thread creation, synchronization operations, memory accesses, and communication structure that can be recovered from LLVM IR.
The module supports multiple concurrency models:
POSIX threads (pthreads): standard pthread APIs
Modern C++ concurrency:
std::thread, mutexes, condition variables, futures/promises, atomics, and several C++20 primitives that are recognizedOpenMP: parallel regions, barriers, tasks, and lock operations
MPI: a separate SPMD communication analysis in
Concurrency/MPI/Custom APIs: configurable through
config/thread.specandThreadAPI
Key capabilities:
May-Happen-in-Parallel (MHP) analysis for shared-memory overlap reasoning
Lock-set analysis for race and deadlock checking
Happens-before computation from thread-flow and synchronizes-with edges
Escape and thread-sharing analyses for shared-memory filtering
Join-target reasoning for refining
pthread_joineffectsThread flow graph construction and thread API classification utilities
Directory layout
The source tree is grouped by subdirectory under include/Concurrency/ and
lib/Concurrency/:
Utils/:ThreadAPI,ThreadFlowGraph, vector-clock utilities, RAII lock tracking, and language models for C++, OpenMP, MPI, and Linux kernel APIsMHP/:MHPAnalysis,StaticVectorClockMHP, andHappensBeforeAnalysisLockSet/:LockSetAnalysisMemory/:EscapeAnalysisandStaticThreadSharingAnalysisJoinTarget/:JoinTargetAnalysisMPI/:MPIAnalysisand its process, collective, rank, and RMA analyses
Concurrency/MPI/ models MPI communication in the SPMD setting. It complements
the shared-memory analyses above, but it is not another thread library layered on
top of MHPAnalysis.
Main Components
MHPAnalysis
File: MHP/MHPAnalysis.cpp, MHP/MHPAnalysis.h
The core May-Happen-in-Parallel analysis determines which pairs of program statements may execute concurrently in a multithreaded program.
Analysis Process:
Thread Flow Graph Construction: Builds a graph representation of thread creation, synchronization operations, and their ordering relationships
Thread Region Analysis: Identifies regions of code that execute in different threads
Happens-Before Analysis: Computes happens-before relationships using fork-join semantics, locks, condition variables, and barriers
MHP Pair Computation: Precomputes pairs of instructions that may execute in parallel
Key Features:
Fork-join analysis for thread creation and termination
Lock-based synchronization analysis (optional integration with LockSetAnalysis)
Condition variable analysis (wait/signal/broadcast)
Barrier synchronization support
Multi-instance thread tracking (e.g., threads created in loops)
Efficient query interface with precomputed results
Query Interface:
mayHappenInParallel(I1, I2)– Check if two instructions may execute concurrentlymustBeSequential(I1, I2)– Check if two instructions must execute sequentiallygetParallelInstructions(inst)– Get all instructions that may run in parallelisPrecomputedMHP(I1, I2)– Check if a pair is in the precomputed MHP set
LockSetAnalysis
File: LockSet/LockSetAnalysis.cpp, LockSet/LockSetAnalysis.h
Computes the sets of locks that may or must be held at each program point. Essential for data race detection, deadlock detection, and precise MHP analysis.
Analysis Modes:
May-Lockset Analysis (over-approximation): - Computes the set of locks that may be held at each program point - Uses union operation at control flow merge points - Used for conservative data race detection
Must-Lockset Analysis (under-approximation): - Computes the set of locks that must be held at each program point - Uses intersection operation at control flow merge points - Used for proving absence of data races
Supported Operations:
pthread_mutex_lock/unlockpthread_rwlock_rdlock/wrlock/unlocksem_wait/postpthread_mutex_trylock(try-lock operations)Reentrant lock handling
Lock aliasing support
Features:
Intraprocedural lock set computation using dataflow analysis
Interprocedural lock set propagation across function calls
Lock ordering tracking for deadlock detection
Integration with alias analysis for precise lock identification
Query Interface:
getMayLockSetAt(inst)– Get may-lockset at an instructiongetMustLockSetAt(inst)– Get must-lockset at an instructionisLockHeldAt(lock, inst)– Check if a specific lock is held
JoinTargetAnalysis
File: JoinTarget/JoinTargetAnalysis.cpp,
JoinTarget/JoinTargetAnalysis.h
Computes which pthread_create sites may match a given pthread_join by
reasoning about the joined thread handle. This is used to refine thread
termination effects beyond a simple name-based match.
Use cases:
Refining join reasoning when multiple thread handles may alias
Supporting more precise MHP pruning around thread termination
Providing a dedicated analysis for the
JoinTarget/subdirectory that now exists in the source tree
ThreadAPI
File: Utils/ThreadAPI.cpp, Utils/ThreadAPI.h
Provides a unified interface for identifying and categorizing thread-related operations in LLVM IR. Acts as an abstraction layer over different threading models.
Supported Operations:
Thread Management: Fork, join, detach, exit, cancel
Synchronization: Lock acquire/release, try-lock
Condition Variables: Wait, signal, broadcast
Barriers: Init, wait
Mutex Management: Init, destroy
Threading Model Support:
POSIX threads: pthread_create, pthread_join, pthread_mutex_lock, etc.
C++11 threads: std::thread, std::mutex, std::condition_variable
OpenMP: Parallel regions, barriers, critical sections
Custom APIs: Configurable via configuration files
Key Methods:
isTDFork(inst)– Check if instruction creates a threadisTDJoin(inst)– Check if instruction joins a threadisTDAcquire(inst)– Check if instruction acquires a lockisTDRelease(inst)– Check if instruction releases a lockgetForkedThread(inst)– Get the pthread_t value for a forkgetForkedFun(inst)– Get the function executed by a thread
HappensBeforeAnalysis
File: MHP/HappensBeforeAnalysis.cpp, MHP/HappensBeforeAnalysis.h
Determines happens-before relationships between instructions using MHP analysis and synchronization operations. The happens-before relation is fundamental for reasoning about memory ordering and data races.
Happens-Before Sources:
Program Order: Sequential execution within a thread
Fork-Join: Thread creation and join edges
Locks: lock and unlock synchronization
Condition Variables: wait, signal, and broadcast are recognized as thread flow boundaries
Barriers and atomics: barriers, fences, and selected atomic patterns
Features:
Vector clock-based implementation (
FBVClock,BVClock)Efficient query interface
Integration with MHP analysis
Some synchronization constructs are recognized more precisely than they are
fully proved. For example, std::call_once, std::latch,
std::barrier, and condition-variable signaling are modeled conservatively,
and definite happens-before edges are still limited for some of them.
ThreadFlowGraph
File: Utils/ThreadFlowGraph.cpp, Utils/ThreadFlowGraph.h
Implements a graph representation of thread synchronization operations and their ordering relationships. Used as an intermediate representation for MHP and happens-before analyses.
Node Types:
Thread fork/join nodes
Lock acquire/release nodes
Condition variable nodes
Barrier nodes
Thread exit nodes
Features:
Graph construction from thread API calls
Edge representation of ordering relationships
Support for visualization and debugging
EscapeAnalysis
File: Memory/EscapeAnalysis.cpp, Memory/EscapeAnalysis.h
Determines which values escape their thread-local scope and become shared between threads. Essential for identifying which memory locations may be accessed by multiple threads.
Analysis Process:
Thread-Local Identification: Identifies values that are thread-local
Escape Detection: Tracks when values are passed to other threads
Shared Memory Identification: Marks memory locations as shared
Use Cases:
Data race detection (only shared memory can have races)
Memory safety analysis in concurrent contexts
Optimizing thread-local storage
StaticVectorClockMHP
File: MHP/StaticVectorClockMHP.cpp, MHP/StaticVectorClockMHP.h
Implements a static vector clock-based approach for MHP analysis. Uses vector clocks to efficiently track happens-before relationships and compute MHP pairs.
Features:
Fast bit-vector clock implementation (FBVClock)
Efficient MHP computation
Scalable to large programs
StaticThreadSharingAnalysis
File: Memory/StaticThreadSharingAnalysis.cpp, Memory/StaticThreadSharingAnalysis.h
Analyzes which memory locations are shared between threads using static analysis. Combines escape analysis with thread flow information to identify shared memory.
Features:
Static identification of shared memory
Integration with alias analysis
Support for complex pointer structures
Usage
Basic MHP Analysis:
#include <Concurrency/MHP/MHPAnalysis.h>
llvm::Module &M = ...;
mhp::MHPAnalysis mhp(M);
mhp.analyze();
const llvm::Instruction *I1 = ...;
const llvm::Instruction *I2 = ...;
if (mhp.mayHappenInParallel(I1, I2)) {
// I1 and I2 may execute concurrently
}
LockSet Analysis:
#include <Concurrency/LockSet/LockSetAnalysis.h>
llvm::Module &M = ...;
mhp::LockSetAnalysis lsa(M);
lsa.analyze();
const llvm::Instruction *inst = ...;
mhp::LockSet mayLocks = lsa.getMayLockSetAt(inst);
mhp::LockSet mustLocks = lsa.getMustLockSetAt(inst);
Combined Analysis for Data Race Detection:
#include <Concurrency/MHP/MHPAnalysis.h>
#include <Concurrency/LockSet/LockSetAnalysis.h>
#include <Concurrency/Memory/EscapeAnalysis.h>
llvm::Module &M = ...;
// Setup analyses
mhp::MHPAnalysis mhp(M);
mhp.enableLockSetAnalysis();
mhp.analyze();
mhp::EscapeAnalysis escape(M);
escape.analyze();
// Check for data races
for (auto &I1 : instructions) {
for (auto &I2 : instructions) {
if (mhp.mayHappenInParallel(I1, I2) &&
isMemoryAccess(I1) && isMemoryAccess(I2) &&
mayAlias(I1, I2) &&
(isWrite(I1) || isWrite(I2)) &&
!isProtectedByCommonLock(I1, I2, lsa) &&
escape.isShared(getMemoryLocation(I1))) {
// Potential data race detected
}
}
}
ThreadAPI Usage:
#include <Concurrency/Utils/ThreadAPI.h>
ThreadAPI *api = ThreadAPI::getThreadAPI();
for (auto &F : M) {
for (auto &I : instructions(F)) {
if (api->isTDFork(&I)) {
const llvm::Value *thread = api->getForkedThread(&I);
const llvm::Value *func = api->getForkedFun(&I);
// Process thread creation
}
}
}
Data race detection: when we report
The data race checker reports one bug per racy component (a connected set of conflicting accesses), not one per instruction pair. This reduces noise and matches the “one report per logical race” idea used in tools like Goblint.
When two accesses are considered a race
A potential data race is reported only when all of the following hold:
May happen in parallel (MHP) – The two instructions may execute concurrently.
At least one is a write – Read-only pairs are not races.
Not ordered by happens-before – e.g. C11 synchronizes-with; if one is guaranteed to happen before the other, no race.
May access the same location – Alias analysis says the two memory locations may overlap.
Not protected by a common lock – Lock-set analysis shows no lock is held for both.
Location is shared – Escape analysis says the memory may be shared (e.g. escaped, or global).
Same location
“Same location” means: the two instructions’ memory locations may alias and the
location is escaped (or otherwise shared). Accesses to thread-local, non-escaped
storage are skipped. Accesses to ignorable types (sync objects, FILE, atomics,
etc.) are never considered for races; see the checker’s ignorable-type list.
One report per component
Racy pairs are grouped into components: two accesses are in the same component if they are connected by a chain of “may race” pairs. The checker emits one report per component, with a representative pair and a step for each conflicting access. So multiple overlapping pairs on the same variable yield a single report.
Example
Two threads both write to shared_counter without a lock: the checker finds one
racy component (both writes), and reports one data race with two steps (one per
write). If a third thread also wrote to shared_counter, the same single report
would list all three accesses as steps.
Typical use cases:
Data race detection in multithreaded programs
Deadlock detection through lock ordering analysis
Verifying thread safety of concurrent data structures
Identifying potentially racy memory accesses
Providing concurrency-aware information to higher-level analyses
Security analysis of concurrent code
Limitations
Atomic Operations: Limited support for fine-grained memory ordering (std::memory_order_relaxed, acquire, release, etc.). The ThreadAPI maps atomics roughly to locks/barriers but doesn’t handle the fine-grained “synchronizes-with” relationships of weak atomics.
Dynamic Thread Creation: Analysis may be conservative for threads created in loops or conditionally, potentially leading to false positives in MHP analysis.
Interprocedural Precision: Some analyses (e.g., LockSetAnalysis) support interprocedural analysis, but precision may be limited for complex call patterns or indirect calls.
Alias Analysis Dependency: Accurate results require precise alias analysis. Imprecise alias analysis can lead to false positives in data race detection.
Language Model Coverage: While supporting pthreads, C++11, and OpenMP, some less common threading APIs may not be recognized without configuration.
Real-Time Constraints: The analysis does not model real-time scheduling constraints or priority-based execution ordering.