Developer Guide
This guide explains how to extend Lotus with custom analyses, checkers, and tools.
Development Environment Setup
Prerequisites
LLVM 14.0.0 development libraries
Z3 4.11 with headers
CMake 3.10+
C++14 compatible compiler (GCC 7+, Clang 5+)
Git for version control
Building from Source
git clone https://github.com/ZJU-PL/lotus
cd lotus
mkdir build && cd build
cmake ../ -DCMAKE_BUILD_TYPE=Debug
make -j$(nproc)
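A Debug build enables assertions and debug symbols. If the project's CMake configuration does not already export a compilation database, add -DCMAKE_EXPORT_COMPILE_COMMANDS=ON to the cmake command so that build/compile_commands.json is generated for the IDE setup described next.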
IDE Configuration
VSCode: Create .vscode/c_cpp_properties.json:
{
"configurations": [{
"name": "Linux",
"includePath": [
"${workspaceFolder}/include",
"${workspaceFolder}/build/include",
"/path/to/llvm/include",
"/usr/include/z3"
],
"compileCommands": "${workspaceFolder}/build/compile_commands.json"
}]
}
CLion: Open the project and point to build/compile_commands.json.
Code Organization
Directory Structure
lotus/
├── include/ # Public headers
│ ├── Alias/ # Alias analysis interfaces
│ ├── Analysis/ # Analysis utilities
│ ├── Apps/ # Application-level interfaces
│ ├── CFL/ # CFL reachability
│ ├── Dataflow/ # Data flow frameworks
│ ├── IR/ # Intermediate representations
│ ├── Solvers/ # Constraint solvers
│ ├── Transform/ # LLVM transformations
│ └── Utils/ # Utilities
├── lib/ # Implementation files
│ ├── Alias/ # Alias analysis implementations
│ ├── Analysis/ # Analysis implementations
│ ├── Apps/ # Applications (Clam, checkers)
│ ├── CFL/ # CFL implementations
│ ├── Dataflow/ # Data flow implementations
│ ├── IR/ # IR implementations
│ ├── Solvers/ # Solver implementations
│ ├── Transform/ # Transformation passes
│ └── Utils/ # Utility implementations
├── tools/ # Command-line tools
│ ├── alias/ # Alias analysis tools
│ ├── checker/ # Bug checking tools
│ ├── cfl/ # CFL tools
│ └── verifier/ # Verification tools
├── tests/ # Test cases
├── benchmarks/ # Benchmark programs
└── docs/ # Documentation
Coding Standards
Naming Conventions:
Classes: CamelCase (e.g., PointerAnalysis)
Functions: camelCase (e.g., getPointsToSet)
Variables: snake_case (e.g., points_to_set)
Constants: UPPER_CASE (e.g., MAX_ITERATIONS)
Member variables: prefix with m_ or _
Code Style:
2-space indentation
100 character line limit
Use LLVM coding standards where applicable
Add Doxygen comments for public APIs
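The conventions above can be seen together in a small sketch. The class and its members are made up for illustration and do not exist in Lotus; only the naming, the 2-space indentation, and the Doxygen comment style are the point.

```cpp
#include <cstddef>
#include <set>
#include <string>

/// Upper bound on fixpoint iterations (constant: UPPER_CASE).
constexpr std::size_t MAX_ITERATIONS = 100;

/**
 * @brief Toy container used only to illustrate the naming rules.
 */
class PointerAnalysis {  // class: CamelCase
public:
  /// Record that the tracked pointer may point to @p object_name.
  void addTarget(const std::string &object_name) {  // function: camelCase
    m_points_to_set.insert(object_name);            // parameter: snake_case
  }

  /// @return the number of distinct targets recorded so far.
  std::size_t getTargetCount() const { return m_points_to_set.size(); }

private:
  std::set<std::string> m_points_to_set;  // member: m_ prefix
};
```

Adding the same target twice leaves getTargetCount() unchanged, because the member is a std::set.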
Adding a New Alias Analysis
Step 1: Create Directory Structure
cd lotus
mkdir -p include/Alias/MyAA
mkdir -p lib/Alias/MyAA
Step 2: Define Header
Create include/Alias/MyAA/MyAliasAnalysis.h:
#pragma once
#include "llvm/Pass.h"
#include "llvm/IR/Module.h"
#include "llvm/Analysis/AliasAnalysis.h"
#include <set>
#include <map>
namespace myaa {
/**
* @brief My custom alias analysis
*
* This analysis computes pointer alias information using [describe algorithm].
*/
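class MyAliasAnalysis : public llvm::ModulePass {
// In LLVM 14, llvm::AliasAnalysis is a typedef for llvm::AAResults, which has
// no virtual alias() to override and no default constructor. This class is
// therefore a plain ModulePass that exposes its own alias() query.
public:
static char ID;
MyAliasAnalysis() : ModulePass(ID) {}
/// Main analysis entry point
bool runOnModule(llvm::Module &M) override;
/// Query if two values may alias
llvm::AliasResult alias(const llvm::Value *V1,
const llvm::Value *V2);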
/// Get analysis usage
void getAnalysisUsage(llvm::AnalysisUsage &AU) const override;
/// Get pass name
llvm::StringRef getPassName() const override {
return "My Alias Analysis";
}
private:
/// Points-to sets: pointer -> objects it may point to
std::map<const llvm::Value*, std::set<const llvm::Value*>> points_to_;
/// Compute points-to sets
void computePointsTo(llvm::Module &M);
/// Process a single instruction
void processInstruction(llvm::Instruction *I);
};
} // namespace myaa
Step 3: Implement Analysis
Create lib/Alias/MyAA/MyAliasAnalysis.cpp:
#include "Alias/MyAA/MyAliasAnalysis.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;
using namespace myaa;
char MyAliasAnalysis::ID = 0;
bool MyAliasAnalysis::runOnModule(Module &M) {
// Initialize points-to sets
points_to_.clear();
// Compute points-to information
computePointsTo(M);
return false; // Did not modify the module
}
void MyAliasAnalysis::computePointsTo(Module &M) {
for (auto &F : M) {
if (F.isDeclaration()) continue;
for (auto &BB : F) {
for (auto &I : BB) {
processInstruction(&I);
}
}
}
}
void MyAliasAnalysis::processInstruction(Instruction *I) {
// Handle different instruction types
if (auto *alloca = dyn_cast<AllocaInst>(I)) {
// Alloca creates a new object
points_to_[alloca].insert(alloca);
} else if (auto *load = dyn_cast<LoadInst>(I)) {
// Load: simplified placeholder that copies the points-to set of the address.
// A sound analysis would instead propagate what stores have written into
// each object that ptr may point to.
Value *ptr = load->getPointerOperand();
auto it = points_to_.find(ptr);
if (it != points_to_.end()) {
points_to_[load].insert(it->second.begin(), it->second.end());
}
} else if (auto *store = dyn_cast<StoreInst>(I)) {
// Store: update points-to at target location
Value *ptr = store->getPointerOperand();
Value *val = store->getValueOperand();
(void)ptr; // Unused until the store logic is filled in
(void)val;
// Handle store logic...
} else if (auto *gep = dyn_cast<GetElementPtrInst>(I)) {
// GEP: field access
Value *base = gep->getPointerOperand();
if (points_to_.count(base)) {
points_to_[gep] = points_to_[base];
}
}
// Add more cases as needed...
}
AliasResult MyAliasAnalysis::alias(const Value *V1, const Value *V2) {
// Identical values trivially alias
if (V1 == V2) {
return AliasResult::MustAlias;
}
// Unknown pointers: answer conservatively
auto it1 = points_to_.find(V1);
auto it2 = points_to_.find(V2);
if (it1 == points_to_.end() || it2 == points_to_.end()) {
return AliasResult::MayAlias;
}
// Check for intersection
for (auto *obj : it1->second) {
if (it2->second.count(obj)) {
// Overlapping (even equal) sets only show that the pointers may alias,
// e.g. two GEPs into different fields of the same object
return AliasResult::MayAlias;
}
}
// Disjoint points-to sets cannot alias
return AliasResult::NoAlias;
}
void MyAliasAnalysis::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();
}
// Register pass
static RegisterPass<MyAliasAnalysis> X("my-aa", "My Alias Analysis");
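The listing above leaves the store case open. One common way to model loads and stores is inclusion-based (Andersen-style) propagation; the sketch below shows it without any LLVM dependency. It is an illustration under stated assumptions, not the Lotus implementation: Constraint, Kind, and solve are hypothetical names, and strings stand in for llvm::Value pointers.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical constraint kinds; names stand in for llvm::Value pointers.
enum class Kind { AddrOf, Copy, Load, Store };

struct Constraint {
  Kind kind;
  std::string lhs;  // AddrOf: lhs = &rhs | Copy: lhs = rhs
  std::string rhs;  // Load:   lhs = *rhs | Store: *lhs = rhs
};

using PointsTo = std::map<std::string, std::set<std::string>>;

// Adds every element of src to dst; returns true if dst grew.
static bool addAll(std::set<std::string> &dst,
                   const std::set<std::string> &src) {
  if (&dst == &src) return false;  // a set never grows from itself
  std::size_t before = dst.size();
  dst.insert(src.begin(), src.end());
  return dst.size() != before;
}

// Naive fixpoint: re-apply every constraint until nothing changes.
PointsTo solve(const std::vector<Constraint> &constraints) {
  PointsTo pts;
  bool changed = true;
  while (changed) {
    changed = false;
    for (const Constraint &c : constraints) {
      switch (c.kind) {
      case Kind::AddrOf:  // p = &x: x is in pts(p)
        changed |= pts[c.lhs].insert(c.rhs).second;
        break;
      case Kind::Copy:  // p = q: pts(p) includes pts(q)
        changed |= addAll(pts[c.lhs], pts[c.rhs]);
        break;
      case Kind::Load: {  // p = *q: pts(p) includes pts(o), o in pts(q)
        std::set<std::string> targets = pts[c.rhs];  // copy before mutating
        for (const std::string &o : targets)
          changed |= addAll(pts[c.lhs], pts[o]);
        break;
      }
      case Kind::Store: {  // *p = q: pts(o) includes pts(q), o in pts(p)
        std::set<std::string> targets = pts[c.lhs];  // copy before mutating
        for (const std::string &o : targets)
          changed |= addAll(pts[o], pts[c.rhs]);
        break;
      }
      }
    }
  }
  return pts;
}
```

For the constraints p = &x, q = &y, *p = q, r = *p, the solver records that x holds a pointer to y and therefore that r may point to y. A production analysis would replace the naive loop with a worklist over a constraint graph.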
Step 4: Add CMakeLists.txt
Create lib/Alias/MyAA/CMakeLists.txt:
set(SOURCES
MyAliasAnalysis.cpp
)
add_library(LotusMyAA STATIC ${SOURCES})
target_include_directories(LotusMyAA PUBLIC
${CMAKE_SOURCE_DIR}/include
${LLVM_INCLUDE_DIRS}
)
target_link_libraries(LotusMyAA
LLVMCore
LLVMSupport
LLVMAnalysis
)
Update lib/Alias/CMakeLists.txt:
add_subdirectory(MyAA)
Step 5: Create Command-Line Tool
Create tools/alias/my-aa.cpp:
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/Support/CommandLine.h"
#include "Alias/MyAA/MyAliasAnalysis.h"
using namespace llvm;
cl::opt<std::string> InputFilename(cl::Positional,
cl::desc("<input bitcode>"),
cl::Required);
cl::opt<bool> Verbose("verbose",
cl::desc("Print detailed output"),
cl::init(false));
int main(int argc, char **argv) {
cl::ParseCommandLineOptions(argc, argv, "My Alias Analysis Tool\n");
LLVMContext context;
SMDiagnostic error;
// Load module
std::unique_ptr<Module> module = parseIRFile(InputFilename, error, context);
if (!module) {
error.print(argv[0], errs());
return 1;
}
// Run analysis
legacy::PassManager PM;
PM.add(new myaa::MyAliasAnalysis());
PM.run(*module);
errs() << "Analysis completed successfully\n";
return 0;
}
Update tools/alias/CMakeLists.txt:
add_executable(my-aa my-aa.cpp)
target_link_libraries(my-aa
LotusMyAA
${LLVM_LIBS}
)
install(TARGETS my-aa DESTINATION bin)
Step 6: Test Your Analysis
Create tests/alias/test_my_aa.cpp:
int main() {
int x = 10;
int y = 20;
int *p = &x;
int *q = &y;
// p and q should not alias
return 0;
}
Test:
clang -emit-llvm -c tests/alias/test_my_aa.cpp -o test.bc
./build/bin/my-aa test.bc
Adding a New Bug Checker
Step 1: Define Checker Interface
Create include/Checker/MyChecker.h:
#pragma once
#include "llvm/Pass.h"
#include "llvm/IR/Module.h"
#include "Checker/Report/BugReportMgr.h"
#include <vector>
namespace lotus {
/**
* @brief Checker for [describe bug type]
*/
class MyChecker : public llvm::ModulePass {
public:
static char ID;
MyChecker() : ModulePass(ID) {}
bool runOnModule(llvm::Module &M) override;
llvm::StringRef getPassName() const override {
return "My Bug Checker";
}
private:
/// Detected bugs
std::vector<BugReport> bugs_;
/// Check a single function
void checkFunction(llvm::Function &F);
/// Report a bug
void reportBug(llvm::Instruction *I, const std::string &description);
};
} // namespace lotus
Step 2: Implement Checker
Create lib/Checker/MyChecker.cpp:
#include "Checker/MyChecker.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;
using namespace lotus;
char MyChecker::ID = 0;
bool MyChecker::runOnModule(Module &M) {
bugs_.clear();
for (auto &F : M) {
if (F.isDeclaration()) continue;
checkFunction(F);
}
// Report all bugs
errs() << "Found " << bugs_.size() << " bugs\n";
for (auto &bug : bugs_) {
errs() << "Bug at " << bug.file << ":" << bug.line << ": "
<< bug.description << "\n";
}
return false;
}
void MyChecker::checkFunction(Function &F) {
for (auto &BB : F) {
for (auto &I : BB) {
// Example: check for specific bug patterns
if (auto *call = dyn_cast<CallInst>(&I)) {
Function *callee = call->getCalledFunction();
if (callee && callee->getName() == "dangerous_function") {
reportBug(call, "Call to dangerous function");
}
}
}
}
}
void MyChecker::reportBug(Instruction *I, const std::string &description) {
BugReport bug;
bug.type = "MyBugType";
bug.instruction = I;
bug.description = description;
// Get source location if available
if (auto *DL = I->getDebugLoc().get()) {
bug.file = DL->getFilename().str();
bug.line = DL->getLine();
}
bugs_.push_back(bug);
}
static RegisterPass<MyChecker> X("my-checker", "My Bug Checker");
Step 3: Integrate with BugReportMgr
For SARIF/JSON output support:
#include "Checker/Report/BugReportMgr.h"
bool MyChecker::runOnModule(Module &M) {
// ... run checks ...
// Report to centralized manager
BugReportMgr &mgr = BugReportMgr::getInstance();
for (auto &bug : bugs_) {
mgr.addBug("MyChecker", bug);
}
return false;
}
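The call to BugReportMgr::getInstance() suggests a process-wide singleton that collects reports from every checker. The sketch below shows that pattern in isolation. ReportManager, Report, countFor, and format are hypothetical names chosen for this example; the real BugReportMgr interface may differ, so check Checker/Report/BugReportMgr.h before relying on any of it.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical report record; field names are illustrative only.
struct Report {
  std::string file;
  unsigned line = 0;
  std::string description;
};

// Hypothetical stand-in for a centralized manager; not the Lotus API.
class ReportManager {
public:
  // Function-local static: constructed once, thread-safe since C++11.
  static ReportManager &getInstance() {
    static ReportManager instance;
    return instance;
  }

  // Stores a report under the name of the checker that produced it.
  void addBug(const std::string &checker, const Report &report) {
    m_reports[checker].push_back(report);
  }

  // Number of reports recorded for one checker (0 if none).
  std::size_t countFor(const std::string &checker) const {
    auto it = m_reports.find(checker);
    return it == m_reports.end() ? 0 : it->second.size();
  }

  // Formats one report as "file:line: description".
  static std::string format(const Report &r) {
    return r.file + ":" + std::to_string(r.line) + ": " + r.description;
  }

private:
  ReportManager() = default;
  ReportManager(const ReportManager &) = delete;
  ReportManager &operator=(const ReportManager &) = delete;

  std::map<std::string, std::vector<Report>> m_reports;
};
```

Keying the reports by checker name lets an output backend (SARIF, JSON) group results per checker without each checker knowing about the output format.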
Adding a New Abstract Domain
For CLAM integration, create a new CRAB domain:
Step 1: Define Domain Interface
Create lib/Verification/clam/crab/domains/my_domain.hpp:
#pragma once
#include "crab/domains/abstract_domain.hpp"
#include <map>
namespace crab {
namespace domains {
template<typename Number, typename VariableName>
class my_domain : public abstract_domain<Number, VariableName> {
public:
using variable_t = VariableName;
using number_t = Number;
// Required lattice operations
my_domain() : is_bottom_(false) {}
static my_domain top() {
return my_domain(); // Default state is top: not bottom, no constraints
}
static my_domain bottom() {
my_domain d;
d.is_bottom_ = true;
return d;
}
bool is_bottom() const { return is_bottom_; }
bool is_top() const { return !is_bottom_ && state_.empty(); }
// Join (least upper bound)
void operator|=(const my_domain &other) {
if (is_bottom()) {
*this = other;
} else if (other.is_bottom()) {
return;
} else {
// Implement join logic
}
}
// Meet (greatest lower bound)
void operator&=(const my_domain &other) {
// Implement meet logic
}
// Widening
my_domain operator||(const my_domain &other) {
// Implement widening for convergence
return *this;
}
// Transfer functions
void assign(variable_t x, Number n) {
// x := n
state_[x] = n;
}
void assign(variable_t x, variable_t y) {
// x := y
if (state_.count(y)) {
state_[x] = state_[y];
} else {
// y is unconstrained, so x becomes unconstrained too
state_.erase(x);
}
}
void apply(operation_t op, variable_t x, variable_t y, variable_t z) {
// x := y op z
// Implement operation semantics
}
void operator+=(constraint_t cst) {
// Add constraint
}
// Queries
bool entails(constraint_t cst) const {
// Check if domain entails constraint
return false;
}
void write(std::ostream &o) const {
if (is_bottom()) {
o << "_|_";
} else if (is_top()) {
o << "⊤";
} else {
o << "{";
for (const auto &entry : state_) { // C++14-compatible; no structured bindings
o << entry.first << "=" << entry.second << " ";
}
o << "}";
}
}
private:
bool is_bottom_;
std::map<variable_t, number_t> state_;
};
} // namespace domains
} // namespace crab
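The join and meet bodies in the skeleton are placeholders. As a reference point for filling them in, here is a minimal sketch of the lattice operations for a single variable in a flat constant-propagation lattice. ConstValue and its methods are hypothetical and independent of the CRAB interfaces; a real domain would apply these operations per variable across its whole state map.

```cpp
// Flat lattice for one variable: Bottom < every constant < Top.
class ConstValue {
  enum class Kind { Bottom, Const, Top };

public:
  static ConstValue top() { return ConstValue(Kind::Top, 0); }
  static ConstValue bottom() { return ConstValue(Kind::Bottom, 0); }
  static ConstValue constant(long n) { return ConstValue(Kind::Const, n); }

  bool isTop() const { return m_kind == Kind::Top; }
  bool isBottom() const { return m_kind == Kind::Bottom; }

  bool operator==(const ConstValue &o) const {
    return m_kind == o.m_kind &&
           (m_kind != Kind::Const || m_value == o.m_value);
  }

  // Partial order: *this is at least as precise as o.
  bool leq(const ConstValue &o) const {
    return isBottom() || o.isTop() || *this == o;
  }

  // Join (least upper bound): two different constants go to Top.
  ConstValue join(const ConstValue &o) const {
    if (isBottom()) return o;
    if (o.isBottom()) return *this;
    return *this == o ? *this : top();
  }

  // Meet (greatest lower bound): two different constants go to Bottom.
  ConstValue meet(const ConstValue &o) const {
    if (isTop()) return o;
    if (o.isTop()) return *this;
    return *this == o ? *this : bottom();
  }

private:
  ConstValue(Kind k, long v) : m_kind(k), m_value(v) {}
  Kind m_kind;
  long m_value;
};
```

Because this lattice has finite height, join already guarantees termination and no separate widening is needed; domains with infinite ascending chains, such as intervals, do need one.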
Step 2: Register Domain
Update lib/Verification/clam/ClamDomainRegistry.cc:
#include "crab/domains/my_domain.hpp"
// In domain factory
if (dom_name == "mydomain") {
return std::make_unique<my_domain<number_t, varname_t>>();
}
Step 3: Add Command-Line Option
Update tools/verifier/clam/clam.cc:
// Add to domain options
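// The option's value type must be the enum that declares INTERVALS and
// MY_DOMAIN; DomainKind is a placeholder name for that enum.
cl::opt<DomainKind> CrabDomain(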
"crab-dom",
cl::desc("Abstract domain"),
cl::values(
clEnumValN(INTERVALS, "int", "Classical interval domain"),
clEnumValN(MY_DOMAIN, "mydomain", "My custom domain"),
// ...
),
cl::init(INTERVALS));
Adding a Data Flow Analysis
Using the IFDS/IDE framework:
Step 1: Define Analysis Problem
Create lib/Dataflow/MyDataFlow/MyAnalysis.h:
#pragma once
#include "Dataflow/IFDS/IFDSProblem.h"
namespace lotus {
/**
* @brief My custom data flow analysis
*/
class MyAnalysis : public IFDSProblem<llvm::Value*, llvm::Function*> {
public:
using Fact = llvm::Value*;
using Context = llvm::Function*;
MyAnalysis(llvm::Module &M) : IFDSProblem(M) {}
// Initial facts (sources)
std::set<Fact> initialSeeds() override;
// Normal flow within basic blocks
FlowFunctionPtr getNormalFlowFunction(
llvm::Instruction *curr,
llvm::Instruction *succ) override;
// Call flow (caller -> callee)
FlowFunctionPtr getCallFlowFunction(
llvm::Instruction *callSite,
llvm::Function *destFun) override;
// Return flow (callee -> caller)
FlowFunctionPtr getReturnFlowFunction(
llvm::Instruction *callSite,
llvm::Function *calleeFun,
llvm::Instruction *exitInst) override;
// Call-to-return flow (bypass call)
FlowFunctionPtr getCallToReturnFlowFunction(
llvm::Instruction *callSite,
llvm::Instruction *returnSite) override;
};
} // namespace lotus
Step 2: Implement Flow Functions
Create lib/Dataflow/MyDataFlow/MyAnalysis.cpp:
#include "Dataflow/MyDataFlow/MyAnalysis.h"
#include "Dataflow/IFDS/FlowFunctions.h"
using namespace llvm;
using namespace lotus;
std::set<Value*> MyAnalysis::initialSeeds() {
std::set<Value*> seeds;
// Add initial facts (e.g., function parameters, globals)
for (auto &F : module) {
if (F.getName() == "source_function") {
// Return value is a source
seeds.insert(&F);
}
}
return seeds;
}
FlowFunctionPtr MyAnalysis::getNormalFlowFunction(
Instruction *curr, Instruction *succ) {
// Identity flow by default
if (!affectsAnalysis(curr)) {
return std::make_shared<Identity<Fact>>();
}
// Generate new facts
if (auto *alloca = dyn_cast<AllocaInst>(curr)) {
return std::make_shared<Gen<Fact>>(alloca);
}
// Kill facts
if (isSanitizer(curr)) {
return std::make_shared<KillAll<Fact>>();
}
// Conditional generation
return std::make_shared<ConditionalGen<Fact>>(curr,
[](Fact f) { return shouldPropagate(f); });
}
FlowFunctionPtr MyAnalysis::getCallFlowFunction(
Instruction *callSite, Function *destFun) {
// Map actual parameters to formals
auto *call = cast<CallInst>(callSite);
return std::make_shared<MapParametersFlowFunction>(
call, destFun);
}
FlowFunctionPtr MyAnalysis::getReturnFlowFunction(
Instruction *callSite, Function *calleeFun,
Instruction *exitInst) {
// Map return value to call site
if (auto *ret = dyn_cast<ReturnInst>(exitInst)) {
return std::make_shared<MapReturnFlowFunction>(
callSite, ret);
}
return std::make_shared<Identity<Fact>>();
}
FlowFunctionPtr MyAnalysis::getCallToReturnFlowFunction(
Instruction *callSite, Instruction *returnSite) {
// Facts the callee cannot affect bypass the call unchanged
return std::make_shared<Identity<Fact>>();
}
Step 3: Create Solver and Run
#include "Dataflow/IFDS/IFDSSolver.h"
#include "Dataflow/MyDataFlow/MyAnalysis.h"
// Create problem
MyAnalysis problem(module);
// Create solver
IFDSSolver<Value*, Function*> solver(problem);
// Solve
solver.solve();
// Query results
for (auto &F : module) {
for (auto &BB : F) {
for (auto &I : BB) {
auto facts = solver.ifdsResultsAt(&I);
// Process facts...
}
}
}
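To build intuition for what the flow functions and the solver do, the sketch below runs identity, gen, and kill functions over a tiny control-flow graph with a plain worklist. This is a simplified intraprocedural may-analysis, not the IFDS tabulation algorithm and not the Lotus API: FlowFunction, Node, makeGen, makeKill, makeIdentity, and solveForward are all hypothetical names, and facts are plain strings.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <vector>

using Fact = std::string;
using FactSet = std::set<Fact>;
using FlowFunction = std::function<FactSet(const FactSet &)>;

// Identity: facts pass through unchanged.
FlowFunction makeIdentity() {
  return [](const FactSet &in) { return in; };
}

// Gen: adds one fact to whatever flows in.
FlowFunction makeGen(Fact f) {
  return [f](const FactSet &in) {
    FactSet out = in;
    out.insert(f);
    return out;
  };
}

// Kill: removes one fact from whatever flows in.
FlowFunction makeKill(Fact f) {
  return [f](const FactSet &in) {
    FactSet out = in;
    out.erase(f);
    return out;
  };
}

struct Node {
  FlowFunction flow;               // transfer function of this node
  std::vector<std::size_t> succs;  // indices of successor nodes
};

// Forward may-analysis: a node's input is the union of its predecessors'
// outputs; nodes are re-processed until no output changes.
std::vector<FactSet> solveForward(const std::vector<Node> &cfg,
                                  std::size_t entry) {
  std::vector<FactSet> in(cfg.size()), out(cfg.size());
  std::vector<bool> visited(cfg.size(), false);
  std::deque<std::size_t> worklist{entry};
  while (!worklist.empty()) {
    std::size_t n = worklist.front();
    worklist.pop_front();
    FactSet newOut = cfg[n].flow(in[n]);
    bool first = !visited[n];
    visited[n] = true;
    if (!first && newOut == out[n]) continue;  // nothing new to propagate
    out[n] = newOut;
    for (std::size_t s : cfg[n].succs) {
      in[s].insert(out[n].begin(), out[n].end());
      worklist.push_back(s);
    }
  }
  return out;
}
```

With a graph where node 0 generates "tainted", node 2 kills it, and node 3 is reachable both through node 2 and around it, the fact is absent after node 2 but still present at node 3, because a may-analysis keeps any fact that holds on at least one path.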
Testing and Debugging
Writing Unit Tests
Create tests/unit/test_my_analysis.cpp:
#include "gtest/gtest.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/IRBuilder.h"
#include "Alias/MyAA/MyAliasAnalysis.h"
class MyAnalysisTest : public ::testing::Test {
protected:
llvm::LLVMContext context;
std::unique_ptr<llvm::Module> module;
void SetUp() override {
module = std::make_unique<llvm::Module>("test", context);
}
};
TEST_F(MyAnalysisTest, BasicAliasQuery) {
// Create test IR: a void function with two distinct stack slots
auto *funcType = llvm::FunctionType::get(
llvm::Type::getVoidTy(context), false);
auto *func = llvm::Function::Create(
funcType, llvm::Function::ExternalLinkage, "test", module.get());
auto *bb = llvm::BasicBlock::Create(context, "entry", func);
llvm::IRBuilder<> builder(bb);
auto *val1 = builder.CreateAlloca(builder.getInt32Ty(), nullptr, "x");
auto *val2 = builder.CreateAlloca(builder.getInt32Ty(), nullptr, "y");
builder.CreateRetVoid();
// Run analysis
myaa::MyAliasAnalysis aa;
aa.runOnModule(*module);
// Check results: distinct allocas have disjoint points-to sets
auto result = aa.alias(val1, val2);
EXPECT_EQ(result, llvm::AliasResult::NoAlias);
}
Run tests:
cd build
make test
Debugging Tips
1. Enable debug output:
#define DEBUG_TYPE "my-analysis"
#include "llvm/Support/Debug.h"
LLVM_DEBUG(dbgs() << "Processing instruction: " << *I << "\n");
Run with debug output (LLVM_DEBUG and -debug-only are only available when LLVM is built with assertions enabled):
./my-tool -debug-only=my-analysis input.bc
2. Dump intermediate results:
errs() << "Points-to set for " << *ptr << ":\n";
for (auto *obj : pts) {
errs() << " - " << *obj << "\n";
}
3. Use LLVM’s verification passes:
PM.add(createVerifierPass()); // Check IR validity
4. GDB debugging:
gdb --args ./my-tool input.bc
(gdb) break MyAliasAnalysis::runOnModule
(gdb) run
Performance Optimization
Profiling
Use LLVM’s time-passes:
./my-tool -time-passes input.bc
Optimization Strategies
Use Sparse Data Structures: llvm::SparseBitVector for points-to sets (llvm::BitVector is dense)
Cache Results: Memoize expensive computations
Early Termination: Detect fixpoints early
SCC Collapsing: Merge strongly connected components
Incremental Analysis: Reuse results across runs
Example caching:
class CachedAnalysis {
std::map<Value*, PointsToSet> cache_;
public:
const PointsToSet& getPointsTo(Value *v) {
auto it = cache_.find(v);
if (it != cache_.end()) {
return it->second; // Return cached result
}
// Compute once and cache; emplace avoids a second lookup
return cache_.emplace(v, computePointsTo(v)).first->second;
}
};
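Early termination depends on knowing cheaply whether a propagation step changed anything. One way to get that, sketched below under stated assumptions, is a compact points-to set whose union operation reports whether the set grew. This PointsToSet is a hypothetical stand-in built on a sorted std::vector of object ids; it is not the type Lotus uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical compact points-to set: a sorted vector of object ids.
class PointsToSet {
public:
  // Inserts one id, keeping the vector sorted; returns true if it was new.
  bool insert(unsigned id) {
    auto it = std::lower_bound(m_ids.begin(), m_ids.end(), id);
    if (it != m_ids.end() && *it == id) return false;
    m_ids.insert(it, id);
    return true;
  }

  // Merges other into this set; returns true only if this set grew.
  bool unionWith(const PointsToSet &other) {
    std::vector<unsigned> merged;
    merged.reserve(m_ids.size() + other.m_ids.size());
    std::set_union(m_ids.begin(), m_ids.end(), other.m_ids.begin(),
                   other.m_ids.end(), std::back_inserter(merged));
    bool changed = merged.size() != m_ids.size();
    m_ids.swap(merged);
    return changed;
  }

  // Membership test by binary search over the sorted ids.
  bool contains(unsigned id) const {
    return std::binary_search(m_ids.begin(), m_ids.end(), id);
  }

  std::size_t size() const { return m_ids.size(); }

private:
  std::vector<unsigned> m_ids;
};
```

A solver can OR the results of unionWith across one pass and stop as soon as a full pass returns false everywhere, which is exactly the fixpoint condition.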
Contributing to Lotus
Contribution Workflow
Fork the repository on GitHub
Create a feature branch:
git checkout -b my-feature
Implement your changes with tests
Document your changes in relevant docs
Test thoroughly:
make test
Commit with clear messages:
git commit -m "Add my feature"
Push:
git push origin my-feature
Create a pull request on GitHub
Code Review Process
All code must pass CI tests
At least one approval from maintainers required
Follow coding standards
Add documentation and tests
Keep PRs focused and manageable
Documentation Updates
When adding new features:
Update relevant .rst files in docs/source/
Add API documentation in api_reference.rst
Add tutorial in tutorials.rst if applicable
Update TOOLS.md for new command-line tools
Build documentation:
cd docs
make html
# View at docs/build/html/index.html
Common Pitfalls
Memory Management
Problem: LLVM uses custom allocators and memory management.
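Solution: Create IR objects through their static Create() factories. Passing a parent (here func) inserts the new object into it, and the parent then owns it: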
// Don't do this:
BasicBlock *bb = new BasicBlock(...);
// Do this:
BasicBlock *bb = BasicBlock::Create(context, "name", func);
Problem: Dangling pointers after IR modification.
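Solution: Redirect all users with Value::replaceAllUsesWith() before erasing a value, and create replacement IR through IRBuilder or the Create() factories rather than patching operands by hand.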
Pass Dependencies
Problem: Analysis depends on another pass but doesn’t declare it.
Solution: Declare dependencies in getAnalysisUsage():
void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<DominatorTreeWrapperPass>();
AU.addRequired<DyckAliasAnalysis>();
AU.setPreservesAll();
}
Fixpoint Convergence
Problem: Analysis doesn’t converge (infinite loop).
Solution: Implement proper widening and set iteration limit:
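const int MAX_ITERATIONS = 100;
int iter = 0;
bool changed = true;
while (changed && iter < MAX_ITERATIONS) {
changed = false;
// Analysis logic: set changed = true whenever any fact is updated
iter++;
}
if (changed) {
// The limit was hit while facts were still changing
errs() << "Warning: Analysis did not converge\n";
}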
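An iteration limit only reports non-convergence; widening is what actually forces convergence on domains with infinite ascending chains. The sketch below applies the standard interval widening to the loop head of `i = 0; while (...) i = i + 1;`. Interval, widen, and analyzeCounterLoop are hypothetical names for this example, and the numeric limits of long stand in for infinity.

```cpp
#include <algorithm>
#include <limits>

// Closed integer interval [lo, hi]; numeric limits stand in for infinity.
struct Interval {
  long lo;
  long hi;
  bool operator==(const Interval &o) const {
    return lo == o.lo && hi == o.hi;
  }
};

const long NEG_INF = std::numeric_limits<long>::min();
const long POS_INF = std::numeric_limits<long>::max();

// Join: smallest interval containing both arguments.
Interval join(const Interval &a, const Interval &b) {
  return {std::min(a.lo, b.lo), std::max(a.hi, b.hi)};
}

// Widening: any bound that is still moving jumps straight to infinity.
Interval widen(const Interval &prev, const Interval &next) {
  return {next.lo < prev.lo ? NEG_INF : prev.lo,
          next.hi > prev.hi ? POS_INF : prev.hi};
}

// Abstractly executes `i = 0; while (...) i = i + 1;` at the loop head.
// Returns the iteration at which the head stabilized, or maxIterations + 1
// if it was still changing when the limit was reached.
int analyzeCounterLoop(bool useWidening, int maxIterations, Interval &result) {
  Interval head = {0, 0};
  for (int iter = 1; iter <= maxIterations; ++iter) {
    // Effect of the loop body on the current head state (saturating add).
    Interval body = {head.lo + 1, head.hi == POS_INF ? POS_INF : head.hi + 1};
    Interval merged = join(head, body);
    Interval next = useWidening ? widen(head, merged) : merged;
    if (next == head) {
      result = head;
      return iter;
    }
    head = next;
  }
  result = head;
  return maxIterations + 1;
}
```

With widening the head stabilizes at [0, +inf] on the second iteration. Without it the upper bound grows by one per iteration, and the analysis hits the limit with [0, 100] after 100 iterations.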
Resources
LLVM Documentation: https://llvm.org/docs/
LLVM Programmer’s Manual: https://llvm.org/docs/ProgrammersManual.html
Writing LLVM Pass: https://llvm.org/docs/WritingAnLLVMPass.html
Lotus Repository: https://github.com/ZJU-PL/lotus
Issue Tracker: https://github.com/ZJU-PL/lotus/issues
Community
GitHub Discussions: For questions and discussions
Issue Tracker: For bug reports and feature requests
Pull Requests: For code contributions
See Also
Architecture Overview - Framework architecture
API Reference - API documentation
Tutorials and Examples - Usage examples