Bug Detection with Lotus ========================= Lotus provides comprehensive bug detection capabilities for finding security vulnerabilities and safety issues in C/C++ programs. Overview -------- Lotus includes multiple bug detection tools: 1. **Kint**: Integer-related bugs (overflow, division by zero, bad shift, array bounds) 2. **Taint Analysis**: Information flow and injection vulnerabilities 3. **Concurrency Checker**: Race conditions, deadlocks, OpenMP misuse, and MPI protocol bugs Bug Categories -------------- Memory Safety ~~~~~~~~~~~~~ **Null Pointer Dereference**: .. code-block:: c Person *p = create_person(-1); // May return NULL printf("%s\n", p->name); // Bug: potential NPD **Use-After-Free**: .. code-block:: c int *ptr = malloc(sizeof(int)); free(ptr); *ptr = 42; // Bug: use after free **Buffer Overflow**: .. code-block:: c char buffer[10]; strcpy(buffer, user_input); // Bug: may overflow Integer Safety ~~~~~~~~~~~~~~ **Integer Overflow**: .. code-block:: c int x = INT_MAX; int y = x + 1; // Bug: signed overflow **Division by Zero**: .. code-block:: c int result = x / y; // Bug if y == 0 **Bad Shift**: .. code-block:: c int result = x << 32; // Bug: shift amount >= width Information Flow ~~~~~~~~~~~~~~~~ **Tainted Data Flow**: .. code-block:: c char cmd[256]; scanf("%s", cmd); // Tainted source system(cmd); // Bug: unsanitized input to sink **Information Leakage**: .. code-block:: c char *secret = getPassword(); sendToNetwork(secret); // Bug: confidential data leaked Concurrency ~~~~~~~~~~~ **Data Race**: .. code-block:: c int shared_counter; // Global void thread_func() { shared_counter++; // Bug: unprotected access } **Deadlock**: .. code-block:: c pthread_mutex_lock(&lock1); pthread_mutex_lock(&lock2); // Bug: lock ordering issue Tool 1: Kint - Integer and Array Analysis ------------------------------------------ Kint detects integer-related and taint-style bugs using range analysis and SMT solving. Capabilities ~~~~~~~~~~~~ 1. **Integer Overflow Detection**: Signed and unsigned overflow 2. **Division by Zero**: Detect potential divisions by zero 3. **Bad Shift**: Invalid shift amounts 4. **Array Out of Bounds**: Buffer overflows and underflows 5. **Dead Branch**: Impossible conditional branches Basic Usage ~~~~~~~~~~~ .. code-block:: bash # Enable all checkers ./build/bin/lotus-kint -check-all program.ll # Enable specific checkers ./build/bin/lotus-kint -check-int-overflow -check-div-by-zero program.ll # Set timeout for slow functions ./build/bin/lotus-kint -check-all -function-timeout=60 program.ll Example 1: Integer Overflow ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Vulnerable Code** (``overflow.c``): .. code-block:: c #include int calculate_size(int count, int item_size) { return count * item_size; // Potential overflow } int main() { int size = calculate_size(1000000, 1000); // Overflows char *buffer = malloc(size); // Size is wrong due to overflow return 0; } **Detection**: .. code-block:: bash clang -emit-llvm -S -g overflow.c -o overflow.ll ./build/bin/lotus-kint -check-int-overflow overflow.ll **Expected Output**: .. code-block:: text [Integer Overflow] Function: calculate_size Location: overflow.c:4 Instruction: %mul = mul nsw i32 %count, %item_size Reason: Multiplication may overflow Range: count ∈ [-∞, +∞], item_size ∈ [-∞, +∞] **Fix**: .. code-block:: c #include #include int calculate_size(int count, int item_size) { // Check for overflow before multiplication if (count > 0 && item_size > INT_MAX / count) { return -1; // Error: would overflow } return count * item_size; } Example 2: Array Out of Bounds ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Vulnerable Code** (``buffer.c``): .. code-block:: c void process_array(int *arr, int index, int value) { arr[index] = value; // No bounds check } int main() { int buffer[10]; int user_index; scanf("%d", &user_index); process_array(buffer, user_index, 42); // Bug if index >= 10 return 0; } **Detection**: .. code-block:: bash clang -emit-llvm -S -g buffer.c -o buffer.ll ./build/bin/lotus-kint -check-array-oob buffer.ll **Expected Output**: .. code-block:: text [Array Out of Bounds] Function: process_array Location: buffer.c:2 Array size: 10 elements Index range: [-∞, +∞] Potential overflow: index may be >= 10 **Fix**: .. code-block:: c void process_array(int *arr, int size, int index, int value) { if (index < 0 || index >= size) { return; // Invalid index } arr[index] = value; } Example 3: Division by Zero ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Vulnerable Code** (``division.c``): .. code-block:: c int calculate_average(int total, int count) { return total / count; // Bug if count == 0 } int main() { int avg1 = calculate_average(100, 10); // OK int avg2 = calculate_average(100, 0); // Bug! return 0; } **Detection**: .. code-block:: bash clang -emit-llvm -S -g division.c -o division.ll ./build/bin/lotus-kint -check-div-by-zero division.ll **Fix**: .. code-block:: c int calculate_average(int total, int count) { if (count == 0) { return 0; // Handle zero case } return total / count; } Tool 2: Taint Analysis ----------------------- Interprocedural taint analysis using the IFDS framework to track information flow. Capabilities ~~~~~~~~~~~~ 1. **Source Tracking**: Identify tainted data sources 2. **Sink Detection**: Find dangerous operations 3. **Path Finding**: Compute flows from sources to sinks 4. **Custom Sources/Sinks**: User-defined taint specifications Basic Usage ~~~~~~~~~~~ .. code-block:: bash # Basic taint analysis ./build/bin/lotus-taint program.bc # Custom sources and sinks ./build/bin/lotus-taint -sources="read,scanf,recv" \ -sinks="system,exec,printf" \ program.bc # Verbose output ./build/bin/lotus-taint -verbose -max-results=20 program.bc Example 1: Command Injection ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ **Vulnerable Code** (``cmd_injection.c``): .. code-block:: c #include #include #include void execute_user_command(const char *user_input) { char command[256]; sprintf(command, "cat %s", user_input); system(command); // Bug: command injection } char *sanitize(const char *input) { // Remove dangerous characters char *output = strdup(input); for (int i = 0; output[i]; i++) { if (output[i] == ';' || output[i] == '|' || output[i] == '&') { output[i] = '_'; } } return output; } int main() { char user_file[256]; printf("Enter filename: "); scanf("%255s", user_file); // Source of tainted data // Vulnerable: direct use execute_user_command(user_file); // Safe: sanitized char *safe_file = sanitize(user_file); execute_user_command(safe_file); free(safe_file); return 0; } **Detection**: .. code-block:: bash clang -emit-llvm -c -g cmd_injection.c -o cmd_injection.bc ./build/bin/lotus-taint -verbose cmd_injection.bc **Expected Output**: .. code-block:: text [Taint Flow Detected] Source: scanf at cmd_injection.c:24 Sink: system at cmd_injection.c:8 (via execute_user_command) Flow path: 1. scanf("%255s", user_file) - line 24 2. user_file is tainted 3. execute_user_command(user_file) - line 27 4. sprintf(command, "cat %s", user_input) - line 7 5. system(command) - line 8 Vulnerability: User input flows to system() without sanitization **Fix**: Always sanitize user input or use parameterized APIs: .. code-block:: c void execute_user_command(const char *user_input) { // Validate input if (strchr(user_input, ';') || strchr(user_input, '|')) { fprintf(stderr, "Invalid filename\n"); return; } // Use safe API execlp("cat", "cat", user_input, NULL); } Example 2: SQL Injection ~~~~~~~~~~~~~~~~~~~~~~~~ **Vulnerable Code** (``sql_injection.c``): .. code-block:: c void search_user(sqlite3 *db, const char *username) { char query[512]; sprintf(query, "SELECT * FROM users WHERE name='%s'", username); sqlite3_exec(db, query, callback, 0, &err); // Bug: SQL injection } int main() { char input[256]; scanf("%s", input); // Tainted search_user(db, input); // Bug return 0; } **Detection**: .. code-block:: bash ./build/bin/lotus-taint -sources="scanf" -sinks="sqlite3_exec" sql_injection.bc **Fix**: Use parameterized queries: .. code-block:: c void search_user(sqlite3 *db, const char *username) { sqlite3_stmt *stmt; const char *query = "SELECT * FROM users WHERE name=?"; sqlite3_prepare_v2(db, query, -1, &stmt, NULL); sqlite3_bind_text(stmt, 1, username, -1, SQLITE_TRANSIENT); sqlite3_step(stmt); sqlite3_finalize(stmt); } Tool 3: Concurrency Checker ---------------------------- Detects concurrency bugs in multi-threaded programs. Capabilities ~~~~~~~~~~~~ 1. **Data Race Detection**: Unprotected shared memory access 2. **Deadlock Detection**: Lock ordering issues 3. **Atomicity Violations**: Non-atomic operation sequences 4. **Lock/Unlock Pairing**: Mismatched lock operations 5. **OpenMP Checks**: Taskgroup/atomic region mismatches and partial task synchronization 6. **MPI Checks**: Request lifecycle bugs, collective mismatches, blocking deadlocks, and RMA issues Basic Usage ~~~~~~~~~~~ .. code-block:: bash ./build/bin/lotus-concur program.bc ./build/bin/lotus-concur -verbose program.bc ./build/bin/lotus-concur --checks=openmp,mpi program.bc OpenMP and MPI Examples ~~~~~~~~~~~~~~~~~~~~~~~ The concurrency checker also includes dedicated runtime-aware checks for OpenMP and MPI programs. **OpenMP examples**: * unmatched ``__kmpc_taskgroup`` / ``__kmpc_end_taskgroup`` pairs * unmatched ``__kmpc_atomic_start`` / ``__kmpc_atomic_end`` pairs * selective ``__kmpc_omp_wait_deps`` synchronization that may not fully order sibling tasks **MPI examples**: * ``MPI_Isend`` / ``MPI_Irecv`` requests that are never completed * mismatched or rank-conditional collectives * simple blocking send/recv deadlock cycles * unsynchronized RMA operations or leaked windows For MPI/OpenMP-heavy codebases, a useful first pass is: .. code-block:: bash ./build/bin/lotus-concur --checks=openmp,mpi --report-json=parallel.json program.bc Example: Data Race ~~~~~~~~~~~~~~~~~~ **Vulnerable Code** (``race.c``): .. code-block:: c #include int shared_counter = 0; pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; void *thread1(void *arg) { for (int i = 0; i < 1000; i++) { shared_counter++; // Bug: unprotected } return NULL; } void *thread2(void *arg) { for (int i = 0; i < 1000; i++) { pthread_mutex_lock(&lock); shared_counter++; // Protected pthread_mutex_unlock(&lock); } return NULL; } int main() { pthread_t t1, t2; pthread_create(&t1, NULL, thread1, NULL); pthread_create(&t2, NULL, thread2, NULL); pthread_join(t1, NULL); pthread_join(t2, NULL); printf("Counter: %d\n", shared_counter); return 0; } **Detection**: .. code-block:: bash clang -emit-llvm -c -g -pthread race.c -o race.bc ./build/bin/lotus-concur -verbose race.bc **Expected Output**: .. code-block:: text [Data Race Detected] Variable: shared_counter (global) Thread 1 (thread1): - Location: race.c:8 - Operation: increment (write) - Lock status: UNPROTECTED Thread 2 (thread2): - Location: race.c:16 - Operation: increment (write) - Lock status: PROTECTED by lock Conflict: Both threads access shared_counter, but thread1 is unprotected **Fix**: .. code-block:: c void *thread1(void *arg) { for (int i = 0; i < 1000; i++) { pthread_mutex_lock(&lock); shared_counter++; // Now protected pthread_mutex_unlock(&lock); } return NULL; } Best Practices -------------- 1. **Start with Fast Checks**: .. code-block:: bash # Quick scan with Kint ./build/bin/lotus-kint -check-all program.ll 2. **Use Appropriate Checkers**: - Integer bugs → Kint - Information flow → Taint analysis - Concurrency → Concurrency checker 3. **Iterative Analysis**: - Fix obvious bugs first - Re-run with higher precision - Validate with dynamic analysis 4. **Combine with Testing**: - Use static analysis to find bugs - Write tests for found vulnerabilities - Verify fixes with both static and dynamic analysis 5. **Reduce False Positives**: - Add assertions to help analysis - Use more precise analyses - Validate with DynAA Output Formats -------------- Text Output ~~~~~~~~~~~ Default human-readable output: .. code-block:: bash ./build/bin/lotus-kint -check-all program.ll JSON Output ~~~~~~~~~~~ Machine-readable JSON for integration: .. code-block:: bash ./build/bin/lotus-kint -check-all -output-json=results.json program.ll SARIF Output ~~~~~~~~~~~~ Standard format for security tools: .. code-block:: bash ./build/bin/lotus-kint -check-all -output-sarif=results.sarif program.ll Integration Examples -------------------- CI/CD Pipeline ~~~~~~~~~~~~~~ .. code-block:: bash #!/bin/bash # Build program clang -emit-llvm -c -g source.c -o source.bc # Run checkers ./lotus-kint -check-all source.ll > kint_results.txt ./lotus-taint source.bc > taint_results.txt # Check for bugs if grep -q "Bug" *.txt; then echo "Bugs found!" exit 1 fi VS Code Integration ~~~~~~~~~~~~~~~~~~~ Use SARIF output with VS Code SARIF viewer extension. See Also -------- - :doc:`tutorials` - Hands-on examples - :doc:`troubleshooting` - Common issues - :doc:`../tools/checker` - Detailed tool documentation - :doc:`../developer/api_reference` - Programmatic usage