===========================================
Strict Relations Alias Analysis — Algorithm
===========================================

Overview
========

Strict Relations Alias Analysis (SRAA) is a pointer-alias analysis that
proves pointers *cannot* alias by establishing **strict (<, ≤) relations**
between symbolic numeric expressions that describe memory objects.
The technique combines:

* An SSI/e-SSA-style live-range–splitting transformation,
* A symbolic Range Analysis pass, and
* A constraint solver for strict inequalities.

* **Location**: ``lib/Alias/SRAA``

Highlights
==========

* Proves ``NoAlias`` results by reasoning about strict inequalities on pointer offsets
* Uses ``InterProceduralRA`` to infer value ranges and propagate constraints
* Tracks primitive layouts of aggregate types to compare sub-field accesses

Usage
=====

Register the ``StrictRelations`` pass (``-sraa``) inside an LLVM pass pipeline or via the Lotus AA wrapper.

This document focuses solely on the algorithmic workflow of SRAA as implemented in the artifact.

Inputs / Outputs / Invariants
-----------------------------

* **Inputs:** LLVM IR in SSA form, size metadata for abstract memory objects, target ABI layout rules.
* **Outputs:** Pairs of pointers classified as `NoAlias` (strictly proven) or `MayAlias`.
* **Invariants:** symbolic expressions remain affine, constraints are monotone, and transformations preserve IR semantics.

-------------------------------------
Algorithmic Pipeline (High-Level View)
-------------------------------------

The analysis is structured as a pipeline:

1. **Live-Range Splitting (vSSA)**
2. **Range Analysis (RA)**
3. **Strict-Inequality Generation**
4. **Constraint Propagation**
5. **Alias Disambiguation**
6. **(Optional) PDG node reduction**

Data products exchanged between stages:

* vSSA emits per-version def-use chains and symbolic expressions.
* RA outputs symbolic lower/upper bounds per versioned value.
* Inequality construction synthesizes strict constraints in canonical form.
* The solver produces disjointness facts and contradiction reports for diagnostics.

Each stage is described in detail below.

==============================
1. vSSA / SSI Transformation
==============================

SRAA requires precise symbolic information per program point.
Therefore, the implementation first transforms the IR into a variant of SSI
(Sparse Static Single Information) called **vSSA**.

The vSSA transformation performs:

1. **Splitting at control-flow joins**  
   Insert φ-like nodes for variables whose value-flow merges.

2. **Splitting at control-flow forks**  
   Insert σ-like nodes to preserve flow sensitivity of interval constraints.

3. **Splitting at uses of pointer-index expressions**  
   Ensures every symbolic expression has a single definition site.

Formally:

For each variable *x*:

* Insert φ where `x_in(B1)` ≠ `x_in(B2)`
* Insert σ at outgoing edges when `range(x)` changes on different successors
* Create a unique version `x_i` for each definition/merge/split

Implementation notes:

* vSSA runs after mem2reg and before alias analysis passes that expect canonical pointers.
* Synthetic definitions are tagged so later passes can ignore them when materializing code.
* The transformation keeps debug metadata and dominance info in sync, ensuring downstream passes receive well-formed IR.

This results in a def-use graph suitable for sparse constraint propagation.

===========================
2. Symbolic Range Analysis
===========================

The Range Analysis pass computes symbolic ranges of the form:

x ∈ [L(x), U(x)]

where `L(x)`, `U(x)` are *symbolic linear expressions* derived from:

* loop induction variables  
* pointer arithmetic (GEP)  
* branch conditions  
* integer arithmetic  
* comparisons

Example symbolic constraints inferred:

i_3 = i_2 + 1 ⇒ L(i_3) = L(i_2)+1, U(i_3) = U(i_2)+1
if (i < n) ⇒ U(i) ≤ U(n)-1


Propagation strategy:

1. Seed bounds using constants, known loop trip counts, and structure layout info.
2. Push constraints along vSSA def-use edges in topological order.
3. Relax bounds when a tighter symbolic expression is discovered.
4. Iterate until no bound changes (typically a small number of rounds because the graph is sparse).

The RA solver is **linear**, **sparse**, and propagates constraints until fixed point. It records justification edges so later stages can reconstruct the proof chain that produced every bound.

====================================
3. Strict-Inequality Construction
====================================

Given pointer expressions:


p = base + offset_p
q = base + offset_q


SRAA constructs strict inequalities such as:


offset_p + size_p ≤ offset_q
offset_q + size_q ≤ offset_p
offset_p < offset_q


These are derived from:

* value ranges from RA  
* control-flow–guarded conditions  
* loop bounds  
* comparisons in the IR (e.g., `if (i < j)` introduces `i < j` constraint)

Each inequality is represented in canonical form:

ax + by + ... < k

Canonicalization steps:

* Normalize coefficients so gcd is 1, making syntactic deduplication easier.
* Replace non-strict constraints generated by RA (≤) with equivalent strict forms when object sizes allow.
* Attach provenance (IR instruction, dominance frontier) for auditing and debugging.

The system contains only affine expressions.

================================
4. Constraint Propagation Solver
================================

SRAA maintains a strict-inequality constraint graph:

* Nodes: symbolic variables (offsets, indices, derived expressions)
* Edges: inequalities of form `x < y + c`

Algorithm:

1. Initialize graph with inequalities from Range Analysis.
2. For every edge `(x → y)` labeled `+c`, propagate:  
   `L(y) ≥ L(x) + c` and `U(x) ≤ U(y) - c`
3. When two expressions conflict:
   If `offset_p + size_p ≤ offset_q` is provably true, register a **disjointness fact**.
4. Detect contradictory cycles (negative-weight cycles in the negated system) and mark the originating constraints as infeasible for diagnostics.
5. Continue propagation until a fixed point.

Practical considerations:

* The solver caches shortest paths for repeated queries in interprocedural contexts.
* Worklists are partitioned per strongly connected component to reduce redundant relaxations.

This is a *Bellman–Ford–like relaxation* over a directed constraint graph,
but runs sparsely due to vSSA def-use structure.

=========================================
5. Alias Disambiguation (Final Decision)
=========================================

For two memory objects with computed offsets:

*Let:*

* P ranges over `[Lp, Up]`
* Q ranges over `[Lq, Uq]`
* And their sizes: `size(P)` and `size(Q)`

SRAA proves **NoAlias** if any of the following hold:

1. ::
       Up + size(P) ≤ Lq
2. ::
       Uq + size(Q) ≤ Lp
3. Strict relation holds::
       Up < Lq   OR   Uq < Lp

Example:

* `p = base + i * 4`, `0 ≤ i < n`
* `q = base + (n + j) * 4`, `0 ≤ j < m`
* `size(p)` = `size(q)` = 4

RA yields `Up = 4(n-1)` and `Lq = 4n`. Therefore `Up + size(p) = 4n ≤ Lq`, and SRAA reports `NoAlias`.

If none of the strict inequalities are provable, SRAA returns **MayAlias**.

Notes:

* Proofs use purely symbolic comparisons.
* No heap-shape or points-to analysis is required.
* SRAA is *context-insensitive* but *interprocedural*:
  formal-actual parameters are connected via synthetic φ-like merges.

============================================
6. PDG Memory-Node Reduction (Applicability)
============================================

For dependence-graph construction (as used in the paper’s applicability study):

If two memory objects are proven disjoint,
the PDG pass merges or eliminates corresponding memory nodes.

Optimization details:

* Proven `NoAlias` edges are annotated so later passes can skip building memory dependences entirely.
* When only one direction is proven (e.g., `p` before `q`), the PDG records a directed reduction, shrinking the transitive closure.

The result is fewer nodes and fewer dependence edges,
which improves precision in downstream analyses.

======================
Summary of Properties
======================

* **Sparse** due to vSSA/SSI transformation.
* **Linear** constraint solving on affine strict inequalities.
* **Provably sound** for proving disjointness (NoAlias).
* **Flow-sensitive** but **context-insensitive**.
* **Explainable**: every `NoAlias` result carries a justification chain for debugging.
* Produces **strict (<, ≤) disjointness proofs**, not approximate heuristics.
* Complements and composes with LLVM’s baseline alias analyses.

================
End of document.
================