Symbolic Abstraction
======================

Symbolic abstraction is a technique that computes the best abstract domain element
that overapproximates a given formula in first-order logic. It serves as a bridge between
concrete program analysis and efficient abstract reasoning, enabling scalable verification
and analysis of complex systems.

This page now subsumes the older `symabs` page and combines package-level
background with the verification-oriented `efmc --engine symabs` workflow.

The key idea is to transform a concrete logical formula into an abstract representation
that captures the essential behavior while being more amenable to automated analysis.
This abstraction preserves important properties (like satisfiability) while discarding
unnecessary details that would make analysis computationally expensive.


======
Overview
======

Aria's symbolic abstraction module (``aria/symabs``) provides a comprehensive suite
of abstraction techniques for program analysis and verification. The module is organized
around several key approaches:

1. **Abstract Interpretation-based Symbolic Abstraction** (``ai_symabs``)
2. **Optimization Modulo Theory-based Symbolic Abstraction** (``omt_symabs``)
3. **Predicate Abstraction** (``predicate_abstraction``)
4. **Model Counting-based Abstract Interpretation** (``mcai``)
5. **Congruence Closure-based Abstraction** (``congruence``)
6. **Range Set SAT-based Abstraction** (``rangeset_sat``)

Each approach offers different trade-offs between precision, scalability, and applicability
to different problem domains.


========================================
EFMC Symbolic Abstraction Prover
========================================

The EFMC symbolic abstraction prover uses symbolic abstraction as a verification
engine that computes a fixpoint in a chosen abstract domain.

At a high level it:

1. Starts with the initial condition as the current invariant
2. Computes the strongest consequence of the current invariant conjoined with
   the transition relation in the selected abstract domain
3. Joins that result with the previous invariant
4. Repeats until a fixpoint is reached
5. Checks that the resulting invariant implies the safety property

Supported EFMC domains include:

- ``interval`` for numeric and bit-vector interval abstraction
- ``bits`` for full bit-level tracking
- ``known_bits`` for lighter-weight bit-pattern reasoning

Command-line usage:

.. code-block:: bash

   efmc --engine symabs --symabs-domain interval --lang chc --file program.smt2
   efmc --engine symabs --symabs-domain known_bits --lang chc --file program.smt2

Programmatic usage:

.. code-block:: python

   from aria.efmc.engines.symabs import SymbolicAbstractionProver
   from aria.efmc.sts import TransitionSystem

   sts = TransitionSystem(...)
   prover = SymbolicAbstractionProver(sts)
   prover.domain = "interval"
   result = prover.solve()

   if result.is_safe:
       print(result.invariant)


===============
Core Concepts
===============

Abstract Domain
---------------

An abstract domain defines the lattice structure and operations for abstract elements:

.. code-block:: python

   class AbstractDomain:
       def join(self, elements):  # Least upper bound
       def meet(self, elements):  # Greatest lower bound
       def gamma_hat(self, alpha):  # Concretization function
       def model(self, formula):   # Find concrete model

The ``gamma_hat`` function translates an abstract element back to a concrete formula,
while the ``join`` operation computes the least upper bound in the abstract lattice.

Concretization and Abstraction
-------------------------------

- **Concretization** (γ̂): Maps abstract elements to concrete formulas
- **Abstraction**: Maps concrete formulas to abstract elements
- **Galois Connection**: The pair (α, γ̂) forms a Galois connection if α(γ̂(α̂)) = α̂

Precision Metrics
----------------

The quality of an abstraction is measured by:

- **Precision**: How closely the abstract formula approximates the concrete one
- **Scalability**: Computational cost of abstraction computation
- **Expressiveness**: Ability to capture important program properties


============================
Abstract Interpretation (ai_symabs)
============================

The ``ai_symabs`` submodule implements classic abstract interpretation domains based on
Thakur's PhD thesis. It provides a framework for computing symbolic abstractions using
various abstract domains.

Supported Abstract Domains
--------------------------

**Interval Domain**
~~~~~~~~~~~~~~~~~~~

The interval domain represents variable ranges using lower and upper bounds:

.. code-block:: python

   from aria.symabs.ai_symabs.domains.interval.domain import IntervalDomain

   # Create interval domain for variables ['x', 'y']
   domain = IntervalDomain(['x', 'y'])

   # Abstract state represents: x ∈ [0, 10], y ∈ [-5, 5]
   # This concretizes to: 0 ≤ x ∧ x ≤ 10 ∧ -5 ≤ y ∧ y ≤ 5

**Sign Domain**
~~~~~~~~~~~~~~~

The sign domain tracks the sign of variables (negative, zero, positive):

.. code-block:: python

   from aria.symabs.ai_symabs.domains.sign.domain import SignDomain

   # Sign domain for variable 'x'
   # Possible abstract values: -, 0, +, T (top)

**Reduced Product Domain**
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Combines multiple domains for better precision:

.. code-block:: python

   from aria.symabs.ai_symabs.domains.reduced_product.domain import ReducedProductDomain

   # Combine interval and sign domains
   combined = ReducedProductDomain([interval_domain, sign_domain])

Abstraction Algorithms
---------------------

**Bilateral Algorithm**
~~~~~~~~~~~~~~~~~~~~~~~

The bilateral algorithm efficiently computes abstractions by working with both the
positive and negative forms of the formula:

.. code-block:: python

   from aria.symabs.ai_symabs.domains.algorithms.bilateral import bilateral

   # Compute abstraction using bilateral algorithm
   abstract_state = bilateral(concrete_formula, domain)

**RSY Algorithm**
~~~~~~~~~~~~~~~~

Recursive abstraction refinement algorithm that iteratively improves precision:

.. code-block:: python

   from aria.symabs.ai_symabs.domains.algorithms.rsy import rsy

   # Compute abstraction with RSY algorithm
   abstract_state = rsy(concrete_formula, domain)


============================
OMT-based Symbolic Abstraction (omt_symabs)
============================

The ``omt_symabs`` submodule uses Optimization Modulo Theory (OMT) solvers to compute
optimal abstractions. This approach leverages modern OMT engines like Z3's optimizer
to find the best abstract representation.

Key Features
------------

- **Optimal Abstractions**: Computes the most precise abstraction within the domain
- **Multiple Domains**: Supports intervals, zones, and octagons
- **Theory Support**: Handles both linear arithmetic (LIA/LRA) and bit-vectors

Linear Arithmetic Abstraction
----------------------------

.. code-block:: python

   from aria.symabs.omt_symabs.lira_symbolic_abstraction import LIRASymbolicAbstraction

   # Create LIRA symbolic abstraction
   symabs = LIRASymbolicAbstraction()

   # Initialize from SMT formula
   symabs.init_from_formula(formula)

   # Compute interval abstraction
   interval_abs = symabs.compute_interval_abstraction()

   # Compute zone abstraction
   zone_abs = symabs.compute_zone_abstraction()

Bit-Vector Abstraction
---------------------

.. code-block:: python

   from aria.symabs.omt_symabs.bv_symbolic_abstraction import BVSymbolicAbstraction

   # Create bit-vector symbolic abstraction
   bv_symabs = BVSymbolicAbstraction()

   # Initialize with bit-vector formula
   bv_symabs.init_from_formula(bv_formula)

   # Compute abstraction
   abstract_state = bv_symabs.compute_abstraction()


===============================
Predicate Abstraction (predicate_abstraction)
===============================

Predicate abstraction computes the strongest Boolean combination of given predicates
that is entailed by the concrete formula. This approach is particularly effective for
software verification tasks.

Core Algorithm
--------------

The predicate abstraction algorithm works by:

1. Finding satisfying models of the concrete formula
2. Evaluating predicates in each model
3. Constructing the strongest Boolean formula consistent with the evaluations

.. code-block:: python

   from aria.symabs.predicate_abstraction.predicate_abstraction import compute_predicate_abstraction

   # Define predicates
   predicates = [x > 0, y > 0, x + y < 10]

   # Concrete formula: x > 5 ∧ y > 3 ∧ x + y < 8
   concrete_formula = z3.And(x > 5, y > 3, x + y < 8)

   # Compute predicate abstraction
   abstract_formula = compute_predicate_abstraction(concrete_formula, predicates)
   # Result: x > 0 ∧ y > 0 ∧ x + y < 10 (strongest consequence)


========================
Additional Components
========================

Model Counting-based Abstract Interpretation (mcai)
---------------------------------------------------

The ``mcai`` submodule combines model counting with abstract interpretation to provide
precision metrics and analysis for different abstract domains. This is particularly
useful for:

- Evaluating abstraction quality
- Comparing different abstract domains
- Bit-vector formula analysis

Congruence Closure-based Abstraction (congruence)
--------------------------------------------------

The ``congruence`` submodule implements congruence closure techniques for abstraction,
particularly useful for equality reasoning and uninterpreted functions.

Range Set SAT-based Abstraction (rangeset_sat)
-----------------------------------------------

The ``rangeset_sat`` submodule provides SAT-based abstraction techniques for range
and set operations over bit-vectors, offering efficient analysis for hardware verification.


===============
Usage Examples
===============

Basic Symbolic Abstraction
--------------------------

.. code-block:: python

   import z3
   from aria.symabs.ai_symabs.domains.interval.domain import IntervalDomain
   from aria.symabs.ai_symabs.domains.algorithms.bilateral import bilateral

   # Define concrete formula: 0 ≤ x ∧ x ≤ 10 ∧ x + y = 15
   x, y = z3.Ints('x y')
   formula = z3.And(x >= 0, x <= 10, x + y == 15)

   # Create interval domain
   domain = IntervalDomain(['x', 'y'])

   # Compute abstraction
   abstract_state = bilateral(formula, domain)

   # Get concrete representation
   concrete_repr = domain.gamma_hat(abstract_state)
   print(f"Abstract state concretizes to: {concrete_repr}")

OMT-based Abstraction
--------------------

.. code-block:: python

   from aria.symabs.omt_symabs.lira_symbolic_abstraction import LIRASymbolicAbstraction

   # Create symbolic abstraction engine
   symabs = LIRASymbolicAbstraction()

   # Load formula from file
   symabs.init_from_file('example.smt2')

   # Set OMT engine type
   symabs.set_omt_engine_type('z3opt')

   # Compute optimal interval abstraction
   interval_abs = symabs.compute_interval_abstraction()
   print(f"Interval abstraction: {interval_abs}")

Predicate Abstraction Example
----------------------------

.. code-block:: python

   import z3
   from aria.symabs.predicate_abstraction.predicate_abstraction import compute_predicate_abstraction

   # Define variables and predicates
   x, y = z3.Ints('x y')
   predicates = [x >= 0, y >= 0, x + y <= 10]

   # Concrete formula representing a program path
   concrete = z3.And(x >= 5, y >= 3, x + y <= 8)

   # Compute abstraction
   abstract = compute_predicate_abstraction(concrete, predicates)
   print(f"Predicate abstraction: {abstract}")


==============
Applications
==============

Symbolic abstraction techniques find applications in various domains:

Program Verification
--------------------

- **Software Model Checking**: Abstract program states to enable scalable verification
- **Shape Analysis**: Abstract heap structures and pointer relationships
- **Numerical Program Analysis**: Abstract numerical computations for overflow/underflow detection

Security Analysis
----------------

- **Information Flow Analysis**: Abstract security levels and taint propagation
- **Cryptographic Protocol Verification**: Abstract cryptographic operations
- **Buffer Overflow Detection**: Abstract memory operations and bounds checking

Hardware Verification
--------------------

- **Bit-Vector Abstraction**: Abstract bit-level operations for hardware verification
- **Range Analysis**: Abstract value ranges for timing and power analysis
- **Equivalence Checking**: Abstract circuit behaviors for equivalence verification

Optimization and Synthesis
--------------------------

- **Compiler Optimizations**: Abstract program semantics for optimization
- **Program Synthesis**: Abstract specifications for synthesis tasks
- **Test Generation**: Abstract program paths for test case generation


==============================
Performance Considerations
==============================

Choosing the Right Approach
---------------------------

**When to use ai_symabs:**

- Fast analysis with reasonable precision
- Classic abstract domains (intervals, signs, octagons)
- Large-scale program analysis
- Resource-constrained environments

**When to use omt_symabs:**

- Need for optimal precision
- Linear arithmetic formulas
- Small to medium-sized problems
- When exact bounds are critical

**When to use predicate_abstraction:**

- Software verification tasks
- User-defined predicates available
- Boolean program analysis
- Counterexample-guided abstraction refinement (CEGAR)

Scalability Trade-offs
---------------------

- **Precision vs. Scalability**: OMT provides highest precision but scales poorly
- **Domain Selection**: Interval domains offer good precision/scalability balance
- **Algorithm Choice**: Bilateral algorithm often provides best performance
- **Parallelization**: Many abstraction algorithms can be parallelized


=======
Future Directions
=======

Research Areas
--------------

1. **Theory Development**:
   - New abstract domains for emerging computing paradigms
   - Combination methods for heterogeneous domains
   - Completeness and decidability results for abstraction algorithms

2. **Algorithm Improvement**:
   - Machine learning-guided abstraction refinement
   - Parallel and distributed abstraction algorithms
   - Incremental abstraction techniques for dynamic analysis

3. **Integration with Modern Verification**:
   - Deep learning integration for abstraction learning
   - Quantum computing abstraction techniques
   - Blockchain and smart contract verification

4. **Industrial Applications**:
   - Large-scale software verification
   - Autonomous systems analysis
   - Cyber-physical systems verification


===========
References
===========

Core Theory
-----------

- **Thakur, A. V.** (2014). *Symbolic Abstraction: Algorithms and Applications*. Ph.D. dissertation, University of Wisconsin-Madison. https://www.cs.wisc.edu/~aws/pubs/thesis.pdf

- **Cousot, P., & Cousot, R.** (1977). *Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints*. POPL'77.

- **Graf, S., & Saïdi, H.** (1997). *Construction of abstract state graphs with PVS*. CAV'97.

Abstract Interpretation
----------------------

- **Cousot, P., Cousot, R., & Logozzo, F.** (2013). *A parametric segmentation functor for fully automatic and scalable array content analysis*. POPL'13.

- **Miné, A.** (2006). *The octagon abstract domain*. Higher-Order and Symbolic Computation.

- **Singh, G., Püschel, M., & Vechev, M.** (2015). *Fast numerical program analysis with reinforcement learning*. CAV'15.

OMT-based Abstraction
--------------------

- **Sebastiani, R., & Trentin, P.** (2015). *OPTIMathSAT: a tool for optimization modulo theories*. CAV'15.

- **Borralleras, C., Larraz, D., Oliveras, A., Rodríguez-Carbonell, E., & Rubio, A.** (2019). *The iSAT solver*. TACAS'19.

Predicate Abstraction
---------------------

- **Graf, S., & Saïdi, H.** (1997). *Construction of abstract state graphs with PVS*. CAV'97.

- **Ball, T., Majumdar, R., Millstein, T., & Rajamani, S. K.** (2001). *Automatic predicate abstraction of C programs*. PLDI'01.

- **Henzinger, T. A., Jhala, R., Majumdar, R., & McMillan, K. L.** (2004). *Abstractions from proofs*. POPL'04.

Applications and Case Studies
----------------------------

- **Chen, Y., Wang, Y., Zhu, Z., Song, L., & Zhang, L.** (2021). *Program Analysis via Efficient Symbolic Abstraction*. OOPSLA'21.

- **Albarghouthi, A., Li, Y., & Zhang, L.** (2019). *Automating Abstract Interpretation*. VMCAI'16.

- **Gulwani, S., & Tiwari, A.** (2007). *Combining abstract interpretation with model checking*. ESEC/FSE'07.