SAT Instance Feature Extraction

aria.bool.features implements SATzilla-style feature extraction for CNF formulas. It computes 40+ structural, statistical, and probing-based features that characterise SAT instances, useful for algorithm selection and machine-learning applications.

Directory structure

aria/bool/features/
├── sat_instance.py      # High-level API: SATInstance class
├── base_features.py     # Core feature extraction (size, graph, balance)
├── active_features.py   # Active feature computation
├── balance_features.py  # Positive/negative literal balance statistics
├── graph_features.py    # Variable-clause and variable-variable graph features
├── dpll.py              # DPLL probing (unit propagation at various depths)
├── parse_cnf.py         # CNF file parser
├── enums.py             # Feature name enumerations
├── array_stats.py       # Statistical helpers (mean, std, entropy, …)
└── stopwatch.py         # Timing utility

Key class

  • SATInstance – main entry point; loads a CNF file and exposes feature extraction via its attributes and helper methods.

Feature categories

  • Size features – number of clauses, variables, clause-to-variable ratio.

  • Variable-clause graph – degree statistics (mean, std, coeff. of variation, min, max), entropy.

  • Variable graph – co-occurrence graph statistics.

  • Balance features – positive/negative literal ratios per variable and clause.

  • Horn clause proximity – fraction of Horn and anti-Horn clauses.

  • DPLL probing – unit propagation counts at decision depths 1–10 (dpll.py).

  • Structural features – power-law exponent, modularity, fractal dimension.

  • Graph features – statistics over 8 weighted bipartite and primal graphs.

Programmatic usage

from aria.bool.features.sat_instance import SATInstance

instance = SATInstance("benchmark.cnf")
# Access extracted features as a dictionary
features = instance.features_dict
print(features["nVars"], features["nClauses"])