Symbolic Finite Automata
The aria.automata.symautomata module provides a comprehensive implementation of symbolic finite automata and related computational models. This module supports various types of automata including Deterministic Finite Automata (DFA), Symbolic Finite Automata (SFA), and Pushdown Automata (PDA).
Overview
Symbolic finite automata extend traditional finite automata by allowing transitions to be labeled with predicates over potentially infinite alphabets, rather than just individual symbols. This makes them particularly useful for string analysis, regular expression processing, and formal verification tasks.
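To make the contrast concrete, here is a standalone sketch (plain Python, independent of this module; all names are illustrative) of a predicate-guarded transition replacing a per-symbol transition table:

```python
# One guard such as str.isdigit can stand in for many per-symbol arcs.

# Explicit DFA arcs: one entry per symbol.
explicit_arcs = {(0, str(d)): 1 for d in range(10)}

# Symbolic arc: a single predicate covers the same set of symbols.
symbolic_arcs = [(0, 1, str.isdigit)]

def step_symbolic(state, symbol, arcs):
    """Follow the first arc from `state` whose guard accepts `symbol`."""
    for src, dst, guard in arcs:
        if src == state and guard(symbol):
            return dst
    return None

# Both representations agree on the digit transitions.
assert explicit_arcs[(0, "7")] == step_symbolic(0, "7", symbolic_arcs)
```

The predicate version stays the same size no matter how large the alphabet grows, which is the core advantage of the symbolic representation.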
Core Components
DFA (Deterministic Finite Automata)
The DFA implementation provides multiple backends for maximum compatibility:
This module contains the DFA implementation
- aria.automata.symautomata.dfa.bfs(graph, start) str | None[source]
Finds the shortest string using BFS :param graph: The DFA states :type graph: DFA :param start: The DFA initial state :type start: DFA state
- Returns:
The shortest string
- Return type:
str
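The shortest-string search is a plain breadth-first search over DFA states; a self-contained sketch (hypothetical names, not the module's internals):

```python
from collections import deque

def shortest_accepted_string(transitions, start, finals):
    """BFS from the initial state; the first final state reached yields
    a shortest accepted string. transitions: (state, symbol) -> state."""
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        state, prefix = queue.popleft()
        if state in finals:
            return prefix
        for (src, sym), dst in transitions.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, prefix + sym))
    return None

# DFA fragment: 0 --a--> 1 --b--> 2 (final), plus some back-edges.
delta = {(0, "a"): 1, (1, "b"): 2, (0, "b"): 0, (1, "a"): 0}
```

Because BFS explores states in order of distance from the start, the first final state popped from the queue carries a minimum-length witness string.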
Key Features:
Multiple Backend Support: Automatically selects the best available backend:
pywrapfst: OpenFST Python bindings (preferred)
fst: Alternative FST library
Pure Python implementation (fallback)
Core Operations:
Minimization and determinization
Boolean operations (union, intersection, complement, difference)
Regular expression conversion
String acceptance testing
Shortest string generation
Example Usage:
from aria.automata.symautomata.dfa import DFA
from aria.automata.symautomata.alphabet import createalphabet
# Create a DFA with default alphabet
dfa = DFA(createalphabet())
# Add states and transitions
dfa.add_arc(0, 1, 'a')
dfa.add_arc(1, 2, 'b')
dfa[2].final = True
# Test string acceptance
result = dfa.consume_input("ab") # Returns True
# Convert to regular expression
regex = dfa.to_regex()
# Find shortest accepted string
shortest = dfa.shortest_string()
SFA (Symbolic Finite Automata)
- class aria.automata.symautomata.sfa.SFA(alphabet: Sequence[object] | None = None, predicate_factory: Callable[[], Predicate] | None = None)[source]
Bases:
object
- class aria.automata.symautomata.sfa.SFAArc(src_state_id: int, dst_state_id: int, guard_p: Predicate, term: object | None = None)[source]
Bases:
object
- class aria.automata.symautomata.sfa.SetPredicate(initializer: Iterable[object], universe: Iterable[object] | None = None)[source]
Bases:
Predicate
- class aria.automata.symautomata.sfa.Z3Predicate(symbol: Any, formula: Any, universe: Any | None = None)[source]
Bases:
Predicate
- classmethod bitvec(width: int, formula: Any | None = None, name: str = 'sym', universe: Any | None = None) Z3Predicate[source]
Symbolic Finite Automata extend traditional DFAs by using predicates on transitions instead of individual symbols. This allows for more compact representations when dealing with large or infinite alphabets.
Key Components:
Predicates: Abstract conditions that determine transition validity
SetPredicate: Concrete implementation using character sets
Z3Predicate: Solver-backed implementation for symbolic domains
SFA States and Arcs: Extended state and transition structures
Operational Semantics:
SFA.accepts(...) tracks a set of current states, so acceptance behaves like an NFA with guarded transitions.
SFA.determinize() partitions satisfiable guard regions and is used internally by complete(), complement(), and concretize().
SetPredicate.negate() requires a finite universe.
SFA.complete() and SFA.complement() require either a finite alphabet or a predicate_factory that can create a predicate over the intended symbolic universe.
SFA.save() and SFA.load() are finite-alphabet workflows; they do not serialize symbolic formulas.
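As a standalone illustration of that NFA-like semantics (assumed names, not the library's API), acceptance can be sketched as tracking a frontier of states under the guards:

```python
def accepts(arcs, finals, word, start=0):
    """NFA-style run: keep every state reachable under the guards so far."""
    current = {start}
    for symbol in word:
        current = {dst for src, dst, guard in arcs
                   if src in current and guard(symbol)}
        if not current:
            return False
    return bool(current & finals)

# Two guards out of state 0: vowels loop, 'x' moves to the final state.
arcs = [
    (0, 0, lambda c: c in "aeiou"),
    (0, 1, lambda c: c == "x"),
]
assert accepts(arcs, {1}, "aex")
assert not accepts(arcs, {1}, "axa")
```

Guards from the same state may overlap, which is why a set of current states (rather than a single state) is the natural run semantics before determinization.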
Concrete Workflow Bridges:
SFA.from_acceptor(dfa) lifts a concrete DFA-style acceptor into an SFA with singleton SetPredicate guards.
SFA.to_regex() reuses the existing DFA-to-regex workflow by concretizing first and then delegating to DFA.to_regex().
Example Usage:
from aria.automata.symautomata import SFA, SetPredicate, Z3Predicate
from aria.automata.symautomata.dfa import DFA
import string
import z3
# Create SFA for pattern matching
sfa = SFA(list(string.ascii_lowercase))
# Add transitions with predicates
# Accept any lowercase letter except 'x'
not_x = SetPredicate([c for c in string.ascii_lowercase if c != 'x'])
sfa.add_arc(0, 0, not_x)
# Accept 'x' to go to final state
x_only = SetPredicate(['x'])
sfa.add_arc(0, 1, x_only)
sfa.states[1].final = True
# Convert to concrete DFA
dfa = sfa.concretize()
# Lift an existing DFA into SFA and reuse regex conversion
concrete = DFA(["a", "b"])
concrete.add_arc(0, 1, "a")
concrete[1].final = True
lifted = SFA.from_acceptor(concrete)
regex = lifted.to_regex()
# Use solver-backed predicates
sym = z3.BitVec("sym", 8)
symbolic = SFA(
    [0, 1, 2, 3],
    predicate_factory=lambda: Z3Predicate(
        sym, z3.BoolVal(False), z3.ULT(sym, z3.BitVecVal(4, 8))
    ),
)
symbolic.add_arc(
    0, 1,
    Z3Predicate(sym, z3.ULT(sym, z3.BitVecVal(2, 8)), z3.ULT(sym, z3.BitVecVal(4, 8))),
)
symbolic[1].final = True
PDA (Pushdown Automata)
This module contains the PDA implementation
- class aria.automata.symautomata.pda.PDA(alphabet)[source]
Bases:
PythonPDA
This is the structure for a PDA
Pushdown Automata extend finite automata with a stack, enabling recognition of context-free languages.
Features:
Context-free language recognition
Stack-based computation model
Integration with CFG (Context-Free Grammar) processing
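As a minimal standalone illustration of the stack discipline (not this module's API), here is a two-phase recognizer for the canonical context-free language aⁿbⁿ, which no finite automaton can recognize:

```python
def accepts_anbn(word):
    """Tiny PDA sketch for a^n b^n (n >= 1): push a marker for each 'a',
    pop one marker for each 'b', accept on empty stack after popping."""
    stack = []
    phase = "push"
    for ch in word:
        if ch == "a" and phase == "push":
            stack.append("A")        # push phase: record each 'a'
        elif ch == "b" and stack:
            phase = "pop"            # switch phases on the first 'b'
            stack.pop()              # match one 'a' per 'b'
        else:
            return False             # out-of-order symbol or empty stack
    return phase == "pop" and not stack

assert accepts_anbn("aaabbb")
assert not accepts_anbn("aabbb")
```

The stack is exactly what lets the machine compare two unbounded counts, which is the extra power over plain finite automata.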
Utility Modules
Alphabet Management
This module configures the alphabet.
- aria.automata.symautomata.alphabet.createalphabet(alphabetinput: str | None = None) List[str][source]
Creates a sample alphabet containing printable ASCII characters
The alphabet module provides flexible alphabet creation and management:
from aria.automata.symautomata.alphabet import createalphabet
# Default printable ASCII alphabet
alpha1 = createalphabet()
# Custom range (Unicode code points)
alpha2 = createalphabet("65-91,97-123") # A-Z, a-z
# From file
alpha3 = createalphabet("my_alphabet.txt")
Regular Expression Processing
This module transforms a pyfst DFA to regular expressions using the Brzozowski Algebraic Method
- class aria.automata.symautomata.regex.Regex(input_fst_a, alphabet=None)[source]
Bases:
object
This class transforms a pyfst DFA to regular expressions using the Brzozowski Algebraic Method
- aria.automata.symautomata.regex.main()[source]
Testing function for the DFA Brzozowski algebraic method operation
The regex module implements the Brzozowski algebraic method for converting DFAs to regular expressions:
from aria.automata.symautomata.regex import Regex
# Convert DFA to regex
converter = Regex(my_dfa)
regex_string = converter.get_regex()
Brzozowski Algorithm
Implements Brzozowski’s algebraic method for regular expression derivation:
from aria.automata.symautomata.brzozowski import Brzozowski
# Apply Brzozowski algorithm
brz = Brzozowski(input_dfa)
regex_map = brz.get_regex()
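For intuition, Brzozowski's algebraic method writes one language equation per state and solves them with Arden's lemma (X = A·X + B has least solution X = A*B). A toy sketch on string-valued regexes (illustrative only, not the module's implementation):

```python
def arden(loop, rest):
    """Arden's lemma: X = loop.X + rest has least solution X = loop* rest."""
    if not loop:
        return rest
    return f"({loop})*{rest}" if rest else f"({loop})*"

# DFA over {a, b}: state 0 loops on 'a' and moves to final state 1 on 'b';
# state 1 loops on 'b'. Language equations, solved back to front:
#   X1 = b.X1 + eps   =>  X1 = b*
#   X0 = a.X0 + b.X1  =>  X0 = a* b b*
x1 = arden("b", "")
x0 = arden("a", "b" + x1)
```

Eliminating states one at a time this way yields a regular expression for the language of the initial state.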
Advanced Features
Context-Free Grammar Support
This module generates a string from a CFG
- class aria.automata.symautomata.cfggenerator.CFGGenerator(cfgr=None, optimized=1, splitstring=0, maxstate=0)[source]
Bases:
object
This class generates a string from a CFG
- bfs_queue = None
- generate()[source]
Generates a new random string from the start symbol :param None:
- Returns:
The generated string
- Return type:
str
- grammar = None
- maxstate = 0
- resolved = None
- class aria.automata.symautomata.cfggenerator.CNFGenerator(grammar_rules)[source]
Bases:
object
Uses the Chomsky transformation to convert a grammar to CNF
This module generates a PDA using a CFG as input
- class aria.automata.symautomata.cfgpda.CfgPDA(alphabet: List[str] | None = None)[source]
Bases:
object
Create a PDA from a CFG
The module includes comprehensive support for context-free grammars:
CFG to CNF conversion: Transform context-free grammars to Chomsky Normal Form
CFG to PDA conversion: Convert grammars to equivalent pushdown automata
String generation: Generate strings from CFG rules
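String generation from a CFG amounts to repeatedly expanding nonterminals by randomly chosen productions. A self-contained sketch (names hypothetical; depth-bounded so expansion always terminates):

```python
import random

def generate(grammar, symbol, rng, depth=0):
    """Expand `symbol` by picking a random production. Nonterminals are
    keys of `grammar`; anything else is a terminal."""
    if symbol not in grammar:
        return symbol
    # Beyond a depth bound, force the shortest production to terminate.
    options = grammar[symbol] if depth < 10 else [min(grammar[symbol], key=len)]
    production = rng.choice(options)
    return "".join(generate(grammar, part, rng, depth + 1) for part in production)

# Balanced parentheses: S -> ( S ) S | epsilon
grammar = {"S": [["(", "S", ")", "S"], []]}
rng = random.Random(0)
s = generate(grammar, "S", rng)
```

Every string produced this way is balanced, since each expansion introduces parentheses only in matched pairs.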
Flex Integration
Integration with Flex (Fast Lexical Analyzer) for converting lexical specifications to finite automata:
from aria.automata.symautomata.flex2fst import Flexparser
# Parse Flex file to DFA
parser = Flexparser(["a", "b", "c"])
dfa = parser.yyparse("lexer.l")
Implementation Backends
The module provides multiple implementation backends for different use cases:
Pure Python Backend
This module performs all basic DFA operations, without pyfst
- class aria.automata.symautomata.pythondfa.DFAArc(srcstate_id, nextstate_id, ilabel=None)[source]
Bases:
object
The DFA transition
- class aria.automata.symautomata.pythondfa.PythonDFA(alphabet=None)[source]
Bases:
object
A DFA implementation that uses the same interface as python symautomata
- add_arc(src, dst, char)[source]
Adds a new Arc :param src: The source state identifier :type src: int :param dst: The destination state identifier :type dst: int :param char: The character for the transition :type char: str
- Returns:
None
- complement(alphabet)[source]
Returns the complement of DFA :param alphabet: The input alphabet :type alphabet: list
- Returns:
None
- consume_input(inp)[source]
Return True/False if the machine accepts/rejects the input. :param inp: input string to be consumed :type inp: str
- Returns:
- A true or false value depending on if the DFA
accepts the provided input
- Return type:
bool
- cross_product(dfa_2, accept_method)[source]
A generalized cross-product constructor over two DFAs. The third argument is a binary boolean function f; a state (q1, q2) in the final DFA accepts if f(A[q1],A[q2]), where A indicates the acceptance-value of the state. :param dfa_2: The second dfa :param accept_method: The boolean action
- Returns:
None
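The construction can be sketched standalone (dictionary-based total transition functions; all names here are illustrative, not the class's internals):

```python
from itertools import product

def cross(delta1, finals1, delta2, finals2, alphabet, accept):
    """Run both DFAs in lockstep over pair states; `accept` combines the
    two component acceptance values into the pair state's finality."""
    states1 = {s for s, _ in delta1} | set(delta1.values())
    states2 = {s for s, _ in delta2} | set(delta2.values())
    delta = {((q1, q2), a): (delta1[(q1, a)], delta2[(q2, a)])
             for q1, q2 in product(states1, states2) for a in alphabet}
    finals = {(q1, q2) for q1, q2 in product(states1, states2)
              if accept(q1 in finals1, q2 in finals2)}
    return delta, finals

# DFA 1: even number of 'a's (final state 0). DFA 2: ends in 'b' (final 1).
d1 = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
d2 = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}
delta, finals = cross(d1, {0}, d2, {1}, "ab", lambda x, y: x and y)

def run(delta, state, word):
    for c in word:
        state = delta[(state, c)]
    return state
```

Choosing `and`, `or`, or `x and not y` as the combinator yields intersection, union, or difference respectively, which is why one generalized constructor covers all three Boolean operations.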
- determinize()[source]
Transforms a nondeterministic automaton into a deterministic one :param None:
- Returns:
The resulting DFA
- Return type:
DFA
Creating an equivalent DFA is done using the standard algorithm. A nice description can be found in the book: Harry R. Lewis and Christos H. Papadimitriou. 1998. Elements of the Theory of Computation.
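That standard algorithm is the subset construction; a self-contained sketch over dictionary transitions (hypothetical names, not this method's code):

```python
from collections import deque

def determinize(nfa_delta, start, finals, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states.
    nfa_delta maps (state, symbol) -> set of successor states."""
    start_set = frozenset([start])
    dfa_delta, dfa_finals = {}, set()
    queue, seen = deque([start_set]), {start_set}
    while queue:
        subset = queue.popleft()
        if subset & finals:
            dfa_finals.add(subset)       # final if any member is final
        for a in alphabet:
            target = frozenset(q for s in subset
                               for q in nfa_delta.get((s, a), ()))
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_delta, dfa_finals
```

A usage example: the NFA `{(0, 'a'): {0, 1}, (0, 'b'): {0}}` with final state 1 accepts strings ending in 'a'; determinizing it produces DFA states `{0}` and `{0, 1}`.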
- difference(other)[source]
Performs the difference operation between two automata :param other: The other DFA that will be used
for the difference operation
- Returns:
The resulting DFA
- Return type:
DFA
- fixminimized(alphabet)[source]
After pyfst minimization, all unused arcs and all sink states are removed. However, this may break compatibility. :param alphabet: The input alphabet :type alphabet: list
- Returns:
None
- hopcroft()[source]
Performs the Hopcroft minimization algorithm :param None:
- Returns:
The minimized input DFA
- Return type:
DFA
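For intuition, minimization by partition refinement can be sketched in a few lines. This is the simpler Moore-style refinement; Hopcroft's algorithm computes the same partition with a smarter worklist (assumed names, total transition function):

```python
def minimize(delta, states, finals, alphabet):
    """Partition refinement: repeatedly split blocks whose members
    disagree on which block each alphabet symbol leads to."""
    partition = [b for b in (set(finals), set(states) - set(finals)) if b]
    changed = True
    while changed:
        changed = False
        index = {q: i for i, block in enumerate(partition) for q in block}
        new_partition = []
        for block in partition:
            groups = {}
            for q in block:
                # Signature: the block index reached on each symbol.
                sig = tuple(index[delta[(q, a)]] for a in alphabet)
                groups.setdefault(sig, set()).add(q)
            new_partition.extend(groups.values())
            if len(groups) > 1:
                changed = True
        partition = new_partition
    return partition

# Two disjoint 2-cycles with matching finality: 0/2 and 1/3 merge.
delta = {(0, "a"): 1, (1, "a"): 0, (2, "a"): 3, (3, "a"): 2}
blocks = minimize(delta, {0, 1, 2, 3}, {1, 3}, "a")
```

The final blocks are the equivalence classes of states; each block becomes one state of the minimized DFA.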
- init_from_acceptor(acceptor)[source]
Initializes the DFA from an acceptor :param acceptor: The input acceptor
- Returns:
None
- intersect(other)[source]
Constructs an unminimized DFA recognizing the intersection of the languages of two given DFAs. :param other: The other DFA that will be used
for the intersect operation
- Returns:
The resulting DFA
- Return type:
DFA
- load(txt_fst_file_name)[source]
Load the transducer from the text file format of OpenFST. The format is specified as follows:
arc format: src dest ilabel olabel [weight] final state format: state [weight]
lines may occur in any order except initial state must be first line :param txt_fst_file_name: The input file :type txt_fst_file_name: str
- Returns:
None
- save(txt_fst_file_name)[source]
Save the machine in the openFST format in the file denoted by txt_fst_file_name. :param txt_fst_file_name: The output file :type txt_fst_file_name: str
- Returns:
None
- class aria.automata.symautomata.pythondfa.Syms[source]
Bases:
object
The DFA accepted symbols
- aria.automata.symautomata.pythondfa.TropicalWeight(param)
Returns the emulated fst TropicalWeight :param param: The input :type param: str
- Returns:
The arc weight
- Return type:
bool
- aria.automata.symautomata.pythondfa.tropical_weight(param)[source]
Returns the emulated fst TropicalWeight :param param: The input :type param: str
- Returns:
The arc weight
- Return type:
bool
Advantages: No external dependencies, full Python compatibility
Use cases: Development, testing, educational purposes
Features: Complete DFA operations, Hopcroft minimization
OpenFST Backend
Advantages: High performance, industry-standard library
Requirements: OpenFST library with Python bindings
Features: Optimized operations, large-scale automata support
String Analysis
This module retrieves a simple string from a PDA using the state removal method
- class aria.automata.symautomata.pdastring.PdaString[source]
Bases:
object
Retrieves a string from a PDA
Advanced string analysis capabilities using state removal algorithms:
State removal: Systematic elimination of states to derive regular expressions
Path analysis: Trace execution paths through automata
String generation: Generate strings accepted by automata
Performance Considerations
Backend Selection
The module automatically selects the best available backend:
pywrapfst: Preferred for performance-critical applications
fst: Alternative FST implementation
pythondfa: Fallback pure Python implementation
For optimal performance:
Install OpenFST with Python bindings for large automata
Use the pure Python backend for small automata or development
Consider alphabet size when choosing predicates vs. explicit transitions
Memory Usage
Symbolic representations: Use SFA for large alphabets
Minimization: Apply minimization to reduce state count
Alphabet optimization: Use custom alphabets to reduce memory footprint
Common Patterns
Pattern Matching
# Create DFA for email validation pattern
email_dfa = DFA()
# ... build email validation automaton
# Test email addresses
valid = email_dfa.consume_input("user@domain.com")
Language Operations
# Language intersection
result_dfa = dfa1.intersect(dfa2)
# Language union
union_dfa = dfa1.union(dfa2)
# Language complement
complement_dfa = dfa1.complement(alphabet)
# Language difference
diff_dfa = dfa1.difference(dfa2)
Regular Expression Conversion
# Convert automaton to regex
from aria.automata.symautomata.regex import Regex
converter = Regex(my_dfa)
regex_pattern = converter.get_regex()
Error Handling
The module includes robust error handling:
Import errors: Graceful fallback between backends
Invalid operations: Clear error messages for malformed automata
Resource limits: Appropriate handling of large automata
Troubleshooting
Common Issues
Backend not available: Install required dependencies (OpenFST, pywrapfst)
Memory issues: Use minimization and appropriate alphabet sizes
Performance problems: Consider backend selection and automaton size
Dependencies
Required: Python 3.6+
Optional: OpenFST library, pywrapfst, networkx (for visualization)
Development: pytest, sphinx (for documentation)
API Reference
For detailed API documentation, see the individual module documentation above. The main entry points are:
DFA: Main deterministic finite automaton class
SFA: Symbolic finite automaton class
PDA: Pushdown automaton class
createalphabet(): Alphabet creation utility
Regex: Regular expression conversion utility