Symbolic Finite Automata

The aria.automata.symautomata module provides a comprehensive implementation of symbolic finite automata and related computational models. This module supports various types of automata including Deterministic Finite Automata (DFA), Symbolic Finite Automata (SFA), and Pushdown Automata (PDA).

Overview

Symbolic finite automata extend traditional finite automata by allowing transitions to be labeled with predicates over potentially infinite alphabets, rather than just individual symbols. This makes them particularly useful for string analysis, regular expression processing, and formal verification tasks.

Core Components

DFA (Deterministic Finite Automata)

The DFA implementation provides multiple backends for maximum compatibility:

This module contains the DFA implementation

aria.automata.symautomata.dfa.bfs(graph, start) str | None[source]

Finds the shortest accepted string using BFS.

Parameters:
  • graph (DFA) – The DFA states

  • start (DFA state) – The DFA initial state

Returns:

The shortest string

Return type:

str
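The search itself is simple enough to illustrate independently of the library. The sketch below uses a plain dict of transitions rather than the module's own DFA type (all names here are made up for the illustration):

```python
from collections import deque

def bfs_shortest_string(transitions, start, finals):
    """Breadth-first search for a shortest accepted string.

    `transitions` maps (state, symbol) -> state, `start` is the initial
    state, and `finals` is the set of accepting states.
    """
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        state, prefix = queue.popleft()
        if state in finals:
            return prefix              # first hit is a shortest string
        for (src, sym), dst in transitions.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, prefix + sym))
    return None                        # the automaton accepts no string

# DFA accepting every string that contains "ab"
trans = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 2,
         (2, "a"): 2, (2, "b"): 2}
print(bfs_shortest_string(trans, 0, {2}))  # "ab"
```

Because BFS explores states in order of distance from the start, the first accepting state reached yields a string of minimal length.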

Key Features:

  • Multiple Backend Support: Automatically selects the best available backend:

    • pywrapfst: OpenFST Python bindings (preferred)

    • fst: Alternative FST library

    • Pure Python implementation (fallback)

  • Core Operations:

    • Minimization and determinization

    • Boolean operations (union, intersection, complement, difference)

    • Regular expression conversion

    • String acceptance testing

    • Shortest string generation

Example Usage:

from aria.automata.symautomata.dfa import DFA
from aria.automata.symautomata.alphabet import createalphabet

# Create a DFA with default alphabet
dfa = DFA(createalphabet())

# Add states and transitions
dfa.add_arc(0, 1, 'a')
dfa.add_arc(1, 2, 'b')
dfa[2].final = True

# Test string acceptance
result = dfa.consume_input("ab")  # Returns True

# Convert to regular expression
regex = dfa.to_regex()

# Find shortest accepted string
shortest = dfa.shortest_string()

SFA (Symbolic Finite Automata)

class aria.automata.symautomata.sfa.Predicate[source]

Bases: object

conjunction(other: Predicate) Predicate[source]
disjunction(other: Predicate) Predicate[source]
get_witness() object[source]
is_equivalent(other: Predicate) bool[source]
is_sat(symbol: object | None = None) bool[source]
negate() Predicate[source]
set_universe(universe: Iterable[object]) None[source]
top() Predicate[source]
class aria.automata.symautomata.sfa.SFA(alphabet: Sequence[object] | None = None, predicate_factory: Callable[[], Predicate] | None = None)[source]

Bases: object

accepts(inp: Sequence[object]) bool[source]
add_arc(src: int, dst: int, guard: object, term: object | None = None) None[source]
add_state(initial: bool = False, final: bool = False) int[source]
complement() SFA[source]
complete() SFA[source]
concretize()[source]
consume_input(inp: Sequence[object]) bool[source]
copy() SFA[source]
determinize() SFA[source]
difference(other: SFA) SFA[source]
classmethod from_acceptor(acceptor: object, alphabet: Sequence[object] | None = None) SFA[source]
get_witness() List[object] | None[source]
init_from_acceptor(acceptor: object) None[source]
initial_states() List[int][source]
intersection(other: SFA) SFA[source]
is_empty() bool[source]
is_equivalent(other: SFA) bool[source]
load(txt_fst_filename: str) None[source]
save(txt_fst_filename: str) None[source]
set_final(state_id: int, final: bool = True) None[source]
symmetric_difference(other: SFA) SFA[source]
to_regex() str | None[source]
union(other: SFA) SFA[source]
class aria.automata.symautomata.sfa.SFAArc(src_state_id: int, dst_state_id: int, guard_p: Predicate, term: object | None = None)[source]

Bases: object

class aria.automata.symautomata.sfa.SFAState(sid: int | None = None)[source]

Bases: object

class aria.automata.symautomata.sfa.SetPredicate(initializer: Iterable[object], universe: Iterable[object] | None = None)[source]

Bases: Predicate

conjunction(other: Predicate) Predicate[source]
disjunction(other: Predicate) Predicate[source]
get_witness() object[source]
is_equivalent(other: Predicate) bool[source]
is_sat(symbol: object | None = None) bool[source]
negate() Predicate[source]
set_universe(universe: Iterable[object]) None[source]
class aria.automata.symautomata.sfa.Z3Predicate(symbol: Any, formula: Any, universe: Any | None = None)[source]

Bases: Predicate

classmethod bitvec(width: int, formula: Any | None = None, name: str = 'sym', universe: Any | None = None) Z3Predicate[source]
conjunction(other: Predicate) Predicate[source]
disjunction(other: Predicate) Predicate[source]
get_witness() object[source]
is_equivalent(other: Predicate) bool[source]
is_sat(symbol: object | None = None) bool[source]
negate() Predicate[source]
top() Predicate[source]
aria.automata.symautomata.sfa.main() None[source]

Symbolic Finite Automata extend traditional DFAs by using predicates on transitions instead of individual symbols. This allows for more compact representations when dealing with large or infinite alphabets.

Key Components:

  • Predicates: Abstract conditions that determine transition validity

  • SetPredicate: Concrete implementation using character sets

  • Z3Predicate: Solver-backed implementation for symbolic domains

  • SFA States and Arcs: Extended state and transition structures

Operational Semantics:

  • SFA.accepts(...) tracks a set of current states, so acceptance behaves like an NFA with guarded transitions.

  • SFA.determinize() partitions satisfiable guard regions and is used internally by complete(), complement(), and concretize().

  • SetPredicate.negate() requires a finite universe.

  • SFA.complete() and SFA.complement() require either a finite alphabet or a predicate_factory that can create a predicate over the intended symbolic universe.

  • SFA.save() and SFA.load() are finite-alphabet workflows; they do not serialize symbolic formulas.
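The finite-universe requirement on SetPredicate.negate() is easiest to see in a stripped-down sketch: complementing a symbol set is only well defined relative to a known universe. The class below is illustrative, not the library's implementation:

```python
import string

class TinySetPredicate:
    """Illustrative set-based guard, not the library's SetPredicate."""
    def __init__(self, symbols, universe=None):
        self.symbols = frozenset(symbols)
        self.universe = frozenset(universe) if universe is not None else None

    def is_sat(self, symbol=None):
        # with no symbol, ask whether *any* symbol satisfies the guard
        return bool(self.symbols) if symbol is None else symbol in self.symbols

    def conjunction(self, other):
        return TinySetPredicate(self.symbols & other.symbols, self.universe)

    def negate(self):
        if self.universe is None:
            raise ValueError("negation needs a finite universe")
        return TinySetPredicate(self.universe - self.symbols, self.universe)

universe = set(string.ascii_lowercase)
not_x = TinySetPredicate(universe - {"x"}, universe)
assert not_x.is_sat("a") and not not_x.is_sat("x")
assert not_x.negate().symbols == frozenset({"x"})
```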

Concrete Workflow Bridges:

  • SFA.from_acceptor(dfa) lifts a concrete DFA-style acceptor into an SFA with singleton SetPredicate guards.

  • SFA.to_regex() reuses the existing DFA-to-regex workflow by concretizing first and then delegating to DFA.to_regex().

Example Usage:

from aria.automata.symautomata import SFA, SetPredicate, Z3Predicate
from aria.automata.symautomata.dfa import DFA
import string
import z3

# Create SFA for pattern matching
sfa = SFA(list(string.ascii_lowercase))

# Add transitions with predicates
# Accept any lowercase letter except 'x'
not_x = SetPredicate([c for c in string.ascii_lowercase if c != 'x'])
sfa.add_arc(0, 0, not_x)

# Accept 'x' to go to final state
x_only = SetPredicate(['x'])
sfa.add_arc(0, 1, x_only)
sfa.states[1].final = True

# Convert to concrete DFA
dfa = sfa.concretize()

# Lift an existing DFA into SFA and reuse regex conversion
concrete = DFA(["a", "b"])
concrete.add_arc(0, 1, "a")
concrete[1].final = True
lifted = SFA.from_acceptor(concrete)
regex = lifted.to_regex()

# Use solver-backed predicates
sym = z3.BitVec("sym", 8)
symbolic = SFA([0, 1, 2, 3], predicate_factory=lambda: Z3Predicate(sym, z3.BoolVal(False), z3.ULT(sym, z3.BitVecVal(4, 8))))
symbolic.add_arc(0, 1, Z3Predicate(sym, z3.ULT(sym, z3.BitVecVal(2, 8)), z3.ULT(sym, z3.BitVecVal(4, 8))))
symbolic[1].final = True

PDA (Pushdown Automata)

This module contains the PDA implementation

class aria.automata.symautomata.pda.PDA(alphabet)[source]

Bases: PythonPDA

This is the structure for a PDA

diff(mmb)[source]

Automata Diff operation

shortest_string() str | None[source]

Uses BFS to find the shortest accepted string.

Returns:

The shortest string

Return type:

str

Pushdown Automata extend finite automata with a stack, enabling recognition of context-free languages.

Features:

  • Context-free language recognition

  • Stack-based computation model

  • Integration with CFG (Context-Free Grammar) processing
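The stack-based computation model can be illustrated without the module: a recognizer for the classic context-free language aⁿbⁿ needs exactly one stack, which no finite automaton can emulate. This is an independent sketch, not the PDA class:

```python
def accepts_anbn(inp):
    """Stack-based check for the context-free language a^n b^n (n >= 1)."""
    stack, phase = [], "push"
    for ch in inp:
        if phase == "push" and ch == "a":
            stack.append("A")      # one stack symbol per leading 'a'
        elif ch == "b" and stack:
            phase = "pop"          # switch phases on the first 'b'
            stack.pop()            # match each 'b' against one pushed 'A'
        else:
            return False           # out-of-order symbol or empty stack
    return phase == "pop" and not stack

assert accepts_anbn("aaabbb") and not accepts_anbn("aabbb")
```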

Utility Modules

Alphabet Management

This module configures the alphabet.

aria.automata.symautomata.alphabet.createalphabet(alphabetinput: str | None = None) List[str][source]

Creates an alphabet, defaulting to printable ASCII characters when no input is given

The alphabet module provides flexible alphabet creation and management:

from aria.automata.symautomata.alphabet import createalphabet

# Default printable ASCII alphabet
alpha1 = createalphabet()

# Custom range (Unicode code points)
alpha2 = createalphabet("65-91,97-123")  # A-Z, a-z

# From file
alpha3 = createalphabet("my_alphabet.txt")

Regular Expression Processing

This module transforms a pyfst DFA to regular expressions using the Brzozowski Algebraic Method

class aria.automata.symautomata.regex.Regex(input_fst_a, alphabet=None)[source]

Bases: object

This class transforms a pyfst DFA to regular expressions using the Brzozowski Algebraic Method

checkIfBothAreNeeded(inputA, inputB)[source]
checkIfSameStateNotNeeded(state, same)[source]
cleaner(res)[source]
cleanerS(res)[source]
existsPath(initial, finalstate, visited)[source]
findWhichSymbolsFromOtherStatesConnectToMe(state)[source]
fixBrzozowskiAdvanced(found)[source]
fixBrzozowskiBackwardLoopRemoval()[source]
getMaxInternalPath(inkeys)[source]
get_regex()[source]

Generate regular expressions from DFA automaton

partitionAlphabet(alphabetLen, parsedInput)[source]
replaceAlphabet(input)[source]
replaceAlphabetMixed(input1, input2)[source]
star(input_string)[source]

Applies the Kleene star operation.

Parameters:
  • input_string (str) – The string to which the Kleene star is applied

Returns:

The result of applying the Kleene star to the input string

Return type:

str

aria.automata.symautomata.regex.main()[source]

Testing function for the DFA Brzozowski algebraic method operation

The regex module implements the Brzozowski algebraic method for converting DFAs to regular expressions:

from aria.automata.symautomata.regex import Regex

# Convert DFA to regex
converter = Regex(my_dfa)
regex_string = converter.get_regex()

Brzozowski Algorithm

Implements Brzozowski’s algebraic method for regular expression derivation:

from aria.automata.symautomata.brzozowski import Brzozowski

# Apply Brzozowski algorithm
brz = Brzozowski(input_dfa)
regex_map = brz.get_regex()
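For intuition, the algebraic method can also be phrased as state elimination over a generalized NFA, with Arden's rule (X = A·X + B ⟹ X = A*·B) resolving self-loops. The sketch below is an independent illustration over dict-encoded arcs, not the module's Regex or Brzozowski classes:

```python
import re
from itertools import product

def dfa_to_regex(states, start, finals, edges):
    """State elimination: edges[(p, q)] is the regex labelling arc p -> q."""
    R = dict(edges)

    def union(a, b):
        if a is None:
            return b
        if b is None:
            return a
        return f"(?:{a}|{b})"

    # fresh endpoints "S" and "F" so elimination never removes them
    R[("S", start)] = ""
    for f in finals:
        R[(f, "F")] = union(R.get((f, "F")), "")
    for r in list(states):                 # eliminate each original state
        loop = R.get((r, r))
        star = f"(?:{loop})*" if loop else ""   # Arden's rule for self-loops
        ins = [(p, e) for (p, q), e in R.items() if q == r and p != r]
        outs = [(q, e) for (p, q), e in R.items() if p == r and q != r]
        for (p, a), (q, b) in product(ins, outs):
            R[(p, q)] = union(R.get((p, q)), a + star + b)
        R = {k: v for k, v in R.items() if r not in k}
    return R.get(("S", "F"))

# DFA with a self-loop: 0 --a--> 0, 0 --b--> 1, final state 1 (language a*b)
regex = dfa_to_regex([0, 1], 0, {1}, {(0, 0): "a", (0, 1): "b"})
assert re.fullmatch(regex, "aaab")
assert not re.fullmatch(regex, "ba")
```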

Advanced Features

Context-Free Grammar Support

This module generates a string from a CFG

class aria.automata.symautomata.cfggenerator.CFGGenerator(cfgr=None, optimized=1, splitstring=0, maxstate=0)[source]

Bases: object

This class generates a string from a CFG

bfs_queue = None
generate()[source]

Generates a new random string from the start symbol.

Returns:

The generated string

Return type:

str

grammar = None
maxstate = 0
resolved = None
class aria.automata.symautomata.cfggenerator.CNFGenerator(grammar_rules)[source]

Bases: object

Transforms a grammar into Chomsky Normal Form (CNF)

next_rule(left_hand_side, right_hand_side, rule_index)[source]

This module generates a PDA using a CFG as input

class aria.automata.symautomata.cfgpda.CfgPDA(alphabet: List[str] | None = None)[source]

Bases: object

Create a PDA from a CFG

yyparse(cfgfile: str, splitstring: int = 0)[source]
Parameters:
  • cfgfile (str) – The path for the file containing the CFG rules

  • splitstring (int) – Enables (1) or disables (0) the splitting of symbols on spaces

Returns:

The generated PDA

Return type:

PDA

The module includes comprehensive support for context-free grammars:

  • CFG to CNF conversion: Transform context-free grammars to Chomsky Normal Form

  • CFG to PDA conversion: Convert grammars to equivalent pushdown automata

  • String generation: Generate strings from CFG rules
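As a concrete, library-independent illustration of string generation, a breadth-first search over sentential forms yields a shortest derivable string from a grammar given as plain dicts (the function and encoding below are made up for the sketch):

```python
from collections import deque

def shortest_from_cfg(rules, start="S", limit=10000):
    """BFS over sentential forms; returns a shortest derivable string.

    `rules` maps a nonterminal to a list of right-hand sides, each a
    tuple of symbols; symbols absent from `rules` are terminals.
    """
    queue, seen = deque([(start,)]), {(start,)}
    for _ in range(limit):                 # bound the search for safety
        if not queue:
            return None
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            return "".join(form)           # all terminals: a derived string
        for rhs in rules[form[idx]]:       # expand the leftmost nonterminal
            new = form[:idx] + rhs + form[idx + 1:]
            if new not in seen:
                seen.add(new)
                queue.append(new)
    return None

# S -> a S b | a b   generates the language a^n b^n (n >= 1)
rules = {"S": [("a", "S", "b"), ("a", "b")]}
print(shortest_from_cfg(rules))  # "ab"
```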

Flex Integration

Integration with Flex (Fast Lexical Analyzer) for converting lexical specifications to finite automata:

from aria.automata.symautomata.flex2fst import Flexparser

# Parse Flex file to DFA
parser = Flexparser(["a", "b", "c"])
dfa = parser.yyparse("lexer.l")

Implementation Backends

The module provides multiple implementation backends for different use cases:

Pure Python Backend

This module performs all basic DFA operations, without pyfst

class aria.automata.symautomata.pythondfa.DFAArc(srcstate_id, nextstate_id, ilabel=None)[source]

Bases: object

The DFA transition

class aria.automata.symautomata.pythondfa.DFAState(sid=None)[source]

Bases: object

The DFA state

class aria.automata.symautomata.pythondfa.PythonDFA(alphabet=None)[source]

Bases: object

A DFA implementation that exposes the same interface as the pyfst-backed symautomata classes

add_arc(src, dst, char)[source]

Adds a new arc.

Parameters:
  • src (int) – The source state identifier

  • dst (int) – The destination state identifier

  • char (str) – The character for the transition

Returns:

None

add_state()[source]

Adds a new state

complement(alphabet)[source]

Computes the complement of the DFA in place.

Parameters:
  • alphabet (list) – The input alphabet

Returns:

None

consume_input(inp)[source]

Returns True if the machine accepts the input and False if it rejects it.

Parameters:
  • inp (str) – The input string to be consumed

Returns:

True if the DFA accepts the provided input, False otherwise

Return type:

bool

cross_product(dfa_2, accept_method)[source]

A generalized cross-product constructor over two DFAs. The accept_method argument is a binary boolean function f; a state (q1, q2) in the resulting DFA accepts if f(A[q1], A[q2]), where A indicates the acceptance value of a state.

Parameters:
  • dfa_2 – The second DFA

  • accept_method – The binary boolean acceptance function

Returns:

None
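The construction behind cross_product (and thus intersection, union, and symmetric difference) is the standard product automaton. A dict-based sketch, independent of the PythonDFA class:

```python
def cross_product(dfa_a, dfa_b, accept_method):
    """Product construction over two dict-based DFAs.

    Each DFA is (transitions, start, finals), where transitions maps
    (state, symbol) -> state over a shared alphabet; the pair state
    (q1, q2) accepts iff accept_method(q1 accepting, q2 accepting).
    """
    (ta, sa, fa), (tb, sb, fb) = dfa_a, dfa_b
    alphabet = {sym for _, sym in ta}
    trans, finals = {}, set()
    stack, seen = [(sa, sb)], {(sa, sb)}
    while stack:                           # explore reachable pair states
        q1, q2 = stack.pop()
        if accept_method(q1 in fa, q2 in fb):
            finals.add((q1, q2))
        for sym in alphabet:
            nxt = (ta[(q1, sym)], tb[(q2, sym)])
            trans[((q1, q2), sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return trans, (sa, sb), finals

# intersection: strings containing at least one 'a' AND of even length
has_a = ({(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1}, 0, {1})
even = ({(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}, 0, {0})
trans, start, finals = cross_product(has_a, even, lambda x, y: x and y)

def run(trans, start, finals, inp):
    state = start
    for ch in inp:
        state = trans[(state, ch)]
    return state in finals

assert run(trans, start, finals, "ab") and not run(trans, start, finals, "a")
```

Swapping the lambda for `lambda x, y: x or y` gives union, and `lambda x, y: x != y` gives symmetric difference, which is exactly how a single cross-product constructor can back all three operations.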

define()[source]

If DFA is empty, create a sink state

determinize()[source]

Transforms a nondeterministic automaton into a deterministic one.

Returns:

The resulting DFA

Return type:

DFA

Creating an equivalent DFA is done using the standard subset construction. A nice description can be found in the book: Harry R. Lewis and Christos H. Papadimitriou. 1998. Elements of the Theory of Computation.

difference(other)[source]

Performs the difference operation between two automata.

Parameters:
  • other – The other DFA used for the difference operation

Returns:

The resulting DFA

Return type:

DFA

empty()[source]

Return True if the DFA accepts the empty language.

fixminimized(alphabet)[source]

After pyfst minimization, all unused arcs and all sink states are removed; however, this may break compatibility, which this method restores over the given alphabet.

Parameters:
  • alphabet (list) – The input alphabet

Returns:

None

hopcroft()[source]

Performs the Hopcroft minimization algorithm.

Returns:

The minimized input DFA

Return type:

DFA
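Hopcroft's algorithm is an optimized partition refinement. The idea is easier to see in the simpler Moore-style refinement below, which is an illustration of the principle rather than the module's implementation:

```python
def minimize(states, alphabet, delta, finals):
    """Moore-style partition refinement: the idea behind Hopcroft's
    algorithm, without its worklist optimizations."""
    # start from the accepting / non-accepting split
    parts = [p for p in (set(finals), set(states) - set(finals)) if p]
    while True:
        index = {q: i for i, p in enumerate(parts) for q in p}
        refined = []
        for p in parts:
            groups = {}
            for q in p:
                # states stay together iff every symbol leads to the same block
                sig = tuple(index[delta[(q, a)]] for a in alphabet)
                groups.setdefault(sig, set()).add(q)
            refined.extend(groups.values())
        if len(refined) == len(parts):     # fixed point: no block was split
            return refined
        parts = refined

# DFA for a+ over {a}: 0 --a--> 1, 1 --a--> 2, 2 --a--> 2; finals {1, 2}.
# States 1 and 2 are equivalent and end up in the same block.
blocks = minimize([0, 1, 2], ["a"],
                  {(0, "a"): 1, (1, "a"): 2, (2, "a"): 2}, {1, 2})
assert {frozenset(b) for b in blocks} == {frozenset({0}), frozenset({1, 2})}
```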

init_from_acceptor(acceptor)[source]

Initializes the DFA from an existing acceptor.

Parameters:
  • acceptor – The acceptor whose states and arcs are copied

Returns:

None

intersect(other)[source]

Constructs an unminimized DFA recognizing the intersection of the languages of two given DFAs.

Parameters:
  • other – The other DFA used for the intersection operation

Returns:

The resulting DFA

Return type:

DFA

invert()[source]

Inverts the DFA final states

load(txt_fst_file_name)[source]

Loads a transducer from a file in the OpenFST text format. The format is specified as follows:

arc format: src dest ilabel olabel [weight]
final state format: state [weight]

Lines may occur in any order, except that the initial state must appear on the first line.

Parameters:
  • txt_fst_file_name (str) – The input file

Returns:

None

minimize()[source]

Minimizes the DFA using Hopcroft's algorithm

save(txt_fst_file_name)[source]

Saves the machine in the OpenFST text format to the file denoted by txt_fst_file_name.

Parameters:
  • txt_fst_file_name (str) – The output file

Returns:

None

symmetric_difference(other)[source]

Constructs an unminimized DFA recognizing the symmetric difference of the languages of two given DFAs.

Parameters:
  • other – The other DFA used for the symmetric difference operation

Returns:

The resulting DFA

Return type:

DFA

trace_partial_input(inp)[source]

Returns the list of (state, char) pairs traversed while consuming inp

union(other)[source]

Constructs an unminimized DFA recognizing the union of the languages of two given DFAs.

Parameters:
  • other – The other DFA used for the union operation

Returns:

The resulting DFA

Return type:

DFA

class aria.automata.symautomata.pythondfa.Syms[source]

Bases: object

The DFA accepted symbols

find(num)[source]

Finds a symbol based on its identifier.

Parameters:
  • num (int) – The symbol identifier

Returns:

The retrieved symbol

Return type:

str

items()[source]

Returns all stored symbols.

Returns:

The included symbols

Return type:

dict

aria.automata.symautomata.pythondfa.TropicalWeight(param)

Returns the emulated fst TropicalWeight.

Parameters:
  • param (str) – The input

Returns:

The arc weight

Return type:

bool

aria.automata.symautomata.pythondfa.tropical_weight(param)[source]

Returns the emulated fst TropicalWeight.

Parameters:
  • param (str) – The input

Returns:

The arc weight

Return type:

bool

  • Advantages: No external dependencies, full Python compatibility

  • Use cases: Development, testing, educational purposes

  • Features: Complete DFA operations, Hopcroft minimization

OpenFST Backend

  • Advantages: High performance, industry-standard library

  • Requirements: OpenFST library with Python bindings

  • Features: Optimized operations, large-scale automata support

String Analysis

This module retrieves a simple string from a PDA using the state removal method

class aria.automata.symautomata.pdastring.PdaString[source]

Bases: object

Retrieves a string from a PDA

init(states, accepted)[source]

Initialization of the indexing dictionaries

printer()[source]

Visualizes the current state

Advanced string analysis capabilities using state removal algorithms:

  • State removal: Systematic elimination of states to derive regular expressions

  • Path analysis: Trace execution paths through automata

  • String generation: Generate strings accepted by automata
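Path analysis in the trace_partial_input style can be sketched independently of the backends; the dict-based signature below is illustrative, not the module's own:

```python
def trace_partial_input(transitions, start, inp):
    """Return the (state, char) pairs visited while consuming inp,
    stopping early if a transition is missing."""
    state, trace = start, []
    for ch in inp:
        nxt = transitions.get((state, ch))
        if nxt is None:
            break                      # dead end: report the partial path
        trace.append((nxt, ch))
        state = nxt
    return trace

trans = {(0, "a"): 1, (1, "b"): 2}
print(trace_partial_input(trans, 0, "abz"))  # [(1, 'a'), (2, 'b')]
```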

Performance Considerations

Backend Selection

The module automatically selects the best available backend:

  1. pywrapfst: Preferred for performance-critical applications

  2. fst: Alternative FST implementation

  3. pythondfa: Fallback pure Python implementation

For optimal performance:

  • Install OpenFST with Python bindings for large automata

  • Use the pure Python backend for small automata or development

  • Consider alphabet size when choosing predicates vs. explicit transitions

Memory Usage

  • Symbolic representations: Use SFA for large alphabets

  • Minimization: Apply minimization to reduce state count

  • Alphabet optimization: Use custom alphabets to reduce memory footprint

Common Patterns

Pattern Matching

# Create DFA for email validation pattern
email_dfa = DFA()
# ... build email validation automaton

# Test email addresses
valid = email_dfa.consume_input("user@domain.com")

Language Operations

# Language intersection
result_dfa = dfa1.intersect(dfa2)

# Language union
union_dfa = dfa1.union(dfa2)

# Language complement
complement_dfa = dfa1.complement(alphabet)

# Language difference
diff_dfa = dfa1.difference(dfa2)

Regular Expression Conversion

# Convert automaton to regex
from aria.automata.symautomata.regex import Regex

converter = Regex(my_dfa)
regex_pattern = converter.get_regex()

Error Handling

The module includes robust error handling:

  • Import errors: Graceful fallback between backends

  • Invalid operations: Clear error messages for malformed automata

  • Resource limits: Appropriate handling of large automata
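The graceful fallback between backends follows the standard import-fallback pattern. The wiring below is a sketch of that pattern (the backend names match the list above, but the variable names are made up), not the module's actual code:

```python
# Illustrative backend selection via graceful ImportError fallback.
try:
    import pywrapfst as fst_backend      # OpenFST bindings (preferred)
    BACKEND = "pywrapfst"
except ImportError:
    try:
        import fst as fst_backend        # alternative FST library
        BACKEND = "fst"
    except ImportError:
        fst_backend = None               # pure-Python fallback
        BACKEND = "pythondfa"

print("selected backend:", BACKEND)
```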

Troubleshooting

Common Issues

  1. Backend not available: Install required dependencies (OpenFST, pywrapfst)

  2. Memory issues: Use minimization and appropriate alphabet sizes

  3. Performance problems: Consider backend selection and automaton size

Dependencies

  • Required: Python 3.6+

  • Optional: OpenFST library, pywrapfst, networkx (for visualization)

  • Development: pytest, sphinx (for documentation)

API Reference

For detailed API documentation, see the individual module documentation above. The main entry points are:

  • DFA: Main deterministic finite automaton class

  • SFA: Symbolic finite automaton class

  • PDA: Pushdown automaton class

  • createalphabet(): Alphabet creation utility

  • Regex: Regular expression conversion utility