Regular Expression Visualizer: Watch NFA Pattern Matching Step by Step

Simulator · Beginner · ~10 min

The regular expression (a|b)*abb compiles to a 5-state NFA that matches strings ending in 'abb' over the alphabet {a, b}.

Formula

L(r₁|r₂) = L(r₁) ∪ L(r₂)
L(r*) = {ε} ∪ L(r) ∪ L(r)·L(r) ∪ ...
Thompson NFA: |Q| ≤ 2·|pattern|

Patterns as Machines

Regular expressions are the programmer's shorthand for describing text patterns, used billions of times daily in search, validation, and text processing. But behind every regex is a finite automaton. Ken Thompson's 1968 algorithm compiles a regex into a non-deterministic finite automaton (NFA) using simple structural rules: concatenation chains states, alternation forks paths, and the Kleene star creates loops. This simulation makes that hidden machine visible.

Thompson's Construction

Thompson's construction is elegant in its economy: each regex operator maps to a tiny NFA fragment with at most two new states and four transitions. Concatenation connects fragments end-to-start. Alternation creates a new start state with epsilon transitions to both sub-patterns. The star operator loops the fragment back to its start. The total NFA never exceeds twice the pattern length in states, guaranteeing efficient construction.
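The rules above can be sketched in a few lines. This is an illustrative toy implementation, not the visualizer's own code: the `Fragment` class, state numbering, and edge representation are assumed conventions chosen for clarity.

```python
# A minimal sketch of Thompson's construction for literals, concatenation,
# alternation, and Kleene star. Each fragment has exactly one start and
# one accept state; operators add at most two new states.

EPS = None  # label used for epsilon transitions

class Fragment:
    """An NFA fragment: one start state, one accept state, edge list."""
    def __init__(self, start, accept, edges):
        self.start, self.accept, self.edges = start, accept, edges

_counter = 0
def new_state():
    global _counter
    _counter += 1
    return _counter

def literal(ch):
    s, t = new_state(), new_state()
    return Fragment(s, t, [(s, ch, t)])

def concat(f1, f2):
    # Chain fragments end-to-start with an epsilon edge.
    return Fragment(f1.start, f2.accept,
                    f1.edges + f2.edges + [(f1.accept, EPS, f2.start)])

def alternate(f1, f2):
    # New start forks to both branches; both accepts join a new accept.
    s, t = new_state(), new_state()
    return Fragment(s, t, f1.edges + f2.edges + [
        (s, EPS, f1.start), (s, EPS, f2.start),
        (f1.accept, EPS, t), (f2.accept, EPS, t)])

def star(f):
    # Loop the fragment back to its start, and allow skipping it entirely.
    s, t = new_state(), new_state()
    return Fragment(s, t, f.edges + [
        (s, EPS, f.start), (f.accept, EPS, f.start),
        (s, EPS, t), (f.accept, EPS, t)])

# Build (a|b)*abb from the primitives.
nfa = concat(concat(concat(star(alternate(literal('a'), literal('b'))),
                           literal('a')), literal('b')), literal('b'))
```

Counting the distinct states in the result confirms the size bound from the formula box: the constructed NFA stays within 2·|pattern| states.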

NFA Simulation

Rather than converting the NFA to a DFA (which can explode exponentially in states), Thompson's algorithm simulates the NFA directly by maintaining a set of currently active states. For each input character, it computes the next set of active states by following all valid transitions. If any active state after the last character is an accept state, the string matches. This approach guarantees O(n·m) time for input length n and pattern length m — linear in the input for a fixed pattern.
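The active-state simulation can be shown with a small hand-written NFA. The states and transitions below are an assumed textbook-style machine for (a|b)*abb (state 0 loops on any character while nondeterministically guessing where the trailing "abb" begins), not output captured from the visualizer.

```python
# Simulate an NFA directly by tracking the set of active states.
# delta maps (state, character) to the set of possible next states.
delta = {
    (0, 'a'): {0, 1},   # stay in the loop, or guess "abb" starts here
    (0, 'b'): {0},
    (1, 'b'): {2},
    (2, 'b'): {3},
}
start, accepting = 0, {3}

def matches(s):
    active = {start}                       # currently active states
    for ch in s:
        # Follow every valid transition out of every active state.
        active = {q2 for q in active for q2 in delta.get((q, ch), set())}
    return bool(active & accepting)        # match if any accept state is active

print(matches("ababb"))  # True  — ends in "abb"
print(matches("abab"))   # False — wrong suffix
```

Each input character touches at most all m states once, which is where the O(n·m) bound comes from: no path is ever retried.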

The Backtracking Trap

Most modern regex engines (Perl, Python, JavaScript) use backtracking instead of Thompson's algorithm, because backtracking supports extensions like backreferences. But backtracking can take exponential time on pathological patterns — the so-called catastrophic backtracking or ReDoS vulnerability. Google's RE2 engine returns to Thompson's approach, proving that the 1968 algorithm remains the gold standard for guaranteed-efficient matching.

FAQ

What is a regular expression?

A regular expression is a formal notation for describing patterns in strings using concatenation, alternation (|), and Kleene star (*). Formally, they denote exactly the regular languages — the same class recognized by finite automata. In practice, programming language regex engines add extensions (backreferences, lookahead) that go beyond regular languages.

How does a regular expression become an automaton?

Thompson's construction (1968) converts any regular expression to an NFA with at most 2n states, where n is the pattern length. Each regex operator maps to a simple NFA fragment: concatenation chains fragments, alternation creates a branch, and Kleene star adds a loop. The resulting NFA can be simulated directly or converted to a DFA.

What is the difference between NFA and DFA regex matching?

NFA-based matching (Thompson/Pike) tracks a set of active states simultaneously, running in O(nm) time for pattern length m and string length n. DFA-based matching (like RE2) precompiles to a DFA for O(n) matching but may take exponential time/space to build. Backtracking engines (Perl, Python) can take exponential time on pathological patterns.

Why can regular expressions cause performance problems?

Backtracking regex engines try paths one at a time and backtrack on failure. Patterns with nested quantifiers like (a+)+ can cause catastrophic backtracking — exponential time on non-matching inputs. This is a real security concern (ReDoS attacks). NFA-based engines like RE2 avoid this by design, maintaining the O(nm) guarantee.

Embed

<iframe src="https://homo-deus.com/lab/automata-theory/regular-expressions/embed" width="100%" height="400" frameborder="0"></iframe>
View source on GitHub