RegexParser is a PHP 8.2+ library that treats regular expressions as code.
Unlike simple wrappers around preg_match, RegexParser implements a complete compiler pipeline (Lexer → Parser → AST) and an Automata-based Logic Solver (AST → NFA → DFA).
This architecture allows for advanced static analysis:
- Linting: Detect redundancy, useless flags, and possible optimizations.
- Safety: Statically detect catastrophic backtracking (ReDoS).
- Logic: Mathematically compare patterns (Intersection, Equivalence, Subset).
Built for learning, validation, and robust tooling in PHP projects.
If you are new to regex, start with the Regex Tutorial. If you want a short overview, see the Quick Start Guide.
# Install the library
composer require yoeunes/regex-parser
# Try the CLI
vendor/bin/regex explain '/\d{4}-\d{2}-\d{2}/'
- Deep Parsing: Parse /pattern/flags into a structured, typed AST.
- Logic Solver: Mathematically compare two regexes using NFA/DFA transformation. Detect route conflicts and validate security subsets.
- ReDoS Analysis: Analyze potential catastrophic backtracking risks based on pattern structure.
- Linter: Clean up legacy code (useless flags, redundant groups) via the CLI.
- Explanation: Explain patterns in plain English.
- Visitor API: A flexible API for building custom regex tooling.
RegexParser separates what it can guarantee from what is heuristic:
- Guaranteed: parsing, AST structure, error offsets, and syntax validation for the targeted PHP/PCRE version.
- Heuristic: ReDoS analysis is structural and conservative; treat it as potential risk unless confirmed.
- Context matters: PCRE version, JIT, and backtrack/recursion limits change practical impact.
If you believe a pattern is exploitable:
- Run confirmed mode and capture a bounded, reproducible PoC.
- Include the pattern, input lengths, timings, JIT setting, and PCRE limits.
- Verify impact in the real code path before filing a security issue.
See SECURITY.md for reporting channels.
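To gather the context listed above, a small script along these lines may help. It is a minimal sketch that uses only built-in PHP functions and constants; the helper names `collectPcreContext()` and `timeBounded()` are hypothetical and are not part of RegexParser. Input lengths are kept small on purpose so the run stays bounded.

```php
<?php
// Hypothetical helper (not part of RegexParser): collects the PCRE
// settings a ReDoS report should mention.
function collectPcreContext(): array
{
    return [
        'php_version'     => PHP_VERSION,
        'pcre_version'    => PCRE_VERSION,
        'jit'             => ini_get('pcre.jit'), // "1", "0", or false if unavailable
        'backtrack_limit' => ini_get('pcre.backtrack_limit'),
        'recursion_limit' => ini_get('pcre.recursion_limit'),
    ];
}

// Hypothetical helper: times one pattern against inputs of bounded length.
function timeBounded(string $pattern, array $lengths): array
{
    $rows = [];
    foreach ($lengths as $n) {
        // Near-miss input: a run of "a" followed by a character that cannot match.
        $subject = str_repeat('a', $n) . '!';
        $start = hrtime(true);
        $result = preg_match($pattern, $subject);
        $elapsedMs = (hrtime(true) - $start) / 1e6;
        $rows[] = [
            'length' => strlen($subject),
            'ms'     => $elapsedMs,
            'result' => $result,               // 1, 0, or false on error
            'error'  => preg_last_error_msg(), // e.g. a backtrack-limit message
        ];
    }
    return $rows;
}

print_r(collectPcreContext());
print_r(timeBounded('/(a+)+$/', [10, 14, 18]));
```

Timings vary by machine, PCRE version, and JIT setting, so report them together with the collected context instead of on their own.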
These techniques reduce backtracking but can change matching behavior. Always validate with tests.
/(a+)+$/ -> /a+$/ (semantics often preserved, but verify captures)
/(a+)+$/ -> /a++$/ (possessive, no backtracking)
/(a|aa)+/ -> /a+/ (only if alternation is redundant)
/(a|aa)+/ -> /(?>a|aa)+/ (atomic, avoids backtracking)
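One way to validate a rewrite is to run both patterns over the same inputs and report where they disagree. The sketch below uses plain `preg_match`; `diffPatterns()` is a hypothetical helper, not a RegexParser API, and the sample inputs are illustrative only. It separates differences in the overall match from differences that affect only capture groups.

```php
<?php
// Hypothetical test helper: reports inputs on which two patterns disagree.
function diffPatterns(string $original, string $rewritten, array $subjects): array
{
    $diffs = [];
    foreach ($subjects as $subject) {
        $r1 = preg_match($original, $subject, $m1);
        $r2 = preg_match($rewritten, $subject, $m2);
        if ($r1 !== $r2 || ($m1[0] ?? null) !== ($m2[0] ?? null)) {
            // The patterns disagree on whether or what they match.
            $diffs[] = ['subject' => $subject, 'kind' => 'match'];
        } elseif ($m1 !== $m2) {
            // Same overall match, but the capture groups differ.
            $diffs[] = ['subject' => $subject, 'kind' => 'captures'];
        }
    }
    return $diffs;
}

$subjects = ['', 'a', 'aaa', 'baa', 'aab'];

// The capturing group in the original has no counterpart in the rewrite.
print_r(diffPatterns('/(a+)+$/', '/a++$/', $subjects));

// With a non-capturing group, only the overall match is compared.
print_r(diffPatterns('/(?:a+)+$/', '/a++$/', $subjects));
```

A handful of inputs cannot prove equivalence; treat an empty result as a sanity check and keep the real test suite as the authority.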
- Regex::parse() splits the literal into pattern and flags.
- The lexer produces a token stream.
- The parser builds an AST (RegexNode).
- Visitors walk the AST to validate, explain, analyze, or transform.
For the full architecture, see docs/ARCHITECTURE.md.
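To make the first step concrete, here is a simplified sketch of splitting a literal into pattern and flags. It is illustrative only: `splitLiteral()` is a hypothetical function, and RegexParser's actual implementation may differ (for example in validation and error reporting).

```php
<?php
// Illustrative only: a simplified take on splitting "/pattern/flags".
// Not RegexParser's implementation.
function splitLiteral(string $literal): array
{
    if ($literal === '') {
        throw new InvalidArgumentException('Empty regex literal.');
    }

    $open = $literal[0];
    // PCRE allows bracket-style delimiters whose closing character differs.
    $pairs = ['(' => ')', '[' => ']', '{' => '}', '<' => '>'];
    $close = $pairs[$open] ?? $open;

    // Flags are letters and delimiters are never alphanumeric, so the last
    // occurrence of the closing delimiter marks the end of the pattern.
    $end = strrpos($literal, $close, 1);
    if ($end === false) {
        throw new InvalidArgumentException('Missing closing delimiter.');
    }

    return [
        'pattern' => substr($literal, 1, $end - 1),
        'flags'   => substr($literal, $end + 1),
    ];
}

var_dump(splitLiteral('/^hello world$/i'));
```

The lexer would then consume only the `pattern` part, while the flags influence how tokens are interpreted.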
# Parse and validate a pattern
vendor/bin/regex parse '/^hello world$/'
# Get plain English explanation
vendor/bin/regex explain '/\d{4}-\d{2}-\d{2}/'
# Check for potential ReDoS risk (theoretical by default)
vendor/bin/regex analyze '/(a+)+$/'
# Colorize pattern for better readability
vendor/bin/regex highlight '/\d+/'
# Lint your entire codebase
vendor/bin/regex lint src/
use RegexParser\Regex;
use RegexParser\ReDoS\ReDoSMode;
$regex = Regex::create([
'runtime_pcre_validation' => true,
]);
// Parse a pattern into AST
$ast = $regex->parse('/^hello world$/i');
// Validate pattern safety
$result = $regex->validate('/(?<=test)foo/');
if (!$result->isValid()) {
echo $result->getErrorMessage();
}
// Check for ReDoS risk (theoretical by default)
$analysis = $regex->redos('/(a+)+$/');
echo $analysis->severity->value; // 'critical', 'safe', etc.
// Optional: attempt bounded confirmation
$confirmed = $regex->redos('/(a+)+$/', mode: ReDoSMode::CONFIRMED);
echo $confirmed->isConfirmed() ? 'confirmed' : 'theoretical';
// Get human-readable explanation
echo $regex->explain('/\d{4}-\d{2}-\d{2}/');
RegexParser integrates with common PHP tooling:
- Symfony bundle: docs/guides/cli.md
- PHPStan: vendor/yoeunes/regex-parser/extension.neon
- GitHub Actions: vendor/bin/regex lint in your CI pipeline
RegexParser ships lightweight benchmark scripts in benchmarks/ to track parser, compiler, and formatter throughput.
- Run formatter benchmarks: php benchmarks/benchmark_formatters.php
- Run all benchmarks: for file in benchmarks/benchmark_*.php; do echo "Running $file"; php "$file"; echo; done
Contributions are welcome! See CONTRIBUTING.md to get started.
# Set up development environment
composer install
# Run tests
composer phpunit
# Check code style
composer phpcs
# Run static analysis
composer phpstan
Released under the MIT License.
If you run into issues or have questions, please open an issue on GitHub: https://github.com/yoeunes/regex-parser/issues.
