Regular Expression Denial of Service Overview
Regular Expression Denial of Service Overview
Regular Expression Denial of Service (ReDoS) is a vulnerability that occurs when a regular expression contains patterns that can cause exponential or catastrophic backtracking. When these regular expressions process maliciously crafted input, they can consume excessive CPU resources, potentially leading to denial of service.The root cause of ReDoS is the backtracking behavior in regex engines. When a regex engine attempts to match a pattern but encounters a situation where multiple paths could be taken, it may need to try all possible paths before determining that a match is impossible. With carefully crafted input, this can lead to exponential time complexity.Preventing ReDoS requires careful design of regular expressions, avoiding vulnerable patterns, and implementing proper timeouts and input validation.
Catastrophic Backtracking
Catastrophic Backtracking
- Nested repetition quantifiers (e.g.,
(a+)+
) - Overlapping character classes with repetition
- Multiple optional groups with repetition
- Alternation with shared prefixes or suffixes
- Greedy quantifiers followed by similar patterns
- Avoid nested repetition quantifiers
- Use atomic groups or possessive quantifiers when available
- Limit the use of backtracking by using non-capturing groups
- Implement input length validation before regex processing
- Consider using regex engines with linear time guarantees
Vulnerable Regex Patterns
Vulnerable Regex Patterns
- Avoid nested repetition like
(a+)+
or(a*b*)*
- Be cautious with alternation that has common prefixes like
(ab|abc)+
- Avoid backreferences in combination with repetition
- Use atomic groups
(?>...)
when available (in languages that support them) - Use possessive quantifiers
a++
when available (in languages that support them) - Consider using character classes
[a]
instead of single characters with quantifiers - Implement proper input validation before applying regex
Timeout Mechanisms
Timeout Mechanisms
- Use asynchronous processing or separate threads for regex operations
- Set appropriate timeout values based on expected regex complexity
- Implement proper error handling for timeouts
- Consider using worker threads or child processes for isolation
- Fail closed (reject input) when timeouts occur
- Log timeout incidents for monitoring and alerting
- Consider implementing circuit breakers for regex operations
- Test timeout mechanisms with known problematic inputs
Alternative Regex Engines
Alternative Regex Engines
- Google’s RE2: Guarantees linear time complexity (available for multiple languages)
- re2j: Java port of RE2
- node-re2: Node.js binding for RE2
- Rust’s regex crate: Guarantees linear time complexity
- They may not support all features of traditional regex engines (e.g., backreferences)
- There may be syntax differences or compatibility issues
- Performance characteristics may differ for non-pathological cases
- Implementation and integration may require additional dependencies
- Testing is essential to ensure equivalent functionality
Input Validation Strategies
Input Validation Strategies
- Validate input type and structure before applying regex
- Implement input length limits appropriate to the expected data
- Use simpler, faster checks before complex regex patterns
- Consider breaking complex regex patterns into multiple simpler patterns
- Validate against a whitelist of allowed values when possible
- Implement proper error handling for invalid input
- Log validation failures for monitoring and alerting
- Consider using input normalization before validation
Testing for ReDoS Vulnerabilities
Testing for ReDoS Vulnerabilities
- Test regex patterns with inputs of increasing length
- Measure execution time to detect exponential behavior
- Use automated tools designed to detect vulnerable patterns
- Implement regex testing as part of the CI/CD pipeline
- Use fuzzing techniques to generate potentially problematic inputs
- Test with known attack patterns for common regex vulnerabilities
- Consider using static analysis tools that can identify vulnerable patterns
- Benchmark regex performance against expected time complexity
Static Analysis for ReDoS
Static Analysis for ReDoS
- safe-regex (JavaScript): Identifies potentially vulnerable patterns
- vuln-regex-detector (JavaScript): Analyzes regex vulnerability with attack string generation
- rexploitable (Python): Analyzes regex patterns for vulnerabilities
- RegexStaticAnalysis (Java): Static analysis for Java regex patterns
- RXXR2: Research tool for regex vulnerability analysis
- Early detection of vulnerable patterns
- Integration into development workflows and CI/CD pipelines
- Automated scanning of codebases
- Education of developers about vulnerable patterns
- Reduced risk of deploying vulnerable code
ReDoS Prevention Checklist
ReDoS Prevention Checklist
- Design regex patterns carefully to avoid vulnerable constructs
- Implement proper input validation before applying regex
- Use timeout mechanisms to limit resource consumption
- Consider alternative regex engines with linear time guarantees
- Test regex patterns for performance issues
- Use static analysis tools to identify vulnerable patterns
- Monitor regex execution in production
- Keep regex-related libraries and tools updated