perf(js): optimize discover_tests from O(N×M) to O(N+M) #1977
Fix test discovery performance bottleneck that caused indefinite hangs on large codebases.
## Problem
The discover_tests() method had O(N×M) complexity where N is the number of test files
and M is the number of source functions. For large repos (e.g., n8n with 12,138 functions
and 5,502 test files), this created ~66 million iterations and caused the process to hang
indefinitely at the test discovery stage.
## Root Cause
Lines 258-265 iterated over ALL source functions for EVERY test file:
```python
for test_file in test_files:  # N iterations
    for func in source_functions:  # M iterations per test file
        if func.function_name in imported_names or func.function_name in source:
            ...  # map test to function
```
Additionally, the `func.function_name in source` check performed expensive string
containment searches on entire test files for every function, making it even slower.
## Solution
Rewrote the algorithm to build a reverse index first, reducing complexity to O(N+M):
1. Build a `function_name → qualified_name` dict once (O(M))
2. For each test file, check only its imported names against the index (O(N))
This reduces iterations from ~66 million to ~17,640 for n8n.
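The two steps above can be sketched as follows. This is a minimal illustration, not the code in `support.py`: the `SourceFunction` dataclass, the `build_index` and `map_tests` helpers, and the shape of `imports_by_test` (test file path mapped to its set of imported names) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceFunction:
    # Hypothetical stand-in for the real function record.
    function_name: str
    qualified_name: str


def build_index(source_functions):
    """Step 1: one pass over the M source functions."""
    return {f.function_name: f.qualified_name for f in source_functions}


def map_tests(imports_by_test, index):
    """Step 2: one pass over the N test files, using dict lookups only."""
    result = {}
    for test_file, imported_names in imports_by_test.items():
        for name in imported_names:
            qualified = index.get(name)
            if qualified is not None:
                result.setdefault(qualified, []).append(test_file)
    return result


# Made-up sample data.
functions = [
    SourceFunction("parseUrl", "utils.parseUrl"),
    SourceFunction("formatDate", "utils.formatDate"),
]
imports_by_test = {
    "utils.test.ts": {"parseUrl", "describe"},
    "date.test.ts": {"formatDate"},
}
mapping = map_tests(imports_by_test, build_index(functions))
```

Strictly speaking, step 2 costs one lookup per imported name, so the total work is O(M + total imports). That matches O(N+M) when the number of imports per test file is bounded by a small constant.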
## Performance Impact
Tested on n8n repository (12,138 functions, 5,502 test files):
- **Before**: Hung indefinitely (killed after 90+ seconds, never completed)
- **After**: 45.2 seconds total
- **Improvement**: ~3,700x fewer loop iterations (~66 million down to ~17,640)
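The iteration counts quoted in this description can be checked from the repository sizes alone. The snippet below is only that arithmetic, with illustrative variable names; it counts loop iterations and says nothing about wall-clock time.

```python
functions = 12_138   # M, source functions in n8n
test_files = 5_502   # N, test files in n8n

nested = functions * test_files   # iterations of the old nested loop
linear = functions + test_files   # index build plus one pass over test files
ratio = nested / linear

print(f"{nested:,} -> {linear:,} ({ratio:,.0f}x fewer iterations)")
```

The exact ratio comes out slightly above the rounded 3,700x figure, because that figure uses the rounded "~66 million" count.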
Also removed the fallback `func.function_name in source` check as it was:
- Extremely expensive (substring search in entire file)
- Prone to false positives (matches in comments/strings)
- Unnecessary (functions must be imported to be used)
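To illustrate the false-positive point, the sketch below compares the removed substring check with an import-based check on a small made-up test file. The file contents, the function name `parse`, and the regex-based `imported_names` helper are hypothetical. The regex handles only simple named imports (no `as` aliases or default imports) and is not meant to mirror the real import analysis.

```python
import re

# Made-up test file: "parse" appears only inside a comment.
test_source = '''
import { formatDate } from "./utils";

// TODO: parse is not covered here yet
test("formats", () => {
  expect(formatDate(0)).toBe("1970-01-01");
});
'''


def imported_names(source):
    """Collect names from simple `import { a, b } from "..."` statements."""
    names = set()
    for group in re.findall(r"import\s*\{([^}]*)\}\s*from", source):
        names.update(part.strip() for part in group.split(",") if part.strip())
    return names


# Substring check: matches the word inside the comment.
substring_hit = "parse" in test_source

# Import-based check: "parse" was never imported, so no match.
import_hit = "parse" in imported_names(test_source)
```

Here the substring check would map the test file to `parse` even though the test never calls it, while the import-based check would not.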
## Testing
- Verified on n8n repo: discovers 149,378 tests in 45s (previously hung)
- Verified on smaller repos: still works correctly with negligible overhead
Fixes performance issue where Codeflash would appear to hang after function discovery
when run with --all on large JavaScript/TypeScript monorepos.
Claude finished @mohammedahmed18's task in 1m 46s
PR Review Summary
Prek Checks: ✅ All checks passed
Code Review: The algorithmic fix is correct and the performance improvement is genuine. One correctness edge case to be aware of: Potential name collision when multiple functions share the same
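The collision mentioned in the review can be reproduced with a plain dict, assuming the truncated sentence refers to functions that share the same name. A one-to-one `function_name → qualified_name` mapping keeps only the last entry for a repeated name. The sketch below shows that, along with one possible workaround that keeps a list of qualified names per function name. The sample names and the list-based index are illustrative assumptions, not the change made in this PR.

```python
from collections import defaultdict

# (function_name, qualified_name) pairs; the names are made up.
functions = [
    ("validate", "UserService.validate"),
    ("validate", "OrderService.validate"),
    ("format", "utils.format"),
]

# A one-to-one dict silently keeps only the last entry for "validate".
single = {name: qualified for name, qualified in functions}

# A one-to-many index keeps every candidate that shares a name.
multi = defaultdict(list)
for name, qualified in functions:
    multi[name].append(qualified)
```

With the one-to-many index, a test importing `validate` could be mapped to both candidates. Whether that is the right behaviour depends on how the imports are resolved, which this sketch does not attempt.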
Co-authored-by: mohammed ahmed <undefined@users.noreply.github.com>
…discover-tests-performance
## Problem
Test discovery was hanging indefinitely on large JavaScript/TypeScript codebases when using `codeflash --all`.
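## Symptom
## Root Cause
The `discover_tests()` method in `JavaScriptSupport` had O(N×M) algorithmic complexity. For large repos this was compounded by an expensive substring search over each test file for every function (`func.function_name in source`).
## Solution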
Rewrote the algorithm to O(N+M) complexity:
1. Build a `function_name → qualified_name` index once (O(M))
2. For each test file, check only its imported names against the index (O(N))
This reduces the iteration count from ~66 million to ~17,640 for n8n (3,700x improvement).
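Also removed the `func.function_name in source` fallback check because it was extremely expensive, prone to false positives (matches in comments and strings), and unnecessary (functions must be imported to be used).
## Performance Results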
Tested on the n8n repository (12,138 functions, 5,502 test files): before the change, discovery hung indefinitely; after it, the run took 45.2 seconds in total.
## Testing
## Files Changed
- `codeflash/languages/javascript/support.py` - `discover_tests()` method
## Related Issues
Fixes the issue where running `codeflash --all` on large JavaScript/TypeScript projects would appear to hang after function discovery, making the CLI unusable on real-world monorepos.