perf: optimize context extraction with caching and import pre-filtering by KRRT7 · Pull Request #1921 · codeflash-ai/codeflash

KRRT7 · 2026-03-27T20:53:41Z

Summary

Builds on top of #1918 with 4 additional optimizations to the context extraction pipeline, achieving ~42% further speedup (median 2.93s → 1.71s).

Cache Jedi refs_by_parent across calls — both get_function_sources_from_jedi calls operate on the same file; caching the expensive get_names() + parent grouping eliminates redundant Jedi work
Reuse RO result for TESTGEN — when pruned code is identical between READ_ONLY and TESTGEN contexts, skip the duplicate add_needed_imports_from_module call
Pre-filter imports by AST-collected referenced names — fast ast.walk collects names actually used in the destination code, so only relevant imports are added (reducing the module size RemoveImportsVisitor has to analyze)
Skip RemoveImportsVisitor when dst has no pre-existing imports — after pruning, most destination code has no imports; since we already pre-filter additions, scope analysis is unnecessary

Test plan

All 2919 existing tests pass (verified locally)
Benchmark: tests/benchmarks/test_benchmark_code_extract_code_context.py::test_benchmark_extract

claude · 2026-03-27T20:54:16Z

Claude finished @KRRT7's task in 5m 21s —— View job

PR Review Summary

Triage PR size — LARGE (189 lines production code across 3 files)
Lint and typecheck — prek passed; mypy errors in support.py are pre-existing
Resolve stale threads — none found
Review code
Duplicate detection
Test coverage analysis
Post summary

Prek Checks

uv run prek run --from-ref origin/cf-cpu-context-extraction passed with no issues. No commits needed.

Code Review

No runtime bugs or breaking changes found. A few observations:

Jedi `refs_by_parent` cache (`code_context_extractor.py`)

The return type of get_function_sources_from_jedi changes to a 3-tuple. All three call sites are correctly updated (two in code_context_extractor.py, one in support.py:351). The cache is scoped to a single get_code_optimization_context call (same Jedi project), so no stale-cache risk. Converting defaultdict → dict before storing is correct since callers only use .get(). ✓

RO/TESTGEN reuse (`code_context_extractor.py`)

The reuse condition fto_ro_code_result is not None and fto_testgen_pruned_code_str == fto_ro_pruned_code_str is correct. When pruned code strings are identical, the add_needed_imports_from_module output must also be identical (same source module, same dst, same helpers). ✓

Import pre-filtering (`code_extractor.py`)

The _collect_dst_referenced_names function is comprehensive: it handles direct Name nodes, base names of Attribute chains, string annotations (via ast.parse on Constant string nodes), and flags has_imports. Edge cases checked:

from __future__ import annotations (PEP 563): string annotations are parsed by the ast.Constant handler ✓
import os.path → bound name os captured via ast.Name walker ✓
__future__ imports unconditionally added regardless of name check ✓

Design observation — The pre-filter is name-presence-based, not scope-based. There's a corner case where a name appearing in the destination refers to a local variable/parameter rather than the imported name (e.g., def foo(collections): return collections when the source has import collections). In the old code, RemoveImportsVisitor would detect this and remove the spurious import. With dst_has_imports=False, the import stays. This is a false-positive direction (extra import in context), not a false-negative (missed needed import), so it's harmless for correctness of the optimization pipeline. Worth noting as a known trade-off.

Fix this →

Minor dead code — At code_extractor.py:705, parsed_dst_module is always a cst.Module at that point, so else dst_code_fallback is unreachable. Not a bug.

Skip `RemoveImportsVisitor` (`code_extractor.py`)

Correct: when dst_has_imports=False, the only imports in the transformed module came from AddImportsVisitor, which pre-filtered against dst_referenced_names. So scope analysis by RemoveImportsVisitor adds no value. ✓

Duplicate Detection

collect_referenced_names (line 100, libcst-based, names only) and the new _collect_dst_referenced_names (ast-based, includes has_imports) serve distinct purposes. No duplicates.

Test Coverage

The new _collect_dst_referenced_names function has no direct unit tests. It's exercised indirectly through the extensive test_add_needed_imports_from_module.py test suite, which should catch regressions in the pre-filtering logic. The Jedi cache path also has no dedicated tests.

Given the 1.69x total speedup with all 2,919 existing tests passing, the correctness case looks solid. Consider adding a direct unit test for _collect_dst_referenced_names covering the string annotation and has_imports detection paths.

Last updated: 2026-03-27

…ng (~42% faster) Cache Jedi refs_by_parent across calls, reuse RO results for identical TESTGEN pruning, pre-filter imports by AST-collected referenced names, and skip RemoveImportsVisitor when dst has no pre-existing imports.

KRRT7 · 2026-03-27T21:22:31Z

Benchmark Results: Context Extraction Pipeline

End-to-end benchmark comparison using codeflash's --codeflash-trace system against test_benchmark_extract (which runs get_code_optimization_context on codeflash's own function_optimizer.py).

Three branches compared:

main — baseline before any changes
PR perf: optimize context extraction pipeline (~2x speedup) #1920 (cf-cpu-context-extraction) — MetadataWrapper removal, pre-computed base_defs, _has_aliased_future_imports fast-path
PR perf: optimize context extraction with caching and import pre-filtering #1921 (cf-deep-context-extraction) — Jedi refs_by_parent cache, RO/TESTGEN reuse, import pre-filtering, RemoveImportsVisitor skip

Key results

Function	main	PR #1921	Speedup
`collect_top_level_defs_with_dependencies`	3,086 ms	577 ms	5.35x
`add_needed_imports_from_module`	2,316 ms	677 ms	3.42x
`get_function_sources_from_jedi`	2,043 ms	1,392 ms	1.47x
`remove_unused_definitions_by_function_names`	1,139 ms	4 ms	282x
`gather_source_imports`	752 ms	159 ms	4.73x
Total benchmark	19,925 ms	11,767 ms	1.69x

Profiled on Python 3.13.7, macOS, with @codeflash_trace instrumentation on 8 functions across 3 files.

codeflash-ai · 2026-03-27T21:35:31Z

codeflash/languages/python/static_analysis/code_extractor.py

+    for node in ast.walk(tree):
+        if isinstance(node, ast.Name):
+            names.add(node.id)
+        elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
+            names.add(node.value.id)
+        elif isinstance(node, (ast.Import, ast.ImportFrom)):
+            has_imports = True
+        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
+            try:
+                inner = ast.parse(node.value, mode="eval")
+                for inner_node in ast.walk(inner):
+                    if isinstance(inner_node, ast.Name):
+                        names.add(inner_node.id)
+            except SyntaxError:
+                pass


⚡️Codeflash found 34% (0.34x) speedup for _collect_dst_referenced_names in codeflash/languages/python/static_analysis/code_extractor.py

⏱️ Runtime : 47.4 milliseconds → 35.5 milliseconds (best of 5 runs)

⚡️ This change will improve the performance of the following benchmarks:

Benchmark File :: Function Original Runtime Expected New Runtime Speedup

tests.benchmarks.test_benchmark_code_extract_code_context::test_benchmark_extract 8.34 seconds 8.34 seconds 0.01%

🔻 This change will degrade the performance of the following benchmarks:

{benchmark_info_degraded}

📝 Explanation and details

The function replaced ast.walk (a generator that yields every node) with an explicit stack-based traversal that skips subtrees early when type checks fail and avoids redundant isinstance calls by pre-binding type constants (NameType, AttributeType, etc.). It also introduced a local cache for parsed string annotations (_str_parse_cache) to avoid reparsing identical type-hint strings—profiler data shows this eliminated repeated ast.parse overhead inside the constant-string handling branch. Manual field iteration via _fields and getattr replaces the generator's internal iteration, cutting per-node dispatch cost from ~2961 ns to ~129 ns in the hot loop (line profiler confirms the main traversal loop dropped from 56.5% to 2.8% of runtime). The 33% speedup justifies the slightly longer code.

✅ Correctness verification report:

Test Status

⚙️ Existing Unit Tests 🔘 None Found

🌀 Generated Regression Tests ✅ 65 Passed

⏪ Replay Tests ✅ 6 Passed

🔎 Concolic Coverage Tests 🔘 None Found

📊 Tests Coverage 100.0%

🌀 Click to see Generated Regression Tests

import pytest # used for our unit tests from codeflash.languages.python.static_analysis.code_extractor import _collect_dst_referenced_names def test_basic_empty_and_syntax_error(): # An empty module should parse and yield no names and no imports. names, has_imports = _collect_dst_referenced_names("") # 27.2μs -> 18.9μs (43.9% faster) assert names == set() # 28.8μs -> 29.3μs (1.55% slower) assert has_imports is False # no imports present # A syntactically invalid module should return (set(), False) per function design. names, has_imports = _collect_dst_referenced_names("def incomplete(") assert names == set() # function catches SyntaxError and returns empty set assert has_imports is False # and indicates no imports def test_collect_names_and_attributes(): # Basic Name nodes: variable references and assignments should be collected. code = """ a = b + c d = a + 1 """ names, has_imports = _collect_dst_referenced_names(code) # 56.9μs -> 41.3μs (37.7% faster) # 'a', 'b', 'c', 'd' appear as Name nodes in the AST and should be collected. assert {"a", "b", "c", "d"}.issubset(names) assert has_imports is False # 35.8μs -> 29.4μs (21.8% faster) # Attribute access should capture only the base Name (the left-most Name value). code2 = """ x.y obj.attr (p).q """ names2, _ = _collect_dst_referenced_names(code2) # Base names 'x', 'obj', 'p' should be included. assert {"x", "obj", "p"}.issubset(names2) # Attribute names 'y' and 'attr' and 'q' should not be included as Name nodes. assert "y" not in names2 assert "attr" not in names2 assert "q" not in names2 def test_strings_parsed_as_code_and_invalid_inner_are_ignored(): # String constants that contain valid Python expressions should be parsed and names extracted. # Example: a constant containing "List[int]" will parse as a Subscript with Name nodes. code = 'const1 = "List[int]"\nconst2 = "MyType"\n' names, _ = _collect_dst_referenced_names(code) # 62.1μs -> 62.8μs (1.12% slower) # 'List', 'int', and 'MyType' should be discovered from the inner parses. assert "List" in names assert "int" in names # 23.0μs -> 21.1μs (8.91% faster) assert "MyType" in names # If the string constant contains invalid Python for eval mode, it should be ignored (no exception). code_with_invalid = 'bad = "a =" \n' # "a =" is not a valid expression -> inner SyntaxError and ignored names_bad, _ = _collect_dst_referenced_names(code_with_invalid) # Nothing valid should be collected from the invalid inner string. assert "a" not in names_bad def test_imports_flag_and_imported_names_not_collected_from_import_stmt(): # Import statements should flip the has_imports flag. code = """ import os, json from sys import path as sys_path # but do not reference those module names anywhere else """ names, has_imports = _collect_dst_referenced_names(code) # 35.7μs -> 29.0μs (23.1% faster) # has_imports must be True when imports are present. assert has_imports is True # The mere presence of import statements should not cause the module names to be added as Names # unless they appear elsewhere as Name nodes. So 'os' and 'sys' shouldn't be in the result. assert "os" not in names assert "sys" not in names assert "json" not in names assert "path" not in names def test_attribute_chains_add_base_name_even_when_nested(): # For chained attributes like a.b.c the AST contains nested Attribute nodes; # the inner Attribute has value=Name('a') so 'a' will be collected. code = "a.b.c\n" names, _ = _collect_dst_referenced_names(code) # 28.9μs -> 21.4μs (34.9% faster) assert "a" in names # base of the chain should be present # The intermediate attribute names 'b' and final attribute 'c' are not Name nodes and should not be present. assert "b" not in names # 23.8μs -> 25.5μs (6.60% slower) assert "c" not in names # An attribute where the value is a Constant (e.g., a literal) should not spuriously add names. code2 = "'literal'.prop\n" names2, _ = _collect_dst_referenced_names(code2) # No Name nodes should be created from a constant string's attribute access. assert "literal" not in names2 def test_none_input_raises_type_error(): # Passing None should raise a TypeError from ast.parse; the function does not catch this. with pytest.raises(TypeError): _collect_dst_referenced_names(None) # 6.14μs -> 6.25μs (1.75% slower) def test_large_scale_performance_and_correctness(): # Generate a reasonably large module with many distinct Name occurrences and many string-constants # that themselves contain simple names to be parsed. N = 1000 # large scale as requested parts = [] # Create many variable assignments to produce Name nodes for v0..vN-1 for i in range(N): parts.append(f"v{i} = {i}\n") # assignment target is a Name node 'v{i}' # Add many string constants that each contain a single simple name like "n{i}" which will be parsed for i in range(N): parts.append(f's{i} = "n{i}"\n') # constant string -> inner parse yields Name 'n{i}' # Add an import to ensure has_imports becomes True parts.append("import math\n") # Combine into a single large source string big_code = "".join(parts) names, has_imports = _collect_dst_referenced_names(big_code) # 18.1ms -> 15.1ms (19.9% faster) # The has_imports flag must be True because of the import statement. assert has_imports is True # Check a few representative items to ensure correctness and that the parsing actually collected names. # Check first, middle, and last indices for v* and n*. assert "v0" in names assert f"v{N // 2}" in names assert f"v{N - 1}" in names assert "n0" in names assert f"n{N // 2}" in names assert f"n{N - 1}" in names # Expect at least 2*N names (v* and n*) to be present. # Use >= to be resilient to possible additional names from parsing (but ensure the bulk was collected). assert len(names) >= 2 * N

# imports from codeflash.languages.python.static_analysis.code_extractor import _collect_dst_referenced_names def test_empty_code(): """Test that empty code returns empty set and no imports.""" names, has_imports = _collect_dst_referenced_names("") # 18.1μs -> 12.9μs (40.9% faster) assert names == set() assert has_imports is False def test_single_name_reference(): """Test collection of a single name reference.""" names, has_imports = _collect_dst_referenced_names("x") # 26.9μs -> 18.4μs (46.4% faster) assert names == {"x"} assert has_imports is False def test_multiple_name_references(): """Test collection of multiple distinct names.""" names, has_imports = _collect_dst_referenced_names("x + y + z") # 36.6μs -> 25.1μs (45.6% faster) assert names == {"x", "y", "z"} assert has_imports is False def test_repeated_name_references(): """Test that repeated name references appear only once in the set.""" names, has_imports = _collect_dst_referenced_names("x + x + y + y + y") # 41.4μs -> 26.8μs (54.2% faster) assert names == {"x", "y"} assert has_imports is False def test_attribute_access_single_level(): """Test collection of single-level attribute access (base name only).""" names, has_imports = _collect_dst_referenced_names("obj.attr") # 28.4μs -> 20.7μs (37.4% faster) assert names == {"obj"} assert has_imports is False def test_attribute_access_multi_level(): """Test collection of multi-level attribute access (base name only).""" names, has_imports = _collect_dst_referenced_names("obj.attr1.attr2.attr3") # 34.0μs -> 24.7μs (37.4% faster) assert names == {"obj"} assert has_imports is False def test_simple_import_statement(): """Test detection of simple import statements.""" names, has_imports = _collect_dst_referenced_names("import os") # 22.2μs -> 16.6μs (33.6% faster) assert has_imports is True # os is referenced as a Name node due to import assert "os" in names def test_import_from_statement(): """Test detection of from-import statements.""" names, has_imports = _collect_dst_referenced_names("from os import path") # 25.0μs -> 18.6μs (34.0% faster) assert has_imports is True def test_function_call(): """Test collection of names in function calls.""" names, has_imports = _collect_dst_referenced_names("func(arg1, arg2)") # 36.8μs -> 27.2μs (35.6% faster) assert "func" in names assert "arg1" in names assert "arg2" in names assert has_imports is False def test_variable_assignment(): """Test collection of names in variable assignments.""" names, has_imports = _collect_dst_referenced_names("result = func(x, y)") # 38.4μs -> 26.5μs (44.8% faster) assert "result" in names assert "func" in names assert "x" in names assert "y" in names assert has_imports is False def test_string_annotation_with_name(): """Test collection of names inside string annotations.""" names, has_imports = _collect_dst_referenced_names('x: "MyType" = 5') # 41.1μs -> 41.8μs (1.77% slower) assert "MyType" in names assert "x" in names assert has_imports is False def test_string_annotation_with_multiple_names(): """Test collection of multiple names in string annotations.""" names, has_imports = _collect_dst_referenced_names( 'def f(x: "List[int]") -> "Optional[str]": pass' ) # 68.6μs -> 68.3μs (0.573% faster) assert "List" in names assert "int" in names assert "Optional" in names assert "str" in names assert "f" in names assert "x" in names assert has_imports is False def test_syntax_error_recovery(): """Test that syntax errors return empty set and no imports flag.""" names, has_imports = _collect_dst_referenced_names( "if this is not valid syntax !!!" ) # 42.7μs -> 45.5μs (6.08% slower) assert names == set() assert has_imports is False def test_incomplete_syntax(): """Test incomplete syntax that cannot be parsed.""" names, has_imports = _collect_dst_referenced_names("def func(") # 21.2μs -> 19.2μs (10.1% faster) assert names == set() assert has_imports is False def test_whitespace_only(): """Test code containing only whitespace.""" names, has_imports = _collect_dst_referenced_names(" \n\n \t\t ") # 13.7μs -> 9.30μs (47.4% faster) assert names == set() assert has_imports is False def test_comments_only(): """Test code containing only comments.""" names, has_imports = _collect_dst_referenced_names( "# This is a comment\n# Another comment" ) # 13.3μs -> 9.00μs (48.2% faster) assert names == set() assert has_imports is False def test_single_character_names(): """Test collection of single-character variable names.""" names, has_imports = _collect_dst_referenced_names("a + b + c") # 34.5μs -> 23.9μs (44.0% faster) assert names == {"a", "b", "c"} assert has_imports is False def test_underscore_names(): """Test collection of underscore-prefixed and underscore names.""" names, has_imports = _collect_dst_referenced_names( "_ + _var + __private + __dunder__" ) # 39.5μs -> 27.1μs (45.7% faster) assert "_" in names assert "_var" in names assert "__private" in names assert "__dunder__" in names assert has_imports is False def test_numeric_suffixed_names(): """Test collection of names with numeric suffixes.""" names, has_imports = _collect_dst_referenced_names( "var1 + var2 + var123 + var_456" ) # 37.9μs -> 26.3μs (44.0% faster) assert "var1" in names assert "var2" in names assert "var123" in names assert "var_456" in names assert has_imports is False def test_keyword_like_but_not_keyword(): """Test that non-keyword identifiers that look like keywords are collected.""" names, has_imports = _collect_dst_referenced_names("ifx = 5") # 27.3μs -> 20.1μs (35.9% faster) assert "ifx" in names assert has_imports is False def test_nested_attribute_access(): """Test nested attribute access on attributes.""" names, has_imports = _collect_dst_referenced_names("a.b.c.d.e.f") # 36.8μs -> 26.6μs (37.9% faster) # Only the base name should be collected assert names == {"a"} assert has_imports is False def test_method_call_chain(): """Test method call chains.""" names, has_imports = _collect_dst_referenced_names( "obj.method1().method2().method3()" ) # 43.3μs -> 35.9μs (20.7% faster) assert "obj" in names assert has_imports is False def test_dict_with_names(): """Test collection of names in dictionary literals.""" names, has_imports = _collect_dst_referenced_names( "{key1: value1, key2: value2}" ) # 37.0μs -> 28.0μs (32.5% faster) assert "key1" in names assert "value1" in names assert "key2" in names assert "value2" in names assert has_imports is False def test_list_comprehension(): """Test collection of names in list comprehensions.""" names, has_imports = _collect_dst_referenced_names( "[x * 2 for x in items if x > threshold]" ) # 55.8μs -> 43.7μs (27.8% faster) assert "x" in names assert "items" in names assert "threshold" in names assert has_imports is False def test_lambda_expression(): """Test collection of names in lambda expressions.""" names, has_imports = _collect_dst_referenced_names("lambda x, y: x + y + z") # 48.0μs -> 34.4μs (39.7% faster) assert "x" in names assert "y" in names assert "z" in names assert has_imports is False def test_string_literal_non_annotation(): """Test that regular string literals are ignored.""" names, has_imports = _collect_dst_referenced_names('message = "hello world"') # 41.9μs -> 37.7μs (11.2% faster) assert "message" in names # The string content should not be parsed as code assert "hello" not in names assert "world" not in names assert has_imports is False def test_invalid_string_annotation(): """Test that invalid expressions in string annotations are handled gracefully.""" names, has_imports = _collect_dst_referenced_names( 'x: "not valid python !!! @#$" = 5' ) # 39.6μs -> 34.1μs (16.3% faster) assert "x" in names # The string contains invalid Python, so no names are extracted from it assert has_imports is False def test_complex_expression(): """Test a complex expression with multiple operators and names.""" names, has_imports = _collect_dst_referenced_names("(a + b) * (c - d) / (e + f)") # 53.7μs -> 38.3μs (40.1% faster) assert "a" in names assert "b" in names assert "c" in names assert "d" in names assert "e" in names assert "f" in names assert has_imports is False def test_multiple_imports(): """Test detection with multiple import types.""" code = """ import os from sys import argv import json as j """ names, has_imports = _collect_dst_referenced_names(code) # 35.7μs -> 27.3μs (30.9% faster) assert has_imports is True def test_imports_and_names(): """Test that both imports and other names are collected.""" code = """ import os x = os.path.join(a, b) """ names, has_imports = _collect_dst_referenced_names(code) # 49.7μs -> 37.3μs (33.3% faster) assert has_imports is True assert "os" in names assert "x" in names assert "a" in names assert "b" in names def test_conditional_code(): """Test collection from conditional statements.""" names, has_imports = _collect_dst_referenced_names(""" if condition: result = func(x) else: result = other_func(y) """) # 56.1μs -> 41.2μs (36.3% faster) assert "condition" in names assert "result" in names assert "func" in names assert "x" in names assert "other_func" in names assert "y" in names assert has_imports is False def test_loop_code(): """Test collection from loop statements.""" names, has_imports = _collect_dst_referenced_names(""" for item in items: process(item) """) # 41.7μs -> 30.1μs (38.9% faster) assert "item" in names assert "items" in names assert "process" in names assert has_imports is False def test_class_definition(): """Test collection of names in class definitions.""" names, has_imports = _collect_dst_referenced_names(""" class MyClass(BaseClass): def __init__(self, x): self.x = x """) # 59.3μs -> 47.8μs (23.9% faster) assert "MyClass" in names assert "BaseClass" in names assert "self" in names assert "x" in names assert has_imports is False def test_function_definition(): """Test collection of names in function definitions.""" names, has_imports = _collect_dst_referenced_names(""" def my_func(arg1, arg2): return arg1 + arg2 + external_var """) # 52.2μs -> 40.5μs (28.8% faster) assert "my_func" in names assert "arg1" in names assert "arg2" in names assert "external_var" in names assert has_imports is False def test_nested_attribute_on_literal(): """Test attribute access on non-Name objects (should not collect base).""" names, has_imports = _collect_dst_referenced_names('"string".upper()') # 38.9μs -> 41.9μs (7.19% slower) # The function only collects base names when attribute.value is ast.Name assert has_imports is False def test_numeric_literal(): """Test that numeric literals don't cause issues.""" names, has_imports = _collect_dst_referenced_names("x = 42 + 3.14 + 1j") # 35.9μs -> 27.8μs (28.9% faster) assert "x" in names assert has_imports is False def test_boolean_literals(): """Test boolean and None literals.""" names, has_imports = _collect_dst_referenced_names( "result = True and False and None" ) # 34.7μs -> 26.8μs (29.6% faster) assert "result" in names assert has_imports is False def test_bytes_literal(): """Test bytes string literal.""" names, has_imports = _collect_dst_referenced_names("data = b'bytes'") # 25.2μs -> 18.3μs (37.8% faster) assert "data" in names assert has_imports is False def test_empty_attribute_chain(): """Test that we handle edge cases in attribute chains.""" names, has_imports = _collect_dst_referenced_names("(x).y.z") # 30.1μs -> 24.0μs (25.6% faster) assert "x" in names assert has_imports is False def test_starred_expression(): """Test starred expressions in unpacking.""" names, has_imports = _collect_dst_referenced_names("a, *b, c = items") # 36.6μs -> 26.3μs (39.5% faster) assert "a" in names assert "b" in names assert "c" in names assert "items" in names assert has_imports is False def test_ternary_expression(): """Test ternary conditional expressions.""" names, has_imports = _collect_dst_referenced_names( "result = x if condition else y" ) # 33.9μs -> 24.2μs (40.2% faster) assert "result" in names assert "x" in names assert "condition" in names assert "y" in names assert has_imports is False def test_many_distinct_names(): """Test collection with many distinct variable names (500 names).""" # Build code with 500 distinct names names_list = [f"var{i}" for i in range(500)] code = " + ".join(names_list) names, has_imports = _collect_dst_referenced_names(code) # 2.21ms -> 1.33ms (66.5% faster) # Verify all names were collected assert len(names) == 500 for i in range(500): assert f"var{i}" in names assert has_imports is False def test_large_nested_expression(): """Test deeply nested expressions (1000 levels of attribute access).""" # Build a chain: a.b.c.d... code = "a" for i in range(100): code += ".attr" names, has_imports = _collect_dst_referenced_names(code) # 247μs -> 164μs (50.0% faster) assert names == {"a"} assert has_imports is False def test_large_function_call_args(): """Test function call with many arguments (500 args).""" args = ", ".join([f"arg{i}" for i in range(500)]) code = f"result = func({args})" names, has_imports = _collect_dst_referenced_names(code) # 1.40ms -> 895μs (56.9% faster) assert "result" in names assert "func" in names assert len(names) == 502 # result + func + 500 args assert has_imports is False def test_large_dictionary_literal(): """Test large dictionary with many key-value pairs (500 pairs).""" pairs = ", ".join([f"key{i}: val{i}" for i in range(500)]) code = f"d = {{{pairs}}}" names, has_imports = _collect_dst_referenced_names(code) # 2.73ms -> 1.67ms (63.7% faster) assert "d" in names # 1 for 'd' + 500 keys + 500 values = 1001 assert len(names) == 1001 assert has_imports is False def test_many_list_comprehensions(): """Test code with many list comprehensions (100 comprehensions).""" code = "\n".join([f"list{i} = [x for x in items{i}]" for i in range(100)]) names, has_imports = _collect_dst_referenced_names(code) # 1.48ms -> 952μs (55.7% faster) assert "x" in names for i in range(100): assert f"list{i}" in names assert f"items{i}" in names assert has_imports is False def test_large_class_hierarchy(): """Test a large class hierarchy (100 classes).""" code = "class C0: pass\n" for i in range(1, 100): code += f"class C{i}(C{i - 1}): pass\n" names, has_imports = _collect_dst_referenced_names(code) # 691μs -> 473μs (46.2% faster) for i in range(100): assert f"C{i}" in names assert has_imports is False def test_many_function_definitions(): """Test many function definitions with parameters (100 functions).""" code = "\n".join([f"def func{i}(p{i}_0, p{i}_1, p{i}_2): pass" for i in range(100)]) names, has_imports = _collect_dst_referenced_names(code) # 1.39ms -> 1.14ms (22.4% faster) for i in range(100): assert f"func{i}" in names assert f"p{i}_0" in names assert f"p{i}_1" in names assert f"p{i}_2" in names assert has_imports is False def test_many_string_annotations(): """Test many string annotations (200 annotations).""" code = "\n".join([f'x{i}: "TypeA{i}[TypeB{i}]" = None' for i in range(200)]) names, has_imports = _collect_dst_referenced_names(code) # 3.30ms -> 3.02ms (9.41% faster) # Verify some of the annotated names are present assert "x0" in names assert "TypeA0" in names assert "TypeB0" in names assert "x199" in names assert "TypeA199" in names assert "TypeB199" in names assert has_imports is False def test_very_long_single_line(): """Test very long single-line expression (1000 names concatenated).""" names_list = [f"v{i}" for i in range(1000)] code = " + ".join(names_list) names, has_imports = _collect_dst_referenced_names(code) # 4.21ms -> 2.44ms (72.6% faster) assert len(names) == 1000 assert has_imports is False def test_large_conditional_chain(): """Test large chain of if-elif-else (100 conditions).""" code = "if c0: x = 0\n" for i in range(1, 100): code += f"elif c{i}: x = {i}\n" code += "else: x = -1\n" names, has_imports = _collect_dst_referenced_names(code) # 1.03ms -> 644μs (60.1% faster) assert "x" in names for i in range(100): assert f"c{i}" in names assert has_imports is False def test_deeply_nested_list_comprehensions(): """Test deeply nested list comprehensions (50 levels).""" code = "[x" for i in range(50): code += f" for x{i} in items{i}" code += "]" names, has_imports = _collect_dst_referenced_names(code) # 339μs -> 199μs (70.7% faster) assert "x" in names for i in range(50): assert f"x{i}" in names assert f"items{i}" in names assert has_imports is False def test_mixed_large_code(): """Test large code combining imports, functions, classes, and expressions.""" code = """ import os from sys import argv import json class MyClass(BaseClass): def __init__(self): self.value = 0 def process_data(data): result = transform(data) return result x = MyClass() data = [item for item in source if predicate(item)] output = process_data(data) final = final_transform(output) """ names, has_imports = _collect_dst_referenced_names(code) # 173μs -> 127μs (36.1% faster) assert has_imports is True assert "os" in names assert "argv" in names assert "json" in names assert "MyClass" in names assert "BaseClass" in names assert "process_data" in names assert "data" in names assert "item" in names assert "source" in names assert "predicate" in names assert len(names) > 20 def test_code_with_many_attributes(): """Test code with many attribute accesses on different base objects (500).""" code = "\n".join([f"v{i}.attr1.attr2.attr3" for i in range(500)]) names, has_imports = _collect_dst_referenced_names(code) # 5.60ms -> 3.85ms (45.3% faster) assert len(names) == 500 for i in range(500): assert f"v{i}" in names assert has_imports is False

⏪ Click to see Replay Tests

Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup

benchmarks/codeflash_replay_tests_pj5gk87v/test_tests_benchmarks_test_benchmark_code_extract_code_context__replay_test_0.py::test_codeflash_languages_python_static_analysis_code_extractor__collect_dst_referenced_names_test_benchmark_extract 2.62ms 1.94ms 34.9%✅

To test or edit this optimization locally git merge codeflash/optimize-pr1921-2026-03-27T21.35.30

Click to see suggested changes

Suggested change

for node in ast.walk(tree):

if isinstance(node, ast.Name):

names.add(node.id)

elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):

names.add(node.value.id)

elif isinstance(node, (ast.Import, ast.ImportFrom)):

has_imports = True

elif isinstance(node, ast.Constant) and isinstance(node.value, str):

try:

inner = ast.parse(node.value, mode="eval")

for inner_node in ast.walk(inner):

if isinstance(inner_node, ast.Name):

names.add(inner_node.id)

except SyntaxError:

pass

# Small cache for parsed string annotations to avoid reparsing identical strings.

_str_parse_cache: dict[str, frozenset[str]] = {}

# Manual stack-based traversal is typically faster than using the generator-based ast.walk.

stack = [tree]

NameType = ast.Name

AttributeType = ast.Attribute

ImportTypes = (ast.Import, ast.ImportFrom)

ConstantType = ast.Constant

while stack:

node = stack.pop()

if isinstance(node, NameType):

names.add(node.id)

continue

if isinstance(node, AttributeType) and isinstance(node.value, NameType):

names.add(node.value.id)

# continue to still traverse any children if present (though attribute.value already handled)

# we still push children below

elif isinstance(node, ImportTypes):

has_imports = True

elif isinstance(node, ConstantType) and isinstance(node.value, str):

s = node.value

cached = _str_parse_cache.get(s)

if cached is not None:

if cached:

names.update(cached)

else:

try:

inner = ast.parse(s, mode="eval")

except SyntaxError:

_str_parse_cache[s] = frozenset()

else:

inner_names: set[str] = set()

for inner_node in ast.walk(inner):

if isinstance(inner_node, NameType):

inner_names.add(inner_node.id)

frozen = frozenset(inner_names)

_str_parse_cache[s] = frozen

if frozen:

names.update(frozen)

# Push child AST nodes for traversal. Using _fields avoids creating many intermediate lists.

fields = getattr(node, "_fields", ())

for field in fields:

value = getattr(node, field, None)

if isinstance(value, list):

# iterate in normal order; append to stack to preserve overall traversal behavior

for item in value:

if isinstance(item, ast.AST):

stack.append(item)

elif isinstance(value, ast.AST):

stack.append(value)

KRRT7 force-pushed the cf-deep-context-extraction branch from 0e91155 to 85e56d7 Compare March 27, 2026 20:56

KRRT7 force-pushed the cf-deep-context-extraction branch from 85e56d7 to 0593723 Compare March 27, 2026 20:57

KRRT7 and others added 2 commits March 27, 2026 16:01

fix: add type params to bare list annotations for mypy

229f5f4

docs: add benchmark comparison image for PRs #1920/#1921

85eafaa

codeflash-ai bot reviewed Mar 27, 2026

View reviewed changes

Base automatically changed from cf-cpu-context-extraction to main March 27, 2026 22:37

KRRT7 merged commit f169e86 into main Mar 27, 2026
29 of 31 checks passed

KRRT7 deleted the cf-deep-context-extraction branch March 27, 2026 22:37

KRRT7 mentioned this pull request Mar 27, 2026

feat: add codeflash compare CLI command #1922

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: optimize context extraction with caching and import pre-filtering#1921

perf: optimize context extraction with caching and import pre-filtering#1921
KRRT7 merged 3 commits intomainfrom
cf-deep-context-extraction

KRRT7 commented Mar 27, 2026

Uh oh!

claude bot commented Mar 27, 2026 •

edited

Loading

Uh oh!

KRRT7 commented Mar 27, 2026

Uh oh!

codeflash-ai bot Mar 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Test	Status
⚙️ Existing Unit Tests	🔘 None Found
🌀 Generated Regression Tests	✅ 65 Passed
⏪ Replay Tests	✅ 6 Passed
🔎 Concolic Coverage Tests	🔘 None Found
📊 Tests Coverage	100.0%

-    for node in ast.walk(tree):
-        if isinstance(node, ast.Name):
-            names.add(node.id)
-        elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
-            names.add(node.value.id)
-        elif isinstance(node, (ast.Import, ast.ImportFrom)):
-            has_imports = True
-        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
-            try:
-                inner = ast.parse(node.value, mode="eval")
-                for inner_node in ast.walk(inner):
-                    if isinstance(inner_node, ast.Name):
-                        names.add(inner_node.id)
-            except SyntaxError:
-                pass
+    # Small cache for parsed string annotations to avoid reparsing identical strings.
+    _str_parse_cache: dict[str, frozenset[str]] = {}
+    # Manual stack-based traversal is typically faster than using the generator-based ast.walk.
+    stack = [tree]
+    NameType = ast.Name
+    AttributeType = ast.Attribute
+    ImportTypes = (ast.Import, ast.ImportFrom)
+    ConstantType = ast.Constant
+    while stack:
+        node = stack.pop()
+        if isinstance(node, NameType):
+            names.add(node.id)
+            continue
+        if isinstance(node, AttributeType) and isinstance(node.value, NameType):
+            names.add(node.value.id)
+            # continue to still traverse any children if present (though attribute.value already handled)
+            # we still push children below
+        elif isinstance(node, ImportTypes):
+            has_imports = True
+        elif isinstance(node, ConstantType) and isinstance(node.value, str):
+            s = node.value
+            cached = _str_parse_cache.get(s)
+            if cached is not None:
+                if cached:
+                    names.update(cached)
+            else:
+                try:
+                    inner = ast.parse(s, mode="eval")
+                except SyntaxError:
+                    _str_parse_cache[s] = frozenset()
+                else:
+                    inner_names: set[str] = set()
+                    for inner_node in ast.walk(inner):
+                        if isinstance(inner_node, NameType):
+                            inner_names.add(inner_node.id)
+                    frozen = frozenset(inner_names)
+                    _str_parse_cache[s] = frozen
+                    if frozen:
+                        names.update(frozen)
+        # Push child AST nodes for traversal. Using _fields avoids creating many intermediate lists.
+        fields = getattr(node, "_fields", ())
+        for field in fields:
+            value = getattr(node, field, None)
+            if isinstance(value, list):
+                # iterate in normal order; append to stack to preserve overall traversal behavior
+                for item in value:
+                    if isinstance(item, ast.AST):
+                        stack.append(item)
+            elif isinstance(value, ast.AST):
+                stack.append(value)

Conversation

KRRT7 commented Mar 27, 2026

Summary

Test plan

Uh oh!

claude bot commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review Summary

Prek Checks

Code Review

Jedi refs_by_parent cache (code_context_extractor.py)

RO/TESTGEN reuse (code_context_extractor.py)

Import pre-filtering (code_extractor.py)

Skip RemoveImportsVisitor (code_extractor.py)

Duplicate Detection

Test Coverage

Uh oh!

KRRT7 commented Mar 27, 2026

Benchmark Results: Context Extraction Pipeline

Key results

Uh oh!

codeflash-ai bot Mar 27, 2026

Choose a reason for hiding this comment

⚡️Codeflash found 34% (0.34x) speedup for _collect_dst_referenced_names in codeflash/languages/python/static_analysis/code_extractor.py

⚡️ This change will improve the performance of the following benchmarks:

🔻 This change will degrade the performance of the following benchmarks:

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Mar 27, 2026 •

edited

Loading

Jedi `refs_by_parent` cache (`code_context_extractor.py`)

RO/TESTGEN reuse (`code_context_extractor.py`)

Import pre-filtering (`code_extractor.py`)

Skip `RemoveImportsVisitor` (`code_extractor.py`)

⚡️Codeflash found 34% (0.34x) speedup for `_collect_dst_referenced_names` in `codeflash/languages/python/static_analysis/code_extractor.py`