project_root = Path.cwd()

# Check for existing codeflash config in pom.xml or a separate config file
codeflash_config_path = project_root / "codeflash.toml"
if codeflash_config_path.exists():
⚡️Codeflash found 70% (0.70x) speedup for should_modify_java_config in codeflash/cli_cmds/init_java.py
⏱️ Runtime : 714 microseconds → 421 microseconds (best of 60 runs)
📝 Explanation and details
The optimized code achieves a roughly 70% speedup (714μs → 421μs) by replacing pathlib.Path operations with equivalent os module functions, which have significantly lower overhead.
Key optimizations:
- os.getcwd() instead of Path.cwd(): The line profiler shows Path.cwd() took 689,637ns (34.1% of total time) vs os.getcwd() taking only 68,036ns (7.4%). This is a ~10x improvement because Path.cwd() instantiates a Path object and performs additional normalization, while os.getcwd() returns a plain string.
- os.path.join() instead of the Path division operator: Constructing the config path via project_root / "codeflash.toml" took 386,582ns (19.1%) vs os.path.join() taking 190,345ns (20.6%). Although the share of total time is similar, the absolute time is roughly halved because the / operator creates a new Path object with its associated overhead.
- os.path.exists() instead of Path.exists(): The existence check dropped from 476,490ns (23.6%) to 223,477ns (24.2%), roughly 2x faster. os.path.exists() performs the stat call on the string path directly, while Path.exists() goes through Path's object model.
Why this works:
Path objects provide a cleaner API but add object instantiation, method dispatch, and normalization overhead. For simple filesystem checks in initialization code that runs frequently, using lower-level os functions eliminates this overhead while maintaining identical functionality.
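The comparison above can be reproduced with a small micro-benchmark. This is a minimal sketch, not the profiler setup Codeflash used: the helper names `exists_with_pathlib` and `exists_with_os` are hypothetical, and absolute timings will vary by machine and Python version.

```python
from __future__ import annotations

import os
import timeit
from pathlib import Path


def exists_with_pathlib(name: str = "codeflash.toml") -> bool:
    """Existence check written with pathlib, mirroring the original lines."""
    return (Path.cwd() / name).exists()


def exists_with_os(name: str = "codeflash.toml") -> bool:
    """The same check written with os functions, mirroring the suggested lines."""
    return os.path.exists(os.path.join(os.getcwd(), name))


if __name__ == "__main__":
    runs = 10_000  # arbitrary repetition count chosen for this sketch
    t_pathlib = timeit.timeit(exists_with_pathlib, number=runs)
    t_os = timeit.timeit(exists_with_os, number=runs)
    print(f"pathlib: {t_pathlib / runs * 1e6:.2f} us per call")
    print(f"os:      {t_os / runs * 1e6:.2f} us per call")
```

Both helpers should return the same boolean for any name; only the per-call cost is expected to differ.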
Test results:
All test cases show 68-111% speedup across scenarios including:
- Empty or config-only directories (82-87% improvement)
- Large directories with 500 files (68-111% improvement)
- Edge cases like symlinks and directory-as-file (75-82% improvement)
The optimization is particularly beneficial for CLI initialization code that may run on every command invocation, where sub-millisecond improvements in frequently-called functions compound into noticeable user experience gains.
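The edge cases named in the test results (symlinks, a directory named like the config file) are where a behavioral difference between the two APIs would show up if there were one. The following sketch, with hypothetical helper names, checks that `Path.exists()` and `os.path.exists()` agree on each kind of entry; the symlink case is skipped where symlinks cannot be created.

```python
from __future__ import annotations

import os
import tempfile
from pathlib import Path


def checks_agree(path: str) -> bool:
    """Return True when Path.exists() and os.path.exists() give the same answer."""
    return Path(path).exists() == os.path.exists(path)


def run_edge_cases() -> dict[str, bool]:
    """Compare both checks on a missing path, a directory, a file, and a symlink."""
    results: dict[str, bool] = {}
    with tempfile.TemporaryDirectory() as tmpdir:
        results["missing"] = checks_agree(os.path.join(tmpdir, "codeflash.toml"))

        as_dir = os.path.join(tmpdir, "as_dir.toml")
        os.mkdir(as_dir)
        results["directory"] = checks_agree(as_dir)

        regular = os.path.join(tmpdir, "regular.toml")
        Path(regular).write_text("x = 1")
        results["file"] = checks_agree(regular)

        link = os.path.join(tmpdir, "link.toml")
        try:
            os.symlink(regular, link)
        except (OSError, NotImplementedError):
            pass  # symlinks unavailable on this platform or without privileges
        else:
            results["symlink"] = checks_agree(link)
    return results


if __name__ == "__main__":
    for kind, agree in run_edge_cases().items():
        print(f"{kind}: {'same result' if agree else 'DIFFERENT result'}")
```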
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 23 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
from __future__ import annotations

# imports
import os
from pathlib import Path
from typing import Any

import pytest  # used for our unit tests
from codeflash.cli_cmds.init_java import should_modify_java_config


def test_no_config_file_does_not_prompt_and_returns_true(monkeypatch, tmp_path):
    # Arrange: ensure working directory has no codeflash.toml
    monkeypatch.chdir(tmp_path)  # set cwd to a clean temporary directory

    # Replace Confirm.ask with a function that fails the test if called.
    def fail_if_called(*args, **kwargs):
        raise AssertionError("Confirm.ask should not be called when no config file exists")

    # Patch the exact attribute that the function imports at runtime.
    monkeypatch.setattr("rich.prompt.Confirm.ask", fail_if_called, raising=True)
    # Act: call function under test
    codeflash_output = should_modify_java_config(); result = codeflash_output # 28.9μs -> 15.9μs (82.0% faster)


def test_config_file_exists_prompts_and_respects_true_choice(monkeypatch, tmp_path):
    # Arrange: create a codeflash.toml file so the function will detect it
    monkeypatch.chdir(tmp_path)
    config_file = tmp_path / "codeflash.toml"
    config_file.write_text("existing = true")  # create the file
    # Capture the arguments passed to Confirm.ask and return True to simulate user acceptance
    called = {}

    def fake_ask(prompt, default, show_default):
        # Record inputs for later assertions
        called["prompt"] = prompt
        called["default"] = default
        called["show_default"] = show_default
        return True

    # Patch Confirm.ask used inside the function
    monkeypatch.setattr("rich.prompt.Confirm.ask", fake_ask, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 25.6μs -> 13.7μs (86.9% faster)


def test_config_file_exists_prompts_and_respects_false_choice(monkeypatch, tmp_path):
    # Arrange: create the config file
    monkeypatch.chdir(tmp_path)
    (tmp_path / "codeflash.toml").write_text("existing = true")

    # Simulate user declining re-configuration
    def fake_ask_decline(prompt, default, show_default):
        return False

    monkeypatch.setattr("rich.prompt.Confirm.ask", fake_ask_decline, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 24.7μs -> 13.3μs (86.3% faster)


def test_presence_of_pom_xml_does_not_trigger_prompt(monkeypatch, tmp_path):
    # Arrange: create a pom.xml but NOT codeflash.toml
    monkeypatch.chdir(tmp_path)
    (tmp_path / "pom.xml").write_text("<project></project>")

    # If Confirm.ask is called, fail the test because only codeflash.toml should trigger it in current implementation
    def fail_if_called(*args, **kwargs):
        raise AssertionError("Confirm.ask should not be called when only pom.xml exists (implementation checks codeflash.toml)")

    monkeypatch.setattr("rich.prompt.Confirm.ask", fail_if_called, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 28.3μs -> 16.6μs (69.9% faster)


def test_codeflash_config_is_directory_triggers_prompt(monkeypatch, tmp_path):
    # Arrange: create a directory named codeflash.toml (Path.exists will be True)
    monkeypatch.chdir(tmp_path)
    (tmp_path / "codeflash.toml").mkdir()
    # Simulate user selecting True
    monkeypatch.setattr("rich.prompt.Confirm.ask", lambda *a, **k: True, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 23.6μs -> 12.9μs (82.2% faster)


def test_codeflash_config_symlink_triggers_prompt_if_supported(monkeypatch, tmp_path):
    # Arrange: attempt to create a symlink to a real file; skip if symlink not supported
    if not hasattr(os, "symlink"):
        pytest.skip("Platform does not support os.symlink; skipping symlink test")
    real = tmp_path / "real_config"
    real.write_text("x = 1")
    link = tmp_path / "codeflash.toml"
    try:
        os.symlink(real, link)  # may fail on Windows without privileges
    except (OSError, NotImplementedError) as e:
        pytest.skip(f"Could not create symlink on this platform/environment: {e}")
    monkeypatch.chdir(tmp_path)
    # Simulate user declining re-configuration
    monkeypatch.setattr("rich.prompt.Confirm.ask", lambda *a, **k: False, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 24.9μs -> 14.2μs (75.7% faster)


def test_large_directory_without_config_is_fast_and_does_not_prompt(monkeypatch, tmp_path):
    # Large scale scenario: create many files (but under 1000) to simulate busy project directory.
    monkeypatch.chdir(tmp_path)
    num_files = 500  # under the 1000 element guideline
    for i in range(num_files):
        # Create many innocuous files; should not affect the function's behavior
        (tmp_path / f"file_{i}.txt").write_text(str(i))

    # Ensure Confirm.ask is not called
    def fail_if_called(*args, **kwargs):
        raise AssertionError("Confirm.ask should not be called when codeflash.toml is absent even in large directories")

    monkeypatch.setattr("rich.prompt.Confirm.ask", fail_if_called, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 36.3μs -> 21.6μs (68.0% faster)


def test_large_directory_with_config_prompts_once(monkeypatch, tmp_path):
    # Large scale scenario with config present: many files plus codeflash.toml
    monkeypatch.chdir(tmp_path)
    num_files = 500
    for i in range(num_files):
        (tmp_path / f"file_{i}.txt").write_text(str(i))
    # Create the config file that should trigger prompting
    (tmp_path / "codeflash.toml").write_text("reconfigure = maybe")
    # Track how many times Confirm.ask is invoked to ensure single prompt
    counter = {"calls": 0}

    def fake_ask(prompt, default, show_default):
        counter["calls"] += 1
        return True

    monkeypatch.setattr("rich.prompt.Confirm.ask", fake_ask, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 30.8μs -> 14.6μs (111% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import os
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

# imports
import pytest
from codeflash.cli_cmds.init_java import should_modify_java_config


class TestShouldModifyJavaConfigBasic:
    """Basic test cases for should_modify_java_config function."""

    def test_no_config_file_exists_returns_true(self):
        """
        Scenario: Project has no existing codeflash.toml file
        Expected: Function returns (True, None) without prompting user
        """
        # Create a temporary directory without codeflash.toml
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_exists_user_confirms(self):
        """
        Scenario: Project has existing codeflash.toml and user confirms re-configuration
        Expected: Function prompts user and returns (True, None) if user confirms
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a codeflash.toml file
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.touch()
                # Mock the Confirm.ask to return True (user confirms)
                with patch('rich.prompt.Confirm.ask', return_value=True):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_exists_user_declines(self):
        """
        Scenario: Project has existing codeflash.toml and user declines re-configuration
        Expected: Function prompts user and returns (False, None) if user declines
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a codeflash.toml file
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.touch()
                # Mock the Confirm.ask to return False (user declines)
                with patch('rich.prompt.Confirm.ask', return_value=False):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_return_tuple_structure(self):
        """
        Scenario: Verify the function always returns a tuple with specific structure
        Expected: Return value is a tuple of (bool, None)
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)


class TestShouldModifyJavaConfigEdgeCases:
    """Edge case test cases for should_modify_java_config function."""

    def test_config_file_exists_but_empty(self):
        """
        Scenario: codeflash.toml file exists but is empty
        Expected: File is still considered as existing, prompts user
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create an empty codeflash.toml file
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.write_text("")
                with patch('rich.prompt.Confirm.ask', return_value=True):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_with_content(self):
        """
        Scenario: codeflash.toml file exists with actual TOML content
        Expected: Prompts user regardless of file content
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a codeflash.toml file with content
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.write_text("[codeflash]\nversion = 1\n")
                with patch('rich.prompt.Confirm.ask', return_value=False):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_case_sensitive(self):
        """
        Scenario: Directory has 'Codeflash.toml' or 'CODEFLASH.TOML' instead of lowercase
        Expected: Function only recognizes 'codeflash.toml' (case-sensitive on Unix)
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a file with different casing
                config_file = Path(tmpdir) / "Codeflash.toml"
                config_file.touch()
                codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_is_directory_not_file(self):
        """
        Scenario: codeflash.toml exists as a directory instead of a file
        Expected: Path.exists() still returns True, prompts user
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create codeflash.toml as a directory
                config_dir = Path(tmpdir) / "codeflash.toml"
                config_dir.mkdir()
                with patch('rich.prompt.Confirm.ask', return_value=True):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)
To test or edit this optimization locally: git merge codeflash/optimize-pr1199-2026-02-01T21.20.00
- project_root = Path.cwd()
- # Check for existing codeflash config in pom.xml or a separate config file
- codeflash_config_path = project_root / "codeflash.toml"
- if codeflash_config_path.exists():
+ project_root = os.getcwd()
+ # Check for existing codeflash config in pom.xml or a separate config file
+ codeflash_config_path = os.path.join(project_root, "codeflash.toml")
+ if os.path.exists(codeflash_config_path):
…2026-02-01T22.01.32 ⚡️ Speed up function `get_optimized_code_for_module` by 2,599% in PR #1199 (`omni-java`)
if os.path.exists("mvnw"):
    return "./mvnw"
if os.path.exists("mvnw.cmd"):
⚡️Codeflash found 32% (0.32x) speedup for find_maven_executable in codeflash/languages/java/build_tools.py
⏱️ Runtime : 584 microseconds → 441 microseconds (best of 81 runs)
📝 Explanation and details
The optimization achieves a 32% runtime improvement (from 584μs to 441μs) by replacing os.path.exists() with os.access() for file existence checks. This change delivers measurable performance gains across all test scenarios.
Key Optimization:
The code replaces os.path.exists("mvnw") with os.access("mvnw", os.F_OK). While both functions check for file existence, os.access() with the os.F_OK flag is more efficient because:
- It performs a direct system call (access()) that's optimized for permission/existence checks
- os.path.exists() wraps a stat call in exception handling, which adds overhead
- For simple existence checks, os.access() avoids Python-level abstraction layers
Performance Impact by Scenario:
The line profiler shows that the wrapper checks (lines checking for "mvnw" and "mvnw.cmd") improved from ~576ns + 139ns to ~317ns + 76ns - nearly 2x faster for these critical paths. Test results confirm consistent improvements:
- Wrapper present cases: 68-84% faster (5.78μs → 3.32μs)
- No wrapper, system Maven cases: 31-52% faster
- Edge cases (directories, symlinks): 56-77% faster
Why This Matters:
Based on the function references, find_maven_executable() is called from test infrastructure and build tool detection code. While not in an obvious hot loop, build tool detection typically occurs at project initialization and in test setup/teardown - contexts where this function may be called repeatedly. The optimization is particularly valuable when:
- Running large test suites that reinitialize build contexts frequently
- Working in CI/CD environments with repeated project setup
- Dealing with directories containing many files (test shows 77% improvement with 500 files present)
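The optimization gives the same results in every tested case: both os.path.exists() and os.access(..., os.F_OK) return True for files, directories, and symlinks to existing targets, which preserves backward compatibility while delivering consistent double-digit runtime improvements. One caveat is that os.access() tests with the process's real uid/gid, so the two calls are not guaranteed to agree in every environment.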
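The equivalence claim can be spot-checked with a short script. This is a sketch with hypothetical helper names (`compare_checks`, `survey`); it prints both results side by side instead of assuming they match, since the dangling-symlink case in particular may behave differently across platforms.

```python
from __future__ import annotations

import os
import tempfile


def compare_checks(path: str) -> tuple[bool, bool]:
    """Return (os.path.exists(path), os.access(path, os.F_OK)) for one path."""
    return os.path.exists(path), os.access(path, os.F_OK)


def survey() -> dict[str, tuple[bool, bool]]:
    """Run both checks against a few kinds of directory entries."""
    results: dict[str, tuple[bool, bool]] = {}
    with tempfile.TemporaryDirectory() as tmpdir:
        regular = os.path.join(tmpdir, "mvnw")
        with open(regular, "w") as fh:
            fh.write("#!/bin/sh\n")
        results["file"] = compare_checks(regular)

        directory = os.path.join(tmpdir, "mvnw_dir")
        os.mkdir(directory)
        results["directory"] = compare_checks(directory)

        results["missing"] = compare_checks(os.path.join(tmpdir, "absent"))

        dangling = os.path.join(tmpdir, "dangling")
        try:
            os.symlink(os.path.join(tmpdir, "no_such_target"), dangling)
        except (OSError, NotImplementedError):
            pass  # symlinks unavailable on this platform or without privileges
        else:
            results["dangling symlink"] = compare_checks(dangling)
    return results


if __name__ == "__main__":
    for kind, (exists, access) in survey().items():
        print(f"{kind:18} exists={exists!s:5} access={access!s:5}")
```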
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 34 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 1 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import os
import pathlib
import shutil

import pytest  # used for our unit tests
from codeflash.languages.java.build_tools import find_maven_executable


def test_prefers_mvnw_wrapper_when_present(tmp_path, monkeypatch):
    # Create an isolated temporary directory and switch to it
    # so os.path.exists checks only our test files.
    monkeypatch.chdir(tmp_path)
    # Create a file named "mvnw" to simulate the Maven wrapper being present.
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    # Call the real function under test and assert it returns the wrapper path.
    # According to implementation, when "mvnw" exists it should return "./mvnw".
    codeflash_output = find_maven_executable() # 5.78μs -> 3.32μs (74.3% faster)


def test_returns_mvnw_cmd_when_only_windows_wrapper_exists(tmp_path, monkeypatch):
    # Switch to a fresh temporary directory for isolation.
    monkeypatch.chdir(tmp_path)
    # Create only "mvnw.cmd" and ensure no plain "mvnw" exists.
    (tmp_path / "mvnw.cmd").write_text("@echo off\necho mvnw.cmd\n")
    # The function should detect "mvnw.cmd" and return that exact string.
    codeflash_output = find_maven_executable() # 13.2μs -> 7.16μs (84.0% faster)


def test_prefers_mvnw_over_mvnw_cmd_when_both_present(tmp_path, monkeypatch):
    # Ensure both wrapper files exist; "mvnw" should be preferred because it's checked first.
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    (tmp_path / "mvnw.cmd").write_text("@echo off\necho mvnw.cmd\n")
    # Confirm that "./mvnw" is returned, demonstrating the precedence.
    codeflash_output = find_maven_executable() # 5.58μs -> 3.32μs (68.3% faster)


def test_returns_system_mvn_when_no_wrappers(monkeypatch, tmp_path):
    # Make sure current directory has no wrapper files.
    monkeypatch.chdir(tmp_path)
    # Monkeypatch shutil.which to simulate an installed mvn on PATH.
    monkeypatch.setattr(shutil, "which", lambda name: "/usr/bin/mvn" if name == "mvn" else None)
    # The function should return whatever shutil.which returns when no wrappers present.
    codeflash_output = find_maven_executable() # 14.0μs -> 9.18μs (52.3% faster)


def test_returns_none_when_nothing_found(monkeypatch, tmp_path):
    # No wrapper files in cwd.
    monkeypatch.chdir(tmp_path)
    # Simulate no mvn on PATH by returning None (or falsy string).
    monkeypatch.setattr(shutil, "which", lambda name: None)
    # Expect None when neither wrapper nor system Maven is found.
    codeflash_output = find_maven_executable() # 13.6μs -> 8.93μs (52.2% faster)


def test_ignores_empty_string_from_which(monkeypatch, tmp_path):
    # If shutil.which returns an empty string (falsy), function should treat it as not found.
    monkeypatch.chdir(tmp_path)
    monkeypatch.setattr(shutil, "which", lambda name: "")
    # Expect None because empty string is falsy and treated like "not found".
    codeflash_output = find_maven_executable() # 13.3μs -> 8.87μs (49.5% faster)


def test_directory_named_mvnw_counts_as_exists(tmp_path, monkeypatch):
    # Create a directory named "mvnw" (os.path.exists returns True for directories).
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").mkdir()
    # The function checks os.path.exists only, so it should return "./mvnw" even if it's a directory.
    codeflash_output = find_maven_executable() # 5.50μs -> 3.11μs (77.1% faster)


def test_symlink_wrapper_to_existing_target(tmp_path, monkeypatch):
    # Create a real target file and a symlink named "mvnw" pointing to it.
    monkeypatch.chdir(tmp_path)
    target = tmp_path / "real_mvnw"
    target.write_text("#!/bin/sh\necho real\n")
    symlink = tmp_path / "mvnw"
    # Create a symlink; ensure platform supports it (on Windows this may require admin, so skip if not possible).
    try:
        symlink.symlink_to(target)
    except (OSError, NotImplementedError):
        pytest.skip("Symlinks not supported in this environment")
    # The symlink points to an existing file, so os.path.exists should be True and wrapper detected.
    codeflash_output = find_maven_executable() # 7.11μs -> 4.56μs (56.1% faster)


def test_wrapper_has_precedence_over_system_mvn(monkeypatch, tmp_path):
    # Even if shutil.which finds a system mvn, a wrapper present in cwd must take precedence.
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    monkeypatch.setattr(shutil, "which", lambda name: "/usr/local/bin/mvn")
    # Confirm wrapper is returned, not the system path.
    codeflash_output = find_maven_executable() # 5.59μs -> 3.33μs (68.1% faster)


def test_large_number_of_files_with_wrapper_present(tmp_path, monkeypatch):
    # Create many files to simulate a crowded project directory.
    monkeypatch.chdir(tmp_path)
    # Create 500 dummy files (well under the 1000-element limit).
    for i in range(500):
        (tmp_path / f"file_{i}.txt").write_text(f"dummy {i}")
    # Place the wrapper among many files and confirm detection remains correct.
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    # The function should still return the wrapper path quickly and correctly.
    codeflash_output = find_maven_executable() # 6.15μs -> 3.47μs (77.4% faster)


def test_large_number_of_files_without_wrapper_uses_system_mvn(monkeypatch, tmp_path):
    # With many files but no wrapper, the function should fall back to shutil.which.
    monkeypatch.chdir(tmp_path)
    for i in range(250):
        (tmp_path / f"other_{i}.data").write_text("x" * 10)
    # Simulate a system Maven found on PATH.
    monkeypatch.setattr(shutil, "which", lambda name: r"C:\Program Files\Apache\Maven\bin\mvn.bat" if name == "mvn" else None)
    # Return should be the system path provided by shutil.which.
    codeflash_output = find_maven_executable() # 22.0μs -> 16.7μs (31.6% faster)


def test_multiple_invocations_return_same_result(tmp_path, monkeypatch):
    # Ensure stable behavior across multiple calls with same environment.
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    codeflash_output = find_maven_executable(); first = codeflash_output # 5.66μs -> 3.30μs (71.7% faster)
    codeflash_output = find_maven_executable(); second = codeflash_output # 2.88μs -> 1.66μs (73.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import os
import shutil
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

import pytest
from codeflash.languages.java.build_tools import find_maven_executable


def test_finds_mvnw_in_current_directory():
    """Test that find_maven_executable returns ./mvnw when mvnw exists in current directory."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw file
            mvnw_path = os.path.join(tmpdir, "mvnw")
            Path(mvnw_path).touch()
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_finds_mvnw_cmd_in_current_directory():
    """Test that find_maven_executable returns mvnw.cmd when mvnw.cmd exists and mvnw does not."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw.cmd file
            mvnw_cmd_path = os.path.join(tmpdir, "mvnw.cmd")
            Path(mvnw_cmd_path).touch()
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_prefers_mvnw_over_mvnw_cmd():
    """Test that find_maven_executable prefers ./mvnw over mvnw.cmd when both exist."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create both mvnw and mvnw.cmd files
            Path(os.path.join(tmpdir, "mvnw")).touch()
            Path(os.path.join(tmpdir, "mvnw.cmd")).touch()
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_finds_system_maven_when_wrappers_not_present():
    """Test that find_maven_executable finds system Maven when wrappers are not present."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return a maven path
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "/usr/bin/mvn"
                codeflash_output = find_maven_executable(); result = codeflash_output
                mock_which.assert_called_once_with("mvn")
        finally:
            os.chdir(original_dir)


def test_returns_none_when_no_maven_found():
    """Test that find_maven_executable returns None when no Maven executable is found."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return None
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_mvnw_wrapper_takes_priority_over_system_maven():
    """Test that ./mvnw is returned even when system Maven is available."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw file
            Path(os.path.join(tmpdir, "mvnw")).touch()
            # Mock shutil.which to return a system maven path
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "/usr/bin/mvn"
                codeflash_output = find_maven_executable(); result = codeflash_output
                mock_which.assert_not_called()
        finally:
            os.chdir(original_dir)


def test_mvnw_cmd_takes_priority_over_system_maven():
    """Test that mvnw.cmd is returned even when system Maven is available."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw.cmd file
            Path(os.path.join(tmpdir, "mvnw.cmd")).touch()
            # Mock shutil.which to return a system maven path
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "/usr/bin/mvn"
                codeflash_output = find_maven_executable(); result = codeflash_output
                mock_which.assert_not_called()
        finally:
            os.chdir(original_dir)


def test_handles_system_maven_with_absolute_path():
    """Test that find_maven_executable correctly returns absolute path for system Maven."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return an absolute path
            with patch('shutil.which') as mock_which:
                absolute_path = "/opt/maven/bin/mvn"
                mock_which.return_value = absolute_path
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_handles_system_maven_with_relative_path():
    """Test that find_maven_executable correctly returns relative path for system Maven."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return a relative path
            with patch('shutil.which') as mock_which:
                relative_path = "./bin/mvn"
                mock_which.return_value = relative_path
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_mvnw_exists_as_directory_not_file():
    """Test behavior when 'mvnw' exists but is a directory, not a file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw as a directory
            os.makedirs(os.path.join(tmpdir, "mvnw"))
            # Mock shutil.which to return None; a directory still passes the existence
            # check, so this fallback is not expected to be reached
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_mvnw_cmd_exists_as_directory_not_file():
    """Test behavior when 'mvnw.cmd' exists but is a directory, not a file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw.cmd as a directory
            os.makedirs(os.path.join(tmpdir, "mvnw.cmd"))
            # Mock shutil.which to return None
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_empty_string_from_system_maven():
    """Test handling when shutil.which returns an empty string."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return an empty string
            with patch('shutil.which') as mock_which:
                mock_which.return_value = ""
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_whitespace_string_from_system_maven():
    """Test handling when shutil.which returns a whitespace string."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return a whitespace string
            with patch('shutil.which') as mock_which:
                mock_which.return_value = " "
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_finds_maven_in_directory_with_many_files():
    """Test that find_maven_executable works correctly in a directory with many files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create many files in the directory
            for i in range(100):
                Path(os.path.join(tmpdir, f"file_{i}.txt")).touch()
            # Create mvnw
            Path(os.path.join(tmpdir, "mvnw")).touch()
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_finds_mvnw_cmd_in_directory_with_many_files():
    """Test that find_maven_executable finds mvnw.cmd in a directory with many files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create many files in the directory
            for i in range(100):
                Path(os.path.join(tmpdir, f"file_{i}.txt")).touch()
            # Create mvnw.cmd
            Path(os.path.join(tmpdir, "mvnw.cmd")).touch()
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_performance_with_no_maven_in_large_directory():
    """Test that find_maven_executable performs well when returning None in a large directory."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create many files to simulate a large project directory
            for i in range(500):
                Path(os.path.join(tmpdir, f"file_{i}.txt")).touch()
            # Mock shutil.which to return None
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_multiple_calls_return_consistent_results():
    """Test that multiple calls to find_maven_executable return consistent results."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw
            Path(os.path.join(tmpdir, "mvnw")).touch()
            # Call find_maven_executable multiple times
            results = [find_maven_executable() for _ in range(50)]
        finally:
            os.chdir(original_dir)


def test_switching_directories_finds_correct_maven():
    """Test that find_maven_executable correctly finds Maven when switching directories."""
    with tempfile.TemporaryDirectory() as tmpdir1:
        with tempfile.TemporaryDirectory() as tmpdir2:
            original_dir = os.getcwd()
            try:
                # First directory with mvnw
                os.chdir(tmpdir1)
                Path(os.path.join(tmpdir1, "mvnw")).touch()
                codeflash_output = find_maven_executable(); result1 = codeflash_output
                # Second directory without mvnw
                os.chdir(tmpdir2)
                with patch('shutil.which') as mock_which:
                    mock_which.return_value = "/usr/bin/mvn"
                    codeflash_output = find_maven_executable(); result2 = codeflash_output
            finally:
                os.chdir(original_dir)


def test_finds_system_maven_with_long_path():
    """Test that find_maven_executable handles system Maven with a very long path."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create a very long path for Maven
            long_path = "/very/long/path/" + "subdirectory/" * 50 + "mvn"
            with patch('shutil.which') as mock_which:
                mock_which.return_value = long_path
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)


def test_finds_system_maven_with_special_characters_in_path():
    """Test that find_maven_executable handles system Maven with special characters in path."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create a path with special characters
            special_path = "/opt/maven-3.8.1/bin/mvn"
            with patch('shutil.which') as mock_which:
                mock_which.return_value = special_path
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from codeflash.languages.java.build_tools import find_maven_executable
def test_find_maven_executable():
find_maven_executable()
🔎 Click to see Concolic Coverage Tests
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_34v0t72u/tmp1x2llvvp/test_concolic_coverage.py::test_find_maven_executable | 81.3μs | 78.4μs | 3.65%✅ |
To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-01T23.07.44
| if os.path.exists("mvnw"): | |
| return "./mvnw" | |
| if os.path.exists("mvnw.cmd"): | |
| if os.access("mvnw", os.F_OK): | |
| return "./mvnw" | |
| if os.access("mvnw.cmd", os.F_OK): |
| while pos < len(content): | ||
| next_open = content.find(open_tag, pos) | ||
| next_open_short = content.find(open_tag_short, pos) | ||
| next_close = content.find(close_tag, pos) | ||
|
|
||
| if next_close == -1: | ||
| return -1 | ||
|
|
||
| # Find the earliest opening tag (if any) | ||
| candidates = [x for x in [next_open, next_open_short] if x != -1 and x < next_close] | ||
| next_open_any = min(candidates) if candidates else len(content) + 1 | ||
|
|
||
| if next_open_any < next_close: | ||
| # Found opening tag first - nested tag | ||
| depth += 1 | ||
| pos = next_open_any + 1 | ||
| else: | ||
| # Found closing tag first | ||
| depth -= 1 | ||
| if depth == 0: | ||
| return next_close | ||
| pos = next_close + len(close_tag) | ||
|
|
⚡️Codeflash found 84% (0.84x) speedup for _find_closing_tag in codeflash/languages/java/build_tools.py
⏱️ Runtime : 1.01 milliseconds → 548 microseconds (best of 233 runs)
📝 Explanation and details
The optimized code achieves an 83% speedup (from 1.01ms to 548μs) by fundamentally changing the search strategy from multiple independent substring searches to a single progressive scan.
Key Optimization:
The original code performs three separate content.find() calls per iteration to locate <tag>, <tag , and </tag> patterns, then constructs a candidate list to determine which appears first. This results in redundant scanning of the same content regions multiple times.
The optimized version instead:
- Finds the next `<` character once with `content.find("<", pos)`
- Uses `content.startswith()` at that position to check if it's a relevant opening or closing tag
- Eliminates the candidate list construction and `min()` operation
Why This Is Faster:
- Reduced string searches: One `find("<")` call instead of three `find()` calls searching for longer patterns
- Earlier bailout: When no `<` is found, we immediately return -1 without further checks
- Eliminated allocations: No list comprehension creating the `candidates` list on each iteration
- Better locality: `startswith()` checks are O(k) where k is the tag length, performed only once at the found position
Performance Characteristics:
The test results show the optimization excels with:
- Nested same-name tags: `test_large_nested_tags_scalability` shows 680% speedup (713μs → 91.5μs) for 200 nested levels
- Simple structures: Most simple cases show 50-100% speedup (e.g., `test_basic_single_pair` 55.9% faster)
- Missing closing tags: `test_performance_with_large_string_no_match` shows 745% speedup (13.7μs → 1.62μs)
The optimization performs noticeably worse on content with many differently named tags (e.g., test_large_content_simple is 90% slower, 6.07μs → 62.7μs) because it must stop at every < character, including those that aren't relevant to the target tag. However, the overall runtime improvement in typical XML parsing scenarios (nested same-name tags, sequential scanning) makes this an acceptable trade-off.
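The two search strategies described in this comment can be compared side by side with a minimal sketch. The function names `closing_tag_multi_find` and `closing_tag_single_scan` are hypothetical, and the initialization (`depth = 1`, `pos = start_pos + 1`) is an assumption, since the setup code before the loop is not shown in this thread. It is not the actual `_find_closing_tag` implementation.

```python
from __future__ import annotations


def closing_tag_multi_find(content: str, start_pos: int, tag_name: str) -> int:
    """Three find() calls per iteration, as in the code under review."""
    open_tag = f"<{tag_name}>"
    open_tag_short = f"<{tag_name} "
    close_tag = f"</{tag_name}>"
    depth = 1            # assumed: start_pos points at the opening tag
    pos = start_pos + 1  # assumed: skip past the '<' of that opening tag
    while pos < len(content):
        next_open = content.find(open_tag, pos)
        next_open_short = content.find(open_tag_short, pos)
        next_close = content.find(close_tag, pos)
        if next_close == -1:
            return -1
        # Opening tags only matter if they appear before the next closing tag.
        candidates = [x for x in (next_open, next_open_short) if x != -1 and x < next_close]
        if candidates:
            depth += 1
            pos = min(candidates) + 1
        else:
            depth -= 1
            if depth == 0:
                return next_close
            pos = next_close + len(close_tag)
    return -1


def closing_tag_single_scan(content: str, start_pos: int, tag_name: str) -> int:
    """One find('<') per iteration, then classify the tag with startswith()."""
    open_tag = f"<{tag_name}>"
    open_tag_short = f"<{tag_name} "
    close_tag = f"</{tag_name}>"
    depth = 1
    pos = start_pos + 1
    while True:
        next_lt = content.find("<", pos)
        if next_lt == -1:
            return -1
        if content.startswith(close_tag, next_lt):
            depth -= 1
            if depth == 0:
                return next_lt
            pos = next_lt + len(close_tag)
        elif content.startswith(open_tag, next_lt) or content.startswith(open_tag_short, next_lt):
            depth += 1
            pos = next_lt + 1
        else:
            # A '<' that belongs to some other tag: skip it and keep scanning.
            pos = next_lt + 1


if __name__ == "__main__":
    samples = [
        ("<root>hello</root>", "root"),
        ("<a><a>inner</a>outer</a>", "a"),
        ("<a><ab></ab></a>", "a"),
        ("<x>no close here", "x"),
    ]
    for text, tag in samples:
        assert closing_tag_multi_find(text, 0, tag) == closing_tag_single_scan(text, 0, tag)
```

The final `else` branch is where the slowdown on inputs with many unrelated tags comes from: every `<` costs one loop iteration, whereas the multi-find version jumps straight to the next tag of interest.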
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 53 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 3 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Click to see Generated Regression Tests
from __future__ import annotations
# imports
import pytest # used for our unit tests
from codeflash.languages.java.build_tools import _find_closing_tag
def test_basic_single_pair():
# Basic: single matching pair should return the index of the closing tag
content = "<root>hello</root>"
start = content.find("<root") # position of the opening tag
expected_close = content.find("</root>") # expected position of closing tag
# The function should find the closing tag start index
codeflash_output = _find_closing_tag(content, start, "root") # 2.65μs -> 1.70μs (55.9% faster)
def test_nested_same_tag_simple():
# Nested tags of same name: outer must match its own closing tag, not inner
content = "<a><a>inner</a>outer</a>"
start_outer = content.find("<a>") # first opening tag
# expected closing for outermost is the last occurrence of "</a>"
expected_outer_close = content.rfind("</a>")
codeflash_output = _find_closing_tag(content, start_outer, "a") # 5.10μs -> 2.63μs (93.5% faster)
def test_with_attributes_and_spaces():
# Opening tags with attributes (using "<tag " form) must be recognized as openings
content = "<tag attr='1'>text<tag attr2='2'>inner</tag></tag>"
start = content.find("<tag") # first opening (with attributes)
expected_close = content.rfind("</tag>")
codeflash_output = _find_closing_tag(content, start, "tag") # 5.09μs -> 2.60μs (96.1% faster)
def test_missing_closing_returns_minus_one():
# When a closing tag is missing entirely, the function should return -1
content = "<x>no close here"
start = content.find("<x")
codeflash_output = _find_closing_tag(content, start, "x") # 1.75μs -> 1.36μs (28.7% faster)
def test_similar_tag_names_not_confused():
# Ensure tags with similar names (e.g., <a> vs <ab>) do not confuse matching
content = "<a><ab></ab></a>"
start = content.find("<a")
expected_close = content.find("</a>")
# The function should match the </a> closing tag, not get fooled by <ab>
codeflash_output = _find_closing_tag(content, start, "a") # 2.58μs -> 2.50μs (3.61% faster)
def test_self_closing_tag_returns_minus_one():
# Self-closing tags like <a/> have no corresponding </a>, so result should be -1
content = "<a/>"
start = content.find("<a")
# Even though start points to the tag, there is no closing tag, so expect -1
codeflash_output = _find_closing_tag(content, start, "a") # 1.55μs -> 1.27μs (22.1% faster)
def test_start_pos_not_zero_and_multiple_instances():
# When there are multiple sibling tags, ensure we can target the second one by start_pos
content = "pre<a>one</a><a>two</a>post"
# locate the second <a> by searching after the first one
first = content.find("<a>")
second = content.find("<a>", first + 1)
expected_close_second = content.find("</a>", second)
# The function should find the closing tag corresponding to the second opening
codeflash_output = _find_closing_tag(content, second, "a") # 2.35μs -> 1.43μs (64.3% faster)
def test_open_tag_with_space_only_and_plain_variant_later():
# If only an open_tag_short appears (i.e., "<tag " with attributes) before a closing,
# the algorithm must still count it as an opening.
content = "<b attr=1><b>inner</b></b>"
start = content.find("<b")
# ensure that the outer closing is matched
expected_close_outer = content.rfind("</b>")
codeflash_output = _find_closing_tag(content, start, "b") # 4.91μs -> 2.40μs (105% faster)
def test_partial_start_pos_inside_opening_still_finds_closing():
# If start_pos is slightly offset (caller error), the code still attempts to find a closing.
# This ensures the function is somewhat robust to non-zero offsets inside the opening tag.
content = "<a>text</a>"
actual_open = content.find("<a>")
# pick a start_pos one character after the '<' (inside the opening)
start_offset = actual_open + 1
# Even if start_pos is not exactly the '<', the function should still locate the closing tag
expected_close = content.find("</a>")
codeflash_output = _find_closing_tag(content, start_offset, "a") # 2.36μs -> 1.44μs (63.8% faster)
def test_multiple_opening_variants_only_open_tag_short_exists():
# Only "<tag " variant exists (no plain "<tag>") - ensure detection of nested openings works
content = "<div class='x'><div id='y'></div></div>"
start = content.find("<div")
expected_close = content.rfind("</div>")
codeflash_output = _find_closing_tag(content, start, "div") # 4.86μs -> 2.60μs (86.5% faster)
def test_large_nested_tags_scalability():
# Large-scale nested tags to test stack/depth handling but keep under 1000 elements.
# Create 200 nested tags: <t><t>...x...</t></t>...
depth = 200
open_tags = "<t>" * depth
close_tags = "</t>" * depth
content = open_tags + "X" + close_tags
# start position of the outermost opening tag
start = content.find("<t")
# The closing index for the outermost is the last </t>
expected_outer_close = content.rfind("</t>")
# The function should handle many nested levels and return the outermost closing index
codeflash_output = _find_closing_tag(content, start, "t") # 713μs -> 91.5μs (680% faster)
def test_interleaved_other_tags_do_not_affect_depth():
# Tags of other names between nested tags should not affect counting for the target tag_name.
content = "<x><a><b></b><a><b></b></a></a></x>"
# There are nested <a> tags with other tags interleaved; find the outermost <a>
start = content.find("<a")
# expected closing is the last </a> corresponding to the outermost
expected_close = content.rfind("</a>")
codeflash_output = _find_closing_tag(content, start, "a") # 5.06μs -> 3.96μs (27.8% faster)
def test_no_opening_tag_at_start_pos_returns_minus_one_or_misleading():
# If start_pos points past any opening tag (e.g., at end of content), the function should return -1
content = "<z></z>"
# choose a start_pos beyond content length to simulate incorrect caller input
start = len(content) + 5
# Since pos will be >= len(content), the while loop will not execute and -1 is returned
codeflash_output = _find_closing_tag(content, start, "z") # 1.12μs -> 1.28μs (12.5% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.build_tools import _find_closing_tag
def test_simple_single_tag():
"""Test finding closing tag for a simple tag with no nesting."""
content = "<root>content</root>"
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.75μs -> 1.78μs (54.0% faster)
def test_simple_tag_with_content():
"""Test finding closing tag for a tag containing text content."""
content = "<div>Hello World</div>"
codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.67μs -> 1.81μs (47.5% faster)
def test_tag_with_whitespace_content():
"""Test finding closing tag when content contains whitespace."""
content = "<span> </span>"
codeflash_output = _find_closing_tag(content, 0, "span"); result = codeflash_output # 2.67μs -> 1.73μs (53.8% faster)
def test_empty_tag():
"""Test finding closing tag for an empty tag."""
content = "<empty></empty>"
codeflash_output = _find_closing_tag(content, 0, "empty"); result = codeflash_output # 2.58μs -> 1.63μs (57.6% faster)
def test_tag_with_attributes():
"""Test finding closing tag for a tag with attributes."""
content = '<element class="test">content</element>'
codeflash_output = _find_closing_tag(content, 0, "element"); result = codeflash_output # 2.58μs -> 1.68μs (53.6% faster)
def test_tag_with_multiple_attributes():
"""Test finding closing tag for a tag with multiple attributes."""
content = '<div id="main" class="container">text</div>'
codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.70μs -> 1.79μs (50.3% faster)
def test_no_closing_tag():
"""Test when closing tag is missing - should return -1."""
content = "<root>content"
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 1.79μs -> 1.42μs (26.2% faster)
def test_nested_tags_one_level():
"""Test finding closing tag with one level of nesting."""
content = "<parent><child></child></parent>"
codeflash_output = _find_closing_tag(content, 0, "parent"); result = codeflash_output # 2.67μs -> 2.67μs (0.000% faster)
def test_nested_tags_multiple_levels():
"""Test finding closing tag with multiple levels of nesting."""
content = "<a><b><c></c></b></a>"
codeflash_output = _find_closing_tag(content, 0, "a"); result = codeflash_output # 2.75μs -> 3.41μs (19.4% slower)
def test_nested_tags_same_name():
"""Test finding closing tag when nested tags have the same name."""
content = "<div>outer<div>inner</div>text</div>"
codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 5.21μs -> 2.62μs (98.5% faster)
def test_nested_tags_same_name_multiple():
"""Test multiple nested tags of the same name."""
content = "<tag>level1<tag>level2</tag>level1</tag>"
codeflash_output = _find_closing_tag(content, 0, "tag"); result = codeflash_output # 4.81μs -> 2.50μs (92.1% faster)
def test_closing_tag_at_end():
"""Test when closing tag is at the very end of content."""
content = "<root>text</root>"
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.62μs -> 1.68μs (55.9% faster)
def test_tag_name_is_single_character():
"""Test with single character tag name."""
content = "<a>content</a>"
codeflash_output = _find_closing_tag(content, 0, "a"); result = codeflash_output # 2.57μs -> 1.74μs (47.7% faster)
def test_tag_name_is_long():
"""Test with long tag name."""
content = "<verylongtagnamethatiscomplex>content</verylongtagnamethatiscomplex>"
codeflash_output = _find_closing_tag(content, 0, "verylongtagnamethatiscomplex"); result = codeflash_output # 2.73μs -> 1.78μs (52.8% faster)
def test_tag_with_numbers():
"""Test tag name containing numbers."""
content = "<div2>text</div2>"
codeflash_output = _find_closing_tag(content, 0, "div2"); result = codeflash_output # 2.53μs -> 1.64μs (54.2% faster)
def test_tag_with_hyphens():
"""Test tag name containing hyphens."""
content = "<my-tag>content</my-tag>"
codeflash_output = _find_closing_tag(content, 0, "my-tag"); result = codeflash_output # 2.56μs -> 1.71μs (49.6% faster)
def test_nested_different_tags():
"""Test nested tags with different names."""
content = "<outer><inner>text</inner></outer>"
codeflash_output = _find_closing_tag(content, 0, "outer"); result = codeflash_output # 2.62μs -> 2.79μs (6.08% slower)
def test_multiple_nested_with_attributes():
"""Test nested tags where some have attributes."""
content = '<root id="1"><child class="x">content</child></root>'
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.63μs -> 2.58μs (1.93% faster)
def test_tag_with_attribute_containing_tag_like_string():
"""Test tag with attribute value containing tag-like content."""
content = '<div data="<test>">content</div>'
codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.65μs -> 2.28μs (16.2% faster)
def test_start_pos_not_zero():
"""Test when start_pos is not at the beginning."""
content = "text<root>content</root>more"
codeflash_output = _find_closing_tag(content, 4, "root"); result = codeflash_output # 2.50μs -> 1.70μs (46.4% faster)
def test_deeply_nested_same_tags():
"""Test deeply nested tags with the same name."""
content = "<x><x><x></x></x></x>"
codeflash_output = _find_closing_tag(content, 0, "x"); result = codeflash_output # 6.69μs -> 3.00μs (123% faster)
def test_tag_with_newlines():
"""Test tag with newline characters in content."""
content = "<div>\nline1\nline2\n</div>"
codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.62μs -> 1.72μs (52.4% faster)
def test_tag_with_tabs():
"""Test tag with tab characters in content."""
content = "<div>\ttab\tcontent\t</div>"
codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.52μs -> 1.71μs (47.4% faster)
def test_consecutive_opening_tags():
"""Test multiple consecutive opening tags of the same name."""
content = "<span><span>text</span></span>"
codeflash_output = _find_closing_tag(content, 0, "span"); result = codeflash_output # 4.99μs -> 2.56μs (94.5% faster)
def test_tag_after_first_but_before_close():
"""Test when there's another tag between opening and closing."""
content = "<root><other>text</other></root>"
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.67μs -> 2.69μs (1.11% slower)
def test_closing_tag_without_corresponding_opening():
"""Test when there's a closing tag but it doesn't match our opening."""
content = "<root>text</other>"
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 1.75μs -> 2.02μs (13.3% slower)
def test_tag_name_with_underscore():
"""Test tag name with underscore characters."""
content = "<my_tag>content</my_tag>"
codeflash_output = _find_closing_tag(content, 0, "my_tag"); result = codeflash_output # 2.63μs -> 1.68μs (56.6% faster)
def test_very_short_content():
"""Test with minimal content - just opening tag."""
content = "<x>"
codeflash_output = _find_closing_tag(content, 0, "x"); result = codeflash_output # 1.68μs -> 1.40μs (20.0% faster)
def test_tag_with_self_closing_like_syntax():
"""Test tag that might look self-closing but isn't."""
content = "<br />content</br>"
codeflash_output = _find_closing_tag(content, 5, "br"); result = codeflash_output # 2.64μs -> 1.72μs (53.5% faster)
def test_large_content_simple():
"""Test with large content size but simple structure."""
# Create content with many nested levels (up to 100 levels)
opening = "".join(f"<tag{i}>" for i in range(100))
closing = "".join(f"</tag{i}>" for i in range(99, -1, -1))
content = opening + "CONTENT" + closing
# Find the closing tag for the first tag
codeflash_output = _find_closing_tag(content, 0, "tag0"); result = codeflash_output # 6.07μs -> 62.7μs (90.3% slower)
def test_large_content_wide_structure():
"""Test with many tags at the same level."""
# Create content with many sibling tags
content = "<root>"
for i in range(100):
content += f"<item{i}>content</item{i}>"
content += "</root>"
# Find the closing tag for root
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 6.57μs -> 63.2μs (89.6% slower)
def test_large_nested_tags_finding_correct_close():
"""Test that with many nested tags, we find the correct closing tag."""
# Create deeply nested structure: <a><b><c>...<z></z>...</c></b></a>
alphabet = "abcdefghijklmnopqrstuvwxyz"
opening = "".join(f"<{char}>" for char in alphabet)
closing = "".join(f"</{char}>" for char in reversed(alphabet))
content = opening + "CORE" + closing
# Find the closing tag for 'a' (the outermost)
codeflash_output = _find_closing_tag(content, 0, "a"); result = codeflash_output # 3.12μs -> 16.8μs (81.4% slower)
def test_large_content_with_many_attributes():
"""Test with large content containing tags with many attributes."""
# Create a tag with many attributes
attributes = ' '.join(f'attr{i}="value{i}"' for i in range(50))
content = f'<root {attributes}>content</root>'
# Find the closing tag
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 4.56μs -> 1.88μs (142% faster)
def test_large_content_mixed_nesting():
"""Test with large content containing mixed nesting patterns."""
# Create content with alternating levels of nesting
content = "<root>"
for i in range(50):
content += f"<level1{i}><level2{i}>content</level2{i}></level1{i}>"
content += "</root>"
# Find the closing tag for root
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 6.81μs -> 62.9μs (89.2% slower)
def test_large_content_same_name_nesting():
"""Test with many nested tags of the same name."""
# Create content with 50 levels of the same tag nested
content = ""
for i in range(50):
content += "<div>"
content += "CONTENT"
for i in range(50):
content += "</div>"
# Find the closing tag for the first div
codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 102μs -> 24.2μs (325% faster)
def test_large_content_finding_middle_tag():
"""Test finding a closing tag for a tag in the middle of large content."""
# Create content with multiple root-level tags
content = "<root1>content</root1>"
content += "<root2><nested>content</nested></root2>"
for i in range(50):
content += f"<item{i}>content</item{i}>"
# Find the closing tag for root2 which has nesting
start_pos = content.find("<root2>")
codeflash_output = _find_closing_tag(content, start_pos, "root2"); result = codeflash_output # 3.87μs -> 2.58μs (49.6% faster)
def test_performance_with_large_string_no_match():
"""Test performance when there's no closing tag in large content."""
# Create large content without closing tag
content = "<root>" + "x" * 10000
# Should return -1 efficiently
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 13.7μs -> 1.62μs (745% faster)
def test_large_content_multiple_tag_searches():
"""Test finding closing tags for multiple tags in large content."""
# Create content with nested different tag types
content = "<wrapper>"
for i in range(100):
content += f"<container{i}><item>data</item></container{i}>"
content += "</wrapper>"
# Find the closing tag for wrapper
codeflash_output = _find_closing_tag(content, 0, "wrapper"); result = codeflash_output # 7.97μs -> 123μs (93.5% slower)
def test_large_content_with_special_characters():
"""Test large content with special characters in values."""
# Create content with special characters
special_chars = "!@#$%^&*()_+-=[]{}|;:',.<>?/~`"
content = f"<root data=\"{special_chars * 10}\">content</root>"
# Find the closing tag
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 3.24μs -> 5.34μs (39.4% slower)
def test_large_content_with_xml_entities():
"""Test large content with XML entities."""
# Create content with XML entities
content = "<root>Text with &lt; &gt; &amp; entities</root>"
# Find the closing tag
codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.69μs -> 1.73μs (54.9% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from codeflash.languages.java.build_tools import _find_closing_tag
def test__find_closing_tag():
_find_closing_tag('<></>', -1, '')
def test__find_closing_tag_2():
_find_closing_tag('', -2, '')
def test__find_closing_tag_3():
_find_closing_tag('</>', -1, '')
🔎 Click to see Concolic Coverage Tests
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_34v0t72u/tmpmp8y47yq/test_concolic_coverage.py::test__find_closing_tag | 4.23μs | 2.50μs | 69.5%✅ |
| codeflash_concolic_34v0t72u/tmpmp8y47yq/test_concolic_coverage.py::test__find_closing_tag_2 | 1.79μs | 1.44μs | 24.3%✅ |
| codeflash_concolic_34v0t72u/tmpmp8y47yq/test_concolic_coverage.py::test__find_closing_tag_3 | 2.48μs | 1.67μs | 47.9%✅ |
To test or edit this optimization locally git merge codeflash/optimize-pr1199-2026-02-01T23.32.35
Click to see suggested changes
| while pos < len(content): | |
| next_open = content.find(open_tag, pos) | |
| next_open_short = content.find(open_tag_short, pos) | |
| next_close = content.find(close_tag, pos) | |
| if next_close == -1: | |
| return -1 | |
| # Find the earliest opening tag (if any) | |
| candidates = [x for x in [next_open, next_open_short] if x != -1 and x < next_close] | |
| next_open_any = min(candidates) if candidates else len(content) + 1 | |
| if next_open_any < next_close: | |
| # Found opening tag first - nested tag | |
| depth += 1 | |
| pos = next_open_any + 1 | |
| else: | |
| # Found closing tag first | |
| depth -= 1 | |
| if depth == 0: | |
| return next_close | |
| pos = next_close + len(close_tag) | |
| len_close = len(close_tag) | |
| # Scan for the next '<' and then determine whether it's an open/close of interest. | |
| while True: | |
| next_lt = content.find("<", pos) | |
| if next_lt == -1: | |
| return -1 | |
| # Check for the relevant closing tag first | |
| if content.startswith(close_tag, next_lt): | |
| # Found closing tag first | |
| depth -= 1 | |
| if depth == 0: | |
| return next_lt | |
| pos = next_lt + len_close | |
| continue | |
| # Check for nested opening tags of the exact forms we consider | |
| if content.startswith(open_tag, next_lt) or content.startswith(open_tag_short, next_lt): | |
| depth += 1 | |
| pos = next_lt + 1 | |
| continue | |
| # Not an open/close we're tracking; move on | |
| pos = next_lt + 1 | |
| part_text = source_bytes[child.start_byte : child.end_byte].decode("utf8") | ||
| parts.append(part_text) | ||
|
|
||
| return " ".join(parts).strip() |
⚡️Codeflash found 33% (0.33x) speedup for _extract_type_declaration in codeflash/languages/java/context.py
⏱️ Runtime : 133 microseconds → 100 microseconds (best of 15 runs)
📝 Explanation and details
The optimized code achieves a 33% runtime improvement (from 133μs to 100μs) by deferring UTF-8 decoding until after joining all byte slices together, rather than decoding each part individually.
Key Optimization:
The original code decoded each child node's byte slice immediately:
part_text = source_bytes[child.start_byte : child.end_byte].decode("utf8")
parts.append(part_text)
return " ".join(parts).strip()
The optimized code collects raw byte slices first, then performs a single decode operation:
parts.append(source_bytes[child.start_byte : child.end_byte])
return b" ".join(parts).decode("utf8").strip()
Why This is Faster:
- Reduced decode operations: Instead of calling `decode("utf8")` once per child node (~527 times in profiled runs), the optimization calls it just once on the final joined bytes
- Byte-level joining: `b" ".join()` on bytes is faster than `" ".join()` on strings, as it operates on raw bytes without character encoding overhead
- Better memory efficiency: Avoids creating intermediate string objects for each part
Performance Impact by Test Case:
The optimization shows particularly strong gains on tests with many tokens:
- 37.6% faster on large-scale test with 500 tokens
- 15-16% faster on typical multi-token declarations (interface, enum, unknown types)
- Neutral/slight regression on trivial cases (empty children) where the overhead is negligible
Line Profiler Evidence:
The bottleneck shifted from line 27 in the original (34.3% of time spent on decode + slice) to line 26 in the optimized version (44.2% on append only, but with 23% less total time overall). The single decode at return now takes 3.1% vs the original's 23.2% spent on multiple appends of decoded strings.
This optimization is particularly valuable for parsing Java files with complex type declarations containing many modifiers, annotations, and generic type parameters.
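As a rough illustration of the change described above, the following sketch puts both variants next to each other on stand-in nodes. The helper names (`make_nodes`, `declaration_decode_each`, `declaration_join_then_decode`) are hypothetical, and `SimpleNamespace` objects stand in for tree-sitter nodes in the same way the generated tests do. It assumes every slice starts and ends on a UTF-8 character boundary, which should hold for syntax-node byte offsets.

```python
from __future__ import annotations

from types import SimpleNamespace


def make_nodes(source: bytes, tokens: list[str]) -> list[SimpleNamespace]:
    """Build node-like objects whose byte ranges cover each token in order."""
    nodes, offset = [], 0
    for token in tokens:
        raw = token.encode("utf8")
        start = source.index(raw, offset)
        nodes.append(SimpleNamespace(type="token", start_byte=start, end_byte=start + len(raw)))
        offset = start + len(raw)
    return nodes


def declaration_decode_each(children, source_bytes: bytes, body_type: str = "class_body") -> str:
    """Decode every slice separately, then join the resulting strings."""
    parts = []
    for child in children:
        if child.type == body_type:
            break  # stop before the body, keep only the declaration header
        parts.append(source_bytes[child.start_byte : child.end_byte].decode("utf8"))
    return " ".join(parts).strip()


def declaration_join_then_decode(children, source_bytes: bytes, body_type: str = "class_body") -> str:
    """Collect raw byte slices, join them once, and decode once at the end."""
    parts = []
    for child in children:
        if child.type == body_type:
            break
        parts.append(source_bytes[child.start_byte : child.end_byte])
    return b" ".join(parts).decode("utf8").strip()


source = "public class Café extends Base { }".encode("utf8")
children = make_nodes(source, ["public", "class", "Café", "extends", "Base"])
# Body node covering everything from '{' to the end of the source.
children.append(SimpleNamespace(type="class_body", start_byte=source.index(b"{"), end_byte=len(source)))

if __name__ == "__main__":
    print(declaration_decode_each(children, source))
    print(declaration_join_then_decode(children, source))
```

Both functions give the same string here even though `Café` contains a multi-byte character, because the slices are joined with an ASCII space and decoded as one valid UTF-8 sequence.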
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 8 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Click to see Generated Regression Tests
from __future__ import annotations
from types import \
SimpleNamespace # used to create lightweight node-like objects
# imports
import pytest # used for our unit tests
from codeflash.languages.java.context import _extract_type_declaration
from tree_sitter import Node
# Helper utilities for tests ---------------------------------------------------
def _make_children_from_tokens_and_body(source: bytes, token_texts: list[str], body_index: int | None, body_type_name: str):
"""
Construct a list of SimpleNamespace children where each token corresponds to a
slice in `source`. Tokens are expected to appear in `source` separated by a single
space. `body_index` indicates the index in token_texts at which a body node should
be inserted; if None, no body node is inserted.
Each produced child has attributes: type, start_byte, end_byte.
"""
children = []
# locate tokens sequentially in source to compute byte offsets
offset = 0
# Copy token_texts to avoid mutating caller's list
for idx, token in enumerate(token_texts):
# find token starting at or after offset
token_bytes = token.encode("utf8")
pos = source.find(token_bytes, offset)
if pos == -1:
raise ValueError(f"Token {token!r} not found in source (from offset {offset}).")
start = pos
end = pos + len(token_bytes)
children.append(SimpleNamespace(type="token", start_byte=start, end_byte=end))
offset = end + 1 # assume tokens separated by at least one byte (space)
# Insert body node if requested. Body will cover from the start of the token at body_index to end of source
if body_index is not None:
# Determine where the body token starts; it should be the token at body_index
if not (0 <= body_index < len(children)):
# if body_index points past tokens, place body at the end
body_start = len(source)
else:
body_start = children[body_index].start_byte
body_child = SimpleNamespace(type=body_type_name, start_byte=body_start, end_byte=len(source))
# place body child at the end of the children list (function only checks type and breaks)
children.append(body_child)
return children
def test_interface_declaration_stops_before_interface_body():
# Interface should use 'interface_body' as the body node name and stop before it.
source_str = "public interface MyInterface extends BaseInterface { void foo(); }"
source = source_str.encode("utf8")
tokens = ["public", "interface", "MyInterface", "extends", "BaseInterface"]
# body_index points to the token position where we consider the body starts (token count)
children = _make_children_from_tokens_and_body(source, tokens, body_index=5, body_type_name="interface_body")
node = SimpleNamespace(children=children)
codeflash_output = _extract_type_declaration(node, source, "interface"); decl = codeflash_output # 3.67μs -> 3.18μs (15.4% faster)
def test_enum_without_body_returns_all_parts():
# If no enum_body node exists among children, function should not break early and should include all parts.
source_str = "public enum Color RED GREEN BLUE"
source = source_str.encode("utf8")
tokens = ["public", "enum", "Color"]
# Do not insert a body node. The function should return everything from the supplied children.
children = _make_children_from_tokens_and_body(source, tokens, body_index=None, body_type_name="enum_body")
node = SimpleNamespace(children=children)
codeflash_output = _extract_type_declaration(node, source, "enum"); decl = codeflash_output # 2.81μs -> 2.54μs (10.2% faster)
def test_empty_children_returns_empty_string():
# Edge case: type_node has no children -> return empty string (after join & strip)
node = SimpleNamespace(children=[])
source = b""
codeflash_output = _extract_type_declaration(node, source, "class"); decl = codeflash_output # 1.32μs -> 1.34μs (1.49% slower)
def test_unknown_type_kind_defaults_to_class_body():
# If type_kind is unknown, body_type defaults to 'class_body'
source_str = "myModifier customType Foo extends Bar { body }"
source = source_str.encode("utf8")
tokens = ["myModifier", "customType", "Foo", "extends", "Bar"]
# Insert a 'class_body' child so unknown maps to class_body and the function stops before it
children = _make_children_from_tokens_and_body(source, tokens, body_index=5, body_type_name="class_body")
node = SimpleNamespace(children=children)
codeflash_output = _extract_type_declaration(node, source, "unknown_kind"); decl = codeflash_output # 3.76μs -> 3.23μs (16.5% faster)
def test_child_with_empty_slice_produces_empty_segment():
# If a child has start_byte == end_byte, that yields an empty decoded string.
# The function will include it as an element; the final join will contain extra space for it.
# Construct source and children manually where one child corresponds to an empty slice.
source_str = "public class MyClass"
source = source_str.encode("utf8")
# Create two real children for 'public' and 'class' and a third child that's empty (start=end)
# The third child will contribute an empty string and show up as an additional space once joined.
# We then append the name child and a body to stop before.
public_pos = source.find(b"public")
class_pos = source.find(b"class")
name_pos = source.find(b"MyClass")
# children as SimpleNamespace objects
children = [
SimpleNamespace(type="token", start_byte=public_pos, end_byte=public_pos + len(b"public")),
SimpleNamespace(type="token", start_byte=class_pos, end_byte=class_pos + len(b"class")),
SimpleNamespace(type="token", start_byte=10, end_byte=10), # empty slice in the middle
SimpleNamespace(type="token", start_byte=name_pos, end_byte=name_pos + len(b"MyClass")),
SimpleNamespace(type="class_body", start_byte=name_pos + len(b"MyClass") + 1, end_byte=len(source)),
]
node = SimpleNamespace(children=children)
codeflash_output = _extract_type_declaration(node, source, "class"); decl = codeflash_output # 3.32μs -> 2.87μs (15.7% faster)
def test_large_number_of_tokens_stops_at_body_and_scales_correctly():
# Large scale test with many tokens (but under 1000).
# Ensure the function correctly concatenates many parts and stops at the body node.
n = 500 # number of tokens to include before body
tokens = [f"T{i}" for i in range(n)]
# Build source: tokens separated by spaces, then a body starting with '{'
source_str = " ".join(tokens) + " {" + " body" + " }"
source = source_str.encode("utf8")
# Construct children corresponding to tokens and then the body node
children = _make_children_from_tokens_and_body(source, tokens, body_index=n, body_type_name="class_body")
node = SimpleNamespace(children=children)
codeflash_output = _extract_type_declaration(node, source, "class"); decl = codeflash_output # 113μs -> 82.4μs (37.6% faster)
# The declaration should be exactly the tokens joined by single spaces
expected = " ".join(tokens)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.context import _extract_type_declaration
from tree_sitter import Language, Node, Parser
# Helper function to create a tree-sitter node for testing
def _get_parser():
"""Create and return a tree-sitter parser for Java."""
JAVA_LANGUAGE = Language("build/my-languages.so", "java")
parser = Parser()
parser.set_language(JAVA_LANGUAGE)
return parser
def _parse_java_code(code: str) -> Node:
"""Parse Java code and return the root node."""
parser = _get_parser()
tree = parser.parse(code.encode("utf8"))
return tree.root_node
def _find_type_node(root: Node, type_kind: str) -> Node:
"""Find the first type declaration node of the given kind."""
def traverse(node: Node) -> Node | None:
if node.type == type_kind:
return node
for child in node.children:
result = traverse(child)
if result:
return result
return None
return traverse(root)
def test_empty_class_name():
"""Test that function handles class nodes properly (tree-sitter should parse valid Java)."""
code = "public class {} "
To test or edit this optimization locally: `git merge codeflash/optimize-pr1199-2026-02-02T00.37.05`
| part_text = source_bytes[child.start_byte : child.end_byte].decode("utf8") | |
| parts.append(part_text) | |
| return " ".join(parts).strip() | |
| parts.append(source_bytes[child.start_byte : child.end_byte]) | |
| return b" ".join(parts).decode("utf8").strip() |
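The suggested change above appears to swap one `decode` call per child for a single `decode` on the joined bytes. A minimal sketch of why the two forms give the same string, assuming every slice starts and ends on a UTF-8 character boundary (the byte spans below are made up for illustration and are not taken from the PR):

```python
# Hypothetical source and byte spans; in the PR these come from tree-sitter nodes.
source = "public final class Café".encode("utf8")
spans = [(0, 6), (7, 12), (13, 18), (19, len(source))]

# Original shape: decode each slice, then join the resulting strings.
decoded_per_part = " ".join(source[start:end].decode("utf8") for start, end in spans).strip()

# Suggested shape: join the raw byte slices, then decode once.
decoded_once = b" ".join(source[start:end] for start, end in spans).decode("utf8").strip()

print(decoded_per_part == decoded_once)
```

The single-decode form does less work per child, which is consistent with the larger gains reported for the 500-token test.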
…2026-02-03T08.18.57 ⚡️ Speed up function `_add_behavior_instrumentation` by 22% in PR #1199 (`omni-java`)
| body_node = node.child_by_field_name("body") | ||
| if body_node: | ||
| for child in body_node.children: | ||
| self._walk_tree_for_classes(child, source_bytes, classes, is_inner=True) |
⚡️Codeflash found 23% (0.23x) speedup for JavaAnalyzer.find_classes in codeflash/languages/java/parser.py
⏱️ Runtime : 8.35 milliseconds → 6.76 milliseconds (best of 219 runs)
📝 Explanation and details
The optimized code achieves a 23% runtime improvement (8.35ms → 6.76ms) by strategically reducing unnecessary recursive calls when traversing the Java abstract syntax tree.
Key Optimization
The critical change occurs in the inner class detection logic within _walk_tree_for_classes. When processing a class body, the original code recursively explored every child node (1,117 recursive calls), regardless of type:
# Original: recurses on ALL children
for child in body_node.children:
    self._walk_tree_for_classes(child, source_bytes, classes, is_inner=True)
The optimized version adds a type filter before recursing, only processing nodes that are actual class/interface/enum declarations:
# Optimized: recurses only on class-like declarations
for child in body_node.children:
    if child.type in ("class_declaration", "interface_declaration", "enum_declaration"):
        self._walk_tree_for_classes(child, source_bytes, classes, is_inner=True)
Why This Works
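In Java ASTs, class bodies contain many node types (field declarations, method declarations, etc.) that are not themselves type declarations. By filtering early, we avoid descending into those subtrees. Java does allow local and anonymous classes inside method bodies, so this relies on such classes not needing to be reported as inner classes. Line profiler data shows this reduces the recursive call count dramatically: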
- Original: 6,590 type checks, 1,117 inner-class recursive calls
- Optimized: 513 type checks, 68 inner-class recursive calls
This ~94% reduction in inner-class recursion (1,117 → 68) eliminates wasted traversal through non-class nodes.
Performance Impact by Test Case
The optimization particularly excels when Java code contains:
- Large method bodies: 73% faster on classes with 100 methods (3.34ms → 1.93ms)
- Complex class content: 20% faster on classes with multiple fields and methods
- Many inner classes: 3-4% faster across nested class scenarios
Even simple cases benefit from reduced overhead (2-4% improvements), demonstrating consistent gains across diverse Java codebases. The optimization is especially valuable when parsing large Java files or in hot paths where this parser is called repeatedly.
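The effect of the filter can be sketched on a toy tree. This is a minimal illustration, not the `JavaAnalyzer` code: `make_node`, `walk`, and `run` are hypothetical names, and `SimpleNamespace` objects stand in for tree-sitter nodes.

```python
from types import SimpleNamespace

TYPE_DECLARATIONS = ("class_declaration", "interface_declaration", "enum_declaration")


def make_node(node_type, name=None, body=None, children=()):
    """Build a stand-in for a tree-sitter node (hypothetical helper)."""
    return SimpleNamespace(type=node_type, name=name, body=body, children=list(children))


def walk(node, found, stats, filter_children):
    """Collect type declarations, counting every call to this function."""
    stats["calls"] += 1
    if node.type in TYPE_DECLARATIONS:
        found.append(node.name)
        if node.body is not None:
            for child in node.body.children:
                # The filter under discussion: skip fields, methods, etc.
                if filter_children and child.type not in TYPE_DECLARATIONS:
                    continue
                walk(child, found, stats, filter_children)
        return
    for child in node.children:
        walk(child, found, stats, filter_children)


def build_tree():
    """One outer class with 2 fields, 3 methods of 4 statements each, and 1 inner class."""
    fields = [make_node("field_declaration") for _ in range(2)]
    methods = [
        make_node("method_declaration", children=[make_node("expression_statement") for _ in range(4)])
        for _ in range(3)
    ]
    inner = make_node("class_declaration", name="Inner", body=make_node("class_body"))
    outer_body = make_node("class_body", children=fields + methods + [inner])
    outer = make_node("class_declaration", name="Outer", body=outer_body)
    return make_node("program", children=[outer])


def run(filter_children):
    found, stats = [], {"calls": 0}
    walk(build_tree(), found, stats, filter_children)
    return found, stats["calls"]


unfiltered_found, unfiltered_calls = run(filter_children=False)
filtered_found, filtered_calls = run(filter_children=True)
print(unfiltered_found, unfiltered_calls)  # ['Outer', 'Inner'] 20
print(filtered_found, filtered_calls)      # ['Outer', 'Inner'] 3
```

Both runs report the same classes, but the filtered walk never enters the method subtrees, which is why classes with large method bodies see the biggest gain.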
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 121 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Click to see Generated Regression Tests
import pytest
from codeflash.languages.java.parser import JavaAnalyzer, JavaClassNode
class TestJavaAnalyzerFindClassesBasic:
"""Test basic functionality of JavaAnalyzer.find_classes."""
def test_simple_public_class(self):
"""Test finding a simple public class definition."""
source = "public class MyClass {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 27.6μs -> 26.9μs (2.53% faster)
def test_simple_class_without_modifiers(self):
"""Test finding a class without any modifiers."""
source = "class SimpleClass {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 24.0μs -> 23.1μs (4.04% faster)
def test_multiple_top_level_classes(self):
"""Test finding multiple top-level classes in the same file."""
source = """
public class FirstClass {}
class SecondClass {}
public class ThirdClass {}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 46.5μs -> 45.0μs (3.34% faster)
names = [cls.name for cls in result]
def test_class_with_extends(self):
"""Test finding a class that extends another class."""
source = "public class Child extends Parent {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 29.2μs -> 28.6μs (2.20% faster)
def test_class_with_implements(self):
"""Test finding a class that implements an interface."""
source = "public class MyClass implements MyInterface {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 29.9μs -> 29.0μs (3.15% faster)
def test_class_with_multiple_implements(self):
"""Test finding a class that implements multiple interfaces."""
source = "public class MyClass implements Interface1, Interface2, Interface3 {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 34.2μs -> 33.3μs (2.53% faster)
def test_abstract_class(self):
"""Test finding an abstract class."""
source = "public abstract class AbstractClass {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 26.8μs -> 26.1μs (2.85% faster)
def test_final_class(self):
"""Test finding a final class."""
source = "public final class FinalClass {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 26.2μs -> 25.3μs (3.44% faster)
def test_interface_declaration(self):
"""Test finding an interface declaration."""
source = "public interface MyInterface {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 25.4μs -> 24.5μs (3.80% faster)
def test_enum_declaration(self):
"""Test finding an enum declaration."""
source = "public enum MyEnum {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 24.9μs -> 24.2μs (2.90% faster)
def test_class_with_body_content(self):
"""Test finding a class with various body content."""
source = """
public class ClassWithContent {
private int field;
public void method() {}
}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 46.0μs -> 38.2μs (20.7% faster)
class TestJavaAnalyzerFindClassesEdgeCases:
"""Test edge cases and unusual scenarios for JavaAnalyzer.find_classes."""
def test_empty_source_code(self):
"""Test with empty source code."""
source = ""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 7.83μs -> 7.74μs (1.16% faster)
def test_source_with_only_comments(self):
"""Test with source code containing only comments."""
source = """
// This is a comment
/* This is a block comment */
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 12.3μs -> 11.9μs (2.86% faster)
def test_inner_class_detection(self):
"""Test finding inner classes within a class."""
source = """
public class OuterClass {
public class InnerClass {}
}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 38.5μs -> 37.8μs (1.91% faster)
names = [cls.name for cls in result]
def test_multiple_inner_classes(self):
"""Test finding multiple inner classes."""
source = """
public class OuterClass {
public class InnerClass1 {}
private class InnerClass2 {}
protected static class InnerClass3 {}
}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 59.6μs -> 57.6μs (3.48% faster)
def test_nested_inner_classes(self):
"""Test finding deeply nested inner classes."""
source = """
public class Level1 {
public class Level2 {
public class Level3 {}
}
}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 46.1μs -> 44.5μs (3.55% faster)
def test_class_with_extends_and_implements(self):
"""Test class with both extends and implements."""
source = "public class Child extends Parent implements Interface1, Interface2 {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 36.1μs -> 35.4μs (1.89% faster)
def test_static_inner_class(self):
"""Test finding a static inner class."""
source = """
public class Outer {
public static class StaticInner {}
}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 38.4μs -> 37.1μs (3.43% faster)
static_inner = [cls for cls in result if cls.name == "StaticInner"][0]
def test_class_name_with_underscores(self):
"""Test class names containing underscores."""
source = "public class My_Class_Name {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 25.1μs -> 24.5μs (2.41% faster)
def test_class_name_with_numbers(self):
"""Test class names containing numbers."""
source = "public class MyClass123 {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 24.9μs -> 24.3μs (2.14% faster)
def test_abstract_final_class(self):
"""Test a class with both abstract and final modifiers."""
source = "public abstract final class WeirdClass {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 27.6μs -> 26.7μs (3.37% faster)
def test_class_start_and_end_lines(self):
"""Test that start and end line numbers are properly recorded."""
source = """
public class MyClass {
private int x;
}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 35.1μs -> 30.9μs (13.7% faster)
def test_class_source_text_captured(self):
"""Test that the source text of the class is captured."""
source = "public class MyClass {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 24.8μs -> 24.0μs (3.38% faster)
def test_whitespace_variations(self):
"""Test classes with various whitespace patterns."""
source = """
public class MyClass { }
public\tclass\tAnotherClass\t{ }
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 37.7μs -> 36.5μs (3.07% faster)
def test_interface_with_extends(self):
"""Test interface extending another interface."""
source = "public interface ChildInterface extends ParentInterface {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 29.0μs -> 28.1μs (3.10% faster)
def test_enum_with_values(self):
"""Test enum with values."""
source = "public enum MyEnum { VALUE1, VALUE2, VALUE3; }"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 33.3μs -> 30.3μs (10.0% faster)
def test_generic_class_declaration(self):
"""Test class with generic type parameters."""
source = "public class GenericClass<T> {}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 26.8μs -> 26.2μs (2.52% faster)
def test_class_with_annotations(self):
"""Test class with annotations."""
source = """
@Deprecated
@FunctionalInterface
public class AnnotatedClass {}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 36.4μs -> 35.4μs (2.86% faster)
def test_mixed_inner_and_outer_classes(self):
"""Test mix of inner and outer classes."""
source = """
public class Outer1 {
public class Inner1 {}
}
public class Outer2 {}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 47.2μs -> 46.1μs (2.35% faster)
def test_private_inner_class(self):
"""Test finding a private inner class."""
source = """
public class Outer {
private class Private {}
}
"""
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 36.4μs -> 35.2μs (3.50% faster)
private_class = [cls for cls in result if cls.name == "Private"][0]
class TestJavaAnalyzerFindClassesLargeScale:
"""Test JavaAnalyzer.find_classes with large-scale inputs."""
def test_many_top_level_classes(self):
"""Test performance with many top-level classes."""
# Generate 100 class definitions
source_lines = []
for i in range(100):
source_lines.append(f"public class Class{i} {{}}")
source = "\n".join(source_lines)
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 816μs -> 780μs (4.57% faster)
# Verify names are all unique and correct
names = [cls.name for cls in result]
def test_deeply_nested_inner_classes(self):
"""Test performance with deeply nested inner classes."""
# Create a deeply nested structure (10 levels deep)
source = "public class Level0 {\n"
for i in range(1, 10):
source += " " * i + f"public class Level{i} {{\n"
source += " " * 10 + "}\n" * 10
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 106μs -> 103μs (3.19% faster)
def test_many_inner_classes_single_outer(self):
"""Test performance with many inner classes in one outer class."""
source = "public class Outer {\n"
for i in range(50):
source += f" public class Inner{i} {{}}\n"
source += "}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 427μs -> 414μs (3.30% faster)
def test_complex_class_hierarchy(self):
"""Test performance with complex class hierarchies."""
source = ""
for i in range(50):
source += f"public class Class{i} extends Class{i-1} implements Interface{i%5} {{}}\n"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 699μs -> 682μs (2.46% faster)
# Verify extends relationships
for cls in result:
if cls.name != "Class0":
pass
def test_mixed_declarations_large_scale(self):
"""Test with mixed class, interface, and enum declarations at scale."""
source = ""
for i in range(30):
source += f"public class Class{i} {{}}\n"
source += f"public interface Interface{i} {{}}\n"
source += f"public enum Enum{i} {{}}\n"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 746μs -> 723μs (3.24% faster)
def test_class_with_long_source_text(self):
"""Test class with large body content."""
source = "public class LargeClass {\n"
for i in range(100):
source += f" public void method{i}() {{\n"
for j in range(5):
source += f" int var{j} = {i * j};\n"
source += " }\n"
source += "}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 3.34ms -> 1.93ms (73.2% faster)
def test_many_interfaces_implemented(self):
"""Test class implementing many interfaces."""
interfaces = [f"Interface{i}" for i in range(30)]
source = f"public class MultiImpl implements {', '.join(interfaces)} {{}}"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 85.6μs -> 83.9μs (2.03% faster)
def test_mixed_modifiers_large_scale(self):
"""Test various modifier combinations at scale."""
modifiers = [
"public",
"private",
"protected",
"abstract",
"final",
"static",
]
source = ""
counter = 0
for mod in modifiers:
source += f"public {mod} class Class{counter} {{}}\n"
counter += 1
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 78.3μs -> 75.6μs (3.50% faster)
def test_generic_classes_with_bounds(self):
"""Test performance with generic classes having type bounds."""
source = ""
for i in range(20):
source += f"public class GenericClass{i}<T extends Comparable<T>> {{}}\n"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 255μs -> 249μs (2.38% faster)
def test_class_attributes_consistency(self):
"""Test that class attributes are consistently populated across many classes."""
source = ""
for i in range(50):
source += f"public class Class{i} {{}}\n"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 413μs -> 398μs (3.95% faster)
# Verify all classes have required attributes
for cls in result:
pass
def test_line_and_column_tracking(self):
"""Test that line and column information is accurate for many classes."""
source = ""
for i in range(50):
source += f"public class Class{i} {{}}\n"
analyzer = JavaAnalyzer()
codeflash_output = analyzer.find_classes(source); result = codeflash_output # 413μs -> 396μs (4.34% faster)
# Verify line numbers are in ascending order and reasonable
previous_line = 0
for cls in result:
previous_line = cls.end_line
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
To test or edit this optimization locally: `git merge codeflash/optimize-pr1199-2026-02-03T10.11.55`
| self._walk_tree_for_classes(child, source_bytes, classes, is_inner=True) | |
| if child.type in ("class_declaration", "interface_declaration", "enum_declaration"): | |
| self._walk_tree_for_classes(child, source_bytes, classes, is_inner=True) |
The optimization moved the `inquirer.Path` question construction out of the while-loop and added `@lru_cache(maxsize=1)` to `_get_theme()`, eliminating repeated imports and instantiations of `CodeflashTheme` on every prompt iteration. The profiler shows `_get_theme()` was called 1247 times in the original, each time re-importing `init_config` (~2.2% overhead) and constructing a new theme object (~97.8% overhead, 323 µs per call). Moving the question object outside the loop avoids ~13 µs of reconstruction per iteration, and caching the theme cuts 1246 redundant constructions, yielding a 363% speedup with no functional trade-offs.
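A minimal sketch of the two techniques described above, hoisting a loop-invariant object and caching a factory with `lru_cache(maxsize=1)`. `Theme`, `get_theme`, and `prompt_loop` are stand-ins invented for illustration; they are not the `CodeflashTheme` or `inquirer` objects from the PR.

```python
from functools import lru_cache


class Theme:
    """Stand-in for an expensive-to-build theme object; counts constructions."""

    instances = 0

    def __init__(self):
        Theme.instances += 1


@lru_cache(maxsize=1)
def get_theme():
    # With no arguments there is a single cache key, so the body runs once.
    return Theme()


def prompt_loop(iterations):
    # Built once, outside the loop, because it does not change between iterations.
    question = {"message": "Enter a directory"}
    themes = [get_theme() for _ in range(iterations)]
    return question, themes


question, themes = prompt_loop(5)
print(Theme.instances)
```

Because the cached function takes no arguments, every call after the first returns the same object, so any state stored on that object is shared between callers.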
⚡️ Codeflash found optimizations for this PR📄 363% (3.63x) speedup for
⚡️ Codeflash found optimizations for this PR📄 18% (0.18x) speedup for
…2026-03-13T00.56.31 ⚡️ Speed up method `OptimizeRequest.to_payload` by 33% in PR #1199 (`omni-java`)
This PR is now faster! 🚀 @claude[bot] accepted my optimizations from: |
…2026-03-13T01.03.06 ⚡️ Speed up method `TestGenRequest.to_payload` by 20% in PR #1199 (`omni-java`)
This PR is now faster! 🚀 @claude[bot] accepted my optimizations from: |
⚡️ Codeflash found optimizations for this PR📄 23% (0.23x) speedup for
When both package.json and codeflash.toml exist in the directory tree, parse_config_file() only compared package.json against pyproject.toml. Java projects use codeflash.toml, which was never checked, so any package.json in a parent directory would always win, setting the wrong module_root and project_root.
Now we find the closest toml config (pyproject.toml or codeflash.toml) and compare its depth against package.json, so a closer codeflash.toml correctly takes priority.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…flash-toml fix: include codeflash.toml in config resolution depth comparison
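The depth comparison described in this commit can be sketched as follows. This is an illustration under assumptions, not the body of `parse_config_file()`: `find_closest`, `pick_config`, and `demo` are hypothetical names, and preferring the toml file on a tie is a choice made here for the example.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional

TOML_NAMES = ("pyproject.toml", "codeflash.toml")


def find_closest(start: Path, names: Iterable[str]) -> Optional[Path]:
    """Walk from start up to the filesystem root and return the first match."""
    names = tuple(names)
    for directory in (start, *start.parents):
        for name in names:
            candidate = directory / name
            if candidate.is_file():
                return candidate
    return None


def pick_config(start: Path) -> Optional[Path]:
    """Choose between the closest toml config and the closest package.json."""
    toml_config = find_closest(start, TOML_NAMES)
    package_json = find_closest(start, ("package.json",))
    if toml_config is None or package_json is None:
        return toml_config or package_json
    # Both files sit in ancestors of start, so more path parts means closer.
    if len(package_json.parent.parts) > len(toml_config.parent.parts):
        return package_json
    return toml_config  # assumption: toml wins ties


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        start = root / "java_app" / "src"
        start.mkdir(parents=True)
        (root / "package.json").write_text("{}")
        (root / "java_app" / "codeflash.toml").write_text("")
        chosen = pick_config(start)
        return chosen.name if chosen else ""


chosen_name = demo()
print(chosen_name)
```

In the demo, `package.json` lives one level above `codeflash.toml`, which is the layout the commit message says used to resolve incorrectly.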
Claude finished @mashraf-222's task in 3s —— View job PR Review:
| File | Issue |
|---|---|
| `init_java.py:355,379,425` | `_get_theme()` untyped: `no-untyped-call` in typed context |
| `init_java.py:364` | Returning `Any` from `str \| None` typed function |
| `init_java.py:496` | Unsupported target for indexed assignment (`"Item \| Container"`) |
| `test_runner.py:929,1121,1658` | `subprocess.CompletedProcess` missing type parameters |
These are lower priority (no runtime risk in the init_java.py cases; annotation gaps in test_runner.py). Can be addressed in a follow-up.
🟡 Stale Review Threads
- KRRT7 @ `instrument_existing_tests.py:730`: flagged a positional arg ordering bug on commit `094d899d`. Stale. The call site in `support.py:550` now correctly uses keyword arguments: `instrument_existing_test(test_string=test_string, function_to_optimize=function_to_optimize, mode=mode, test_path=test_path)`. Please resolve this thread.
🟡 Duplicate Code (KRRT7 comment — still present)
KRRT7 flagged three sets of duplicated logic that carry divergence risk:
- JAR finding (3 copies, HIGH): `comparator.py:28`, `line_profiler.py:566`, `test_runner.py:64`. Each searches for `codeflash-runtime-1.0.0.jar` in different (overlapping) paths. If a path is fixed in one copy, the others will silently miss it.
- Java executable finding (2 copies, HIGH): `comparator.py:78` has comprehensive detection (JAVA_HOME, macOS Maven/Homebrew paths, stub detection). `formatter.py:39` has a minimal version that skips all of that.
- Package name extraction (2 copies, MEDIUM): near-identical logic in `support.py` and `test_runner.py`.
Recommend consolidating in a follow-up.
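One possible shape for that consolidation, sketched for the JAR lookup only. `find_runtime_jar` and the directory names in the demo are hypothetical; the actual search paths in `comparator.py`, `line_profiler.py`, and `test_runner.py` are not shown in this thread.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional

RUNTIME_JAR_NAME = "codeflash-runtime-1.0.0.jar"


def find_runtime_jar(search_dirs: Iterable[Path], jar_name: str = RUNTIME_JAR_NAME) -> Optional[Path]:
    """Return the first existing jar among the candidate directories, in order."""
    for directory in search_dirs:
        candidate = Path(directory) / jar_name
        if candidate.is_file():
            return candidate
    return None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        first, second = root / "resources", root / "target"
        first.mkdir()
        second.mkdir()
        (second / RUNTIME_JAR_NAME).write_bytes(b"")
        found = find_runtime_jar([first, second])
        missing = find_runtime_jar([first])
        return (found.parent.name if found else None), missing


found_dir_name, missing = demo()
print(found_dir_name, missing)
```

With a single helper, each caller would pass its own ordered list of directories, so a path fix lands in one place instead of three.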
✅ Previously Fixed Bug
`comparator.py:266`: `original_pass=True` was hardcoded; fixed in commit `9022f9ee` to `original_pass=scope_str != "exception"`.
⚡️ Optimization PRs
- ⚡️ Speed up function `_prompt_custom_directory` by 363% in PR #1199 (`omni-java`) #1827 (`_prompt_custom_directory` +363%): Merged ✅
- ⚡️ Speed up function `_get_git_remote_for_setup` by 18% in PR #1199 (`omni-java`) #1828 (`_get_git_remote_for_setup` +18%): Closed (had merge conflicts).
📊 Test Coverage
Extensive test coverage under tests/test_languages/test_java/ with dedicated test files for every new module (parser, comparator, instrumentation, discovery, build tools, formatter, etc.) and an E2E test at tests/test_languages/test_java_e2e.py.
Summary
Critical bugs fixed: 3 (None concatenation crash, abstract class instantiation failure, wrong import) — merged in PR #1830. Lint: clean. Mypy: 49 remaining errors, all pre-existing. The duplicate JAR/Java-exe finding code is the main structural concern — track as a follow-up refactor.
Review by Claude.
- Fix TypeError in _build_runtime_map when test_function_name is None
- Add missing abstract method stubs (find_references, extract_calling_function_source, load_coverage, setup_test_config)
- Fix Language import to come from language_enum instead of base (which doesn't re-export it)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…2026-03-13T01.44.29 ⚡️ Speed up function `_prompt_custom_directory` by 363% in PR #1199 (`omni-java`)
This PR is now faster! 🚀 @claude[bot] accepted my optimizations from: |
fix: resolve mypy errors and None concatenation bug in JavaSupport
⚡️ Codeflash found optimizations for this PR📄 10% (0.10x) speedup for
⚡️ Codeflash found optimizations for this PR📄 1,032% (10.32x) speedup for
| """Check if name matches any include pattern.""" | ||
| if not self._include_regexes: | ||
| return True | ||
| return any(regex.match(name) for regex in self._include_regexes) |
⚡️Codeflash found 27% (0.27x) speedup for FunctionFilterCriteria.matches_include_patterns in codeflash/languages/base.py
⏱️ Runtime : 1.06 milliseconds → 835 microseconds (best of 19 runs)
📝 Explanation and details
The original code used `any(regex.match(name) for regex in self._include_regexes)`, which creates a generator object on every call and pays per-iteration overhead each time the `any()` builtin resumes it. The optimized version replaces this with an explicit `for` loop that returns `True` on the first match. Both forms short-circuit, so the gain comes from avoiding the generator rather than from skipping regex checks. Line profiler data shows the original `any()` line consumed 92.9% of function time at 2309 ns per hit, while in the optimized loop the match check costs 367 ns per hit. This yields the 27% speedup reported above (1.06 ms → 835 µs) with no behavioral change, as both implementations return `True` on the first matching regex and `False` otherwise.
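The two forms can be compared side by side in a small standalone sketch. The function names here are illustrative, not the `FunctionFilterCriteria` methods; compiling the globs with `fnmatch.translate` follows what the generated tests say the implementation does.

```python
import fnmatch
import re


def compile_patterns(patterns):
    """Translate shell-style globs into compiled regexes."""
    return [re.compile(fnmatch.translate(pattern)) for pattern in patterns]


def matches_with_any(regexes, name):
    """Original shape: generator expression fed to any()."""
    if not regexes:
        return True
    return any(regex.match(name) for regex in regexes)


def matches_with_loop(regexes, name):
    """Optimized shape: explicit loop with an early return."""
    if not regexes:
        return True
    for regex in regexes:
        if regex.match(name):
            return True
    return False


regexes = compile_patterns(["foo*", "?ar"])
names = ["foobar", "foo", "bar", "baar", "ar"]
any_results = [matches_with_any(regexes, name) for name in names]
loop_results = [matches_with_loop(regexes, name) for name in names]
print(any_results == loop_results)
```

One small difference: with no patterns both return `True`, and with patterns both return a strict `bool`, so callers that compare with `is True` see no change.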
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 251 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 2 Passed |
| 📊 Tests Coverage | 75.0% |
🌀 Click to see Generated Regression Tests
import re
import pytest # used for our unit tests
# import the real class under test from the actual module
from codeflash.languages.base import FunctionFilterCriteria
def test_no_include_patterns_all_names_allowed():
# When no include_patterns are provided, matches_include_patterns should
# always return True for any input string (per implementation).
criteria = FunctionFilterCriteria(include_patterns=[]) # empty include list
# simple name should be allowed
assert criteria.matches_include_patterns("anything") # 517ns -> 472ns (9.53% faster)
# empty string name should also be allowed
assert criteria.matches_include_patterns("") # 222ns -> 236ns (5.93% slower)
# names with special characters still allowed when include list is empty
assert criteria.matches_include_patterns("some.name-with_special+chars()") # 168ns -> 161ns (4.35% faster)
def test_literal_pattern_matches_exact_name_only():
# A literal glob (no wildcards) should only match the exact name.
criteria = FunctionFilterCriteria(include_patterns=["exact_name"])
# exact name matches
assert criteria.matches_include_patterns("exact_name") # 2.85μs -> 1.67μs (70.5% faster)
# similar but different name does not match
assert not criteria.matches_include_patterns("exact_name_extra") # 1.12μs -> 690ns (62.9% faster)
# completely different name does not match
assert not criteria.matches_include_patterns("another") # 802ns -> 439ns (82.7% faster)
def test_wildcard_and_question_mark_patterns():
# Test glob wildcards: '*' (any sequence) and '?' (single character).
criteria = FunctionFilterCriteria(include_patterns=["foo*", "?ar"])
# 'foo*' should match strings starting with 'foo'
assert criteria.matches_include_patterns("foobar") # 2.81μs -> 1.75μs (59.8% faster)
assert criteria.matches_include_patterns("foo") # 1.07μs -> 578ns (84.9% faster)
# '?ar' should match any single-character prefix followed by 'ar'
assert criteria.matches_include_patterns("bar") # 1.43μs -> 824ns (73.7% faster)
assert not criteria.matches_include_patterns("baar") # 1.03μs -> 670ns (54.0% faster)
assert not criteria.matches_include_patterns("ar") # 844ns -> 581ns (45.3% faster)
def test_character_classes_and_negation_in_patterns():
# Character class and negation patterns should behave like fnmatch rules.
criteria = FunctionFilterCriteria(include_patterns=["file[0-9].py", "data[!0].txt"])
# 'file[0-9].py' matches file1.py but not fileA.py
assert criteria.matches_include_patterns("file1.py") # 2.69μs -> 1.73μs (54.9% faster)
assert not criteria.matches_include_patterns("fileA.py") # 1.22μs -> 830ns (46.4% faster)
# 'data[!0].txt' matches dataA.txt (A != '0') but not data0.txt
assert criteria.matches_include_patterns("dataA.txt") # 1.37μs -> 687ns (99.0% faster)
assert not criteria.matches_include_patterns("data0.txt") # 931ns -> 592ns (57.3% faster)
def test_patterns_with_literal_regex_special_chars():
# Glob patterns treat '.' as a literal dot; ensure '.' inside a pattern is not treated
# as a regex wildcard. The implementation uses fnmatch.translate so regex meta-characters
# are escaped appropriately.
criteria = FunctionFilterCriteria(include_patterns=["a.b", "c[d]e"])
# 'a.b' should match exactly 'a.b' but not 'acb'
assert criteria.matches_include_patterns("a.b") # 2.62μs -> 1.50μs (75.0% faster)
assert not criteria.matches_include_patterns("acb") # 1.27μs -> 790ns (60.4% faster)
# 'c[d]e' should match 'c[d]e' (brackets are literal in the glob) and not 'cde'
# Note: In shell-style glob, square brackets are character classes. To ensure literal
# brackets you'd normally escape them, but for the purpose of testing the translation,
# verify behavior for the given pattern string as provided.
# If the pattern is interpreted as character class, 'cde' matches since [d] == 'd'.
assert criteria.matches_include_patterns("cde") # 1.20μs -> 749ns (60.6% faster)
def test_mutating_include_patterns_after_initialization_does_not_recompile():
# __post_init__ compiles regexes at construction time. Mutating the include_patterns
# list afterwards should NOT change the already-compiled regex objects.
patterns = ["original"]
criteria = FunctionFilterCriteria(include_patterns=patterns)
# sanity: compiled regexes exist and are proper regex Pattern objects
assert hasattr(criteria, "_include_regexes")
assert all(isinstance(r, re.Pattern) for r in criteria._include_regexes) # 2.46μs -> 1.53μs (60.8% faster)
# change the original list object after construction
patterns[0] = "changed" # 1.10μs -> 617ns (77.6% faster)
# the compiled regexes should still reflect the original 'original' pattern
assert criteria.matches_include_patterns("original")
assert not criteria.matches_include_patterns("changed")
def test_pass_non_string_name_raises_type_error():
# The implementation calls regex.match(name) which expects a string-like object.
# Passing None (or an integer) should raise a TypeError from the regex engine.
criteria = FunctionFilterCriteria(include_patterns=["*"])
with pytest.raises(TypeError):
criteria.matches_include_patterns(None) # 4.37μs -> 3.38μs (29.6% faster)
with pytest.raises(TypeError):
criteria.matches_include_patterns(123) # 2.21μs -> 1.90μs (16.4% faster)
def test_pattern_without_wildcard_does_not_match_substrings():
# A pattern without '*' should not match substrings that contain the pattern.
criteria = FunctionFilterCriteria(include_patterns=["bar"])
# exact 'bar' matches
assert criteria.matches_include_patterns("bar") # 2.59μs -> 1.54μs (67.8% faster)
# 'foobar' should not match 'bar' because the glob 'bar' matches whole string only
assert not criteria.matches_include_patterns("foobar") # 1.04μs -> 623ns (67.1% faster)
# '*bar' would match 'foobar' — verify behavior differs when wildcard is present
criteria_wild = FunctionFilterCriteria(include_patterns=["*bar"])
assert criteria_wild.matches_include_patterns("foobar") # 1.69μs -> 980ns (72.4% faster)
def test_large_number_of_patterns_still_matches_correctly():
# Create many glob patterns (1000) to test scalability and correctness.
# Each pattern will be of the form 'func_<i>_*' and we verify a target name
# that should match one of them.
count = 1000
patterns = [f"func_{i}_*" for i in range(count)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
# A name that should be matched by pattern index 500
assert criteria.matches_include_patterns("func_500_specialcase") # 68.8μs -> 59.8μs (15.1% faster)
# A name that doesn't match any of the generated patterns should be rejected
assert not criteria.matches_include_patterns("no_such_function_ever") # 113μs -> 106μs (6.01% faster)
# Assert that we indeed have compiled 1000 regex objects internally
assert len(criteria._include_regexes) == count
assert all(isinstance(r, re.Pattern) for r in criteria._include_regexes)
def test_many_successive_calls_remain_deterministic():
    # Make many repeated calls (1000) to ensure determinism and no state corruption.
    patterns = ["start*", "mid_*_end", "*finish"]
    criteria = FunctionFilterCriteria(include_patterns=patterns)
    # Prepare a variety of names some matching, some not
    names = [
        "startHere", "mid_123_end", "almostfinish", "no_match_here",
        "start", "mid__end", "thefinish"
    ]
    # Call matches_include_patterns repeatedly in a loop and confirm consistent results
    results_first_pass = [criteria.matches_include_patterns(n) for n in names]
    for _ in range(1000):
        # subsequent passes should yield identical boolean lists
        assert [criteria.matches_include_patterns(n) for n in names] == results_first_pass

import pytest
from codeflash.languages.base import FunctionFilterCriteria
def test_empty_include_patterns_returns_true():
"""When include_patterns is empty, any name should match (return True)."""
criteria = FunctionFilterCriteria(include_patterns=[])
assert criteria.matches_include_patterns("any_function_name") is True # 547ns -> 477ns (14.7% faster)
assert criteria.matches_include_patterns("test") is True # 217ns -> 232ns (6.47% slower)
assert criteria.matches_include_patterns("") is True # 164ns -> 164ns (0.000% faster)
def test_single_exact_pattern_match():
"""A single exact pattern should match the exact function name."""
criteria = FunctionFilterCriteria(include_patterns=["my_function"])
assert criteria.matches_include_patterns("my_function") is True # 2.95μs -> 1.85μs (59.6% faster)
assert criteria.matches_include_patterns("other_function") is False # 964ns -> 632ns (52.5% faster)
def test_single_exact_pattern_no_match():
"""A function name that doesn't match exact pattern should return False."""
criteria = FunctionFilterCriteria(include_patterns=["my_function"])
assert criteria.matches_include_patterns("my_function_other") is False # 2.01μs -> 1.46μs (37.8% faster)
def test_wildcard_asterisk_pattern_match():
"""Glob pattern with * should match multiple character sequences."""
criteria = FunctionFilterCriteria(include_patterns=["test_*"])
assert criteria.matches_include_patterns("test_function") is True # 2.78μs -> 1.76μs (57.5% faster)
assert criteria.matches_include_patterns("test_another_name") is True # 1.06μs -> 609ns (74.7% faster)
assert criteria.matches_include_patterns("test_") is True # 1.00μs -> 429ns (134% faster)
assert criteria.matches_include_patterns("other_test_function") is False # 965ns -> 514ns (87.7% faster)
def test_wildcard_asterisk_pattern_no_match():
"""Glob pattern with * should not match unrelated names."""
criteria = FunctionFilterCriteria(include_patterns=["test_*"])
assert criteria.matches_include_patterns("function_test") is False # 1.87μs -> 1.20μs (55.7% faster)
def test_multiple_include_patterns_or_logic():
"""Multiple patterns should use OR logic - match any one pattern."""
criteria = FunctionFilterCriteria(include_patterns=["test_*", "check_*"])
assert criteria.matches_include_patterns("test_function") is True # 2.86μs -> 1.85μs (54.6% faster)
assert criteria.matches_include_patterns("check_value") is True # 1.37μs -> 944ns (44.7% faster)
assert criteria.matches_include_patterns("validate_function") is False # 1.03μs -> 620ns (66.3% faster)
def test_pattern_with_question_mark():
"""Glob pattern with ? should match exactly one character."""
criteria = FunctionFilterCriteria(include_patterns=["test_?"])
assert criteria.matches_include_patterns("test_a") is True # 2.63μs -> 1.51μs (73.6% faster)
assert criteria.matches_include_patterns("test_1") is True # 1.10μs -> 574ns (92.0% faster)
assert criteria.matches_include_patterns("test_ab") is False # 876ns -> 484ns (81.0% faster)
assert criteria.matches_include_patterns("test_") is False # 653ns -> 391ns (67.0% faster)
def test_pattern_with_character_class():
"""Glob pattern with [abc] should match any character in the class."""
criteria = FunctionFilterCriteria(include_patterns=["test_[abc]"])
assert criteria.matches_include_patterns("test_a") is True # 2.71μs -> 1.69μs (60.7% faster)
assert criteria.matches_include_patterns("test_b") is True # 1.02μs -> 560ns (81.6% faster)
assert criteria.matches_include_patterns("test_c") is True # 985ns -> 524ns (88.0% faster)
assert criteria.matches_include_patterns("test_d") is False # 874ns -> 516ns (69.4% faster)
def test_case_sensitive_matching():
"""Pattern matching should be case-sensitive."""
criteria = FunctionFilterCriteria(include_patterns=["MyFunction"])
assert criteria.matches_include_patterns("MyFunction") is True # 2.50μs -> 1.45μs (73.0% faster)
assert criteria.matches_include_patterns("myfunction") is False # 1.01μs -> 623ns (61.8% faster)
assert criteria.matches_include_patterns("MYFUNCTION") is False # 735ns -> 463ns (58.7% faster)
def test_pattern_with_underscores():
"""Patterns with underscores should match exactly."""
criteria = FunctionFilterCriteria(include_patterns=["my_test_function"])
assert criteria.matches_include_patterns("my_test_function") is True # 2.59μs -> 1.36μs (89.5% faster)
assert criteria.matches_include_patterns("my_test_function_extra") is False # 1.02μs -> 620ns (65.0% faster)
def test_pattern_with_numbers():
"""Patterns with numbers should match exactly."""
criteria = FunctionFilterCriteria(include_patterns=["function123"])
assert criteria.matches_include_patterns("function123") is True # 2.38μs -> 1.48μs (60.8% faster)
assert criteria.matches_include_patterns("function1234") is False # 955ns -> 622ns (53.5% faster)
def test_empty_string_name_with_patterns():
"""Empty string name should only match if pattern allows it."""
criteria = FunctionFilterCriteria(include_patterns=[""])
assert criteria.matches_include_patterns("") is True # 2.48μs -> 1.49μs (66.4% faster)
assert criteria.matches_include_patterns("function") is False # 970ns -> 604ns (60.6% faster)
def test_empty_string_name_with_wildcard_pattern():
"""Empty string name with wildcard pattern * should match."""
criteria = FunctionFilterCriteria(include_patterns=["*"])
assert criteria.matches_include_patterns("") is True # 2.39μs -> 1.57μs (52.0% faster)
assert criteria.matches_include_patterns("function") is True # 1.07μs -> 566ns (89.9% faster)
def test_very_long_function_name():
"""Should handle very long function names correctly."""
long_name = "a" * 1000
criteria = FunctionFilterCriteria(include_patterns=[long_name])
assert criteria.matches_include_patterns(long_name) is True # 3.64μs -> 2.48μs (46.8% faster)
assert criteria.matches_include_patterns("a" * 999) is False # 1.04μs -> 591ns (75.8% faster)
def test_very_long_pattern():
"""Should handle very long patterns correctly."""
long_pattern = "a" * 1000
criteria = FunctionFilterCriteria(include_patterns=[long_pattern])
assert criteria.matches_include_patterns("a" * 1000) is True # 3.36μs -> 2.45μs (37.1% faster)
assert criteria.matches_include_patterns("a" * 999) is False # 1.02μs -> 638ns (60.5% faster)
def test_special_glob_characters_in_name():
"""Special glob characters in pattern should be treated as glob syntax."""
criteria = FunctionFilterCriteria(include_patterns=["*test*"])
assert criteria.matches_include_patterns("mytest") is True # 2.87μs -> 1.95μs (47.3% faster)
assert criteria.matches_include_patterns("test_func") is True # 1.16μs -> 640ns (80.8% faster)
assert criteria.matches_include_patterns("my_test_func") is True # 1.07μs -> 623ns (71.3% faster)
assert criteria.matches_include_patterns("other") is False # 1.06μs -> 733ns (45.3% faster)
def test_double_asterisk_pattern():
"""Pattern with ** should match like *."""
criteria = FunctionFilterCriteria(include_patterns=["test**"])
assert criteria.matches_include_patterns("test") is True # 2.54μs -> 1.61μs (57.3% faster)
assert criteria.matches_include_patterns("testfunction") is True # 1.03μs -> 616ns (67.5% faster)
assert criteria.matches_include_patterns("test_func") is True # 1.01μs -> 542ns (87.1% faster)
def test_pattern_with_multiple_wildcards():
"""Pattern with multiple * should work correctly."""
criteria = FunctionFilterCriteria(include_patterns=["*test*func*"])
assert criteria.matches_include_patterns("prefix_test_middle_func_suffix") is True # 3.12μs -> 2.10μs (48.9% faster)
assert criteria.matches_include_patterns("test_func") is True # 1.18μs -> 691ns (70.8% faster)
assert criteria.matches_include_patterns("testfunc") is True # 1.05μs -> 597ns (75.5% faster)
assert criteria.matches_include_patterns("other") is False # 903ns -> 550ns (64.2% faster)
def test_single_character_name():
"""Single character names should be matched correctly."""
criteria = FunctionFilterCriteria(include_patterns=["a"])
assert criteria.matches_include_patterns("a") is True # 2.12μs -> 1.36μs (55.9% faster)
assert criteria.matches_include_patterns("ab") is False # 982ns -> 609ns (61.2% faster)
assert criteria.matches_include_patterns("b") is False # 732ns -> 423ns (73.0% faster)
def test_single_character_pattern():
"""Single character pattern should only match single character names."""
criteria = FunctionFilterCriteria(include_patterns=["?"])
assert criteria.matches_include_patterns("a") is True # 2.35μs -> 1.47μs (60.4% faster)
assert criteria.matches_include_patterns("1") is True # 977ns -> 454ns (115% faster)
assert criteria.matches_include_patterns("ab") is False # 815ns -> 494ns (65.0% faster)
assert criteria.matches_include_patterns("") is False # 652ns -> 392ns (66.3% faster)
def test_pattern_with_bracket_negation():
"""Bracket patterns with ! for negation."""
criteria = FunctionFilterCriteria(include_patterns=["test_[!abc]"])
assert criteria.matches_include_patterns("test_d") is True # 3.05μs -> 1.97μs (55.3% faster)
assert criteria.matches_include_patterns("test_a") is False # 989ns -> 582ns (69.9% faster)
assert criteria.matches_include_patterns("test_b") is False # 712ns -> 494ns (44.1% faster)
assert criteria.matches_include_patterns("test_c") is False # 695ns -> 397ns (75.1% faster)
def test_unicode_in_function_name():
"""Unicode characters in function names should be matched."""
criteria = FunctionFilterCriteria(include_patterns=["test_*"])
assert criteria.matches_include_patterns("test_café") is True # 2.62μs -> 1.68μs (55.9% faster)
assert criteria.matches_include_patterns("café_test") is False # 1.05μs -> 643ns (63.9% faster)
def test_unicode_in_pattern():
"""Unicode characters in patterns should work."""
criteria = FunctionFilterCriteria(include_patterns=["café"])
assert criteria.matches_include_patterns("café") is True # 2.50μs -> 1.50μs (66.7% faster)
assert criteria.matches_include_patterns("cafe") is False # 960ns -> 571ns (68.1% faster)
def test_pattern_ending_with_wildcard():
"""Pattern ending with * should match any suffix."""
criteria = FunctionFilterCriteria(include_patterns=["test*"])
assert criteria.matches_include_patterns("test") is True # 2.76μs -> 1.61μs (71.6% faster)
assert criteria.matches_include_patterns("test_function") is True # 1.14μs -> 692ns (64.6% faster)
assert criteria.matches_include_patterns("test123") is True # 926ns -> 483ns (91.7% faster)
assert criteria.matches_include_patterns("other_test") is False # 763ns -> 481ns (58.6% faster)
def test_pattern_starting_with_wildcard():
"""Pattern starting with * should match any prefix."""
criteria = FunctionFilterCriteria(include_patterns=["*test"])
assert criteria.matches_include_patterns("test") is True # 2.59μs -> 1.70μs (52.9% faster)
assert criteria.matches_include_patterns("my_test") is True # 1.03μs -> 629ns (64.5% faster)
assert criteria.matches_include_patterns("function_test") is True # 1.07μs -> 536ns (99.1% faster)
assert criteria.matches_include_patterns("test_function") is False # 995ns -> 606ns (64.2% faster)
def test_whitespace_in_function_name():
"""Whitespace in function names should be matched correctly."""
criteria = FunctionFilterCriteria(include_patterns=["my function"])
assert criteria.matches_include_patterns("my function") is True # 2.41μs -> 1.51μs (59.7% faster)
assert criteria.matches_include_patterns("myfunction") is False # 965ns -> 634ns (52.2% faster)
def test_newline_in_function_name():
"""Newline characters in function names should be handled."""
criteria = FunctionFilterCriteria(include_patterns=["my\nfunc"])
assert criteria.matches_include_patterns("my\nfunc") is True # 2.55μs -> 1.50μs (70.3% faster)
assert criteria.matches_include_patterns("myfunc") is False # 915ns -> 580ns (57.8% faster)
def test_tab_in_function_name():
"""Tab characters in function names should be handled."""
criteria = FunctionFilterCriteria(include_patterns=["my\tfunc"])
assert criteria.matches_include_patterns("my\tfunc") is True # 2.50μs -> 1.54μs (62.8% faster)
assert criteria.matches_include_patterns("myfunc") is False # 988ns -> 545ns (81.3% faster)
def test_multiple_patterns_with_overlapping_matches():
"""Overlapping patterns should still work with OR logic."""
criteria = FunctionFilterCriteria(include_patterns=["test_*", "test_func*"])
assert criteria.matches_include_patterns("test_function") is True # 2.69μs -> 1.69μs (59.5% faster)
assert criteria.matches_include_patterns("test_") is True # 1.08μs -> 620ns (74.4% faster)
# Both patterns would match this, but only one needs to
assert criteria.matches_include_patterns("test_func_extended") is True # 944ns -> 465ns (103% faster)
def test_no_patterns_explicit_empty_list():
"""Explicitly empty include_patterns should match everything."""
criteria = FunctionFilterCriteria(include_patterns=[])
assert criteria.matches_include_patterns("anything") is True # 490ns -> 456ns (7.46% faster)
assert criteria.matches_include_patterns("_") is True # 203ns -> 204ns (0.490% slower)
assert criteria.matches_include_patterns("") is True # 174ns -> 155ns (12.3% faster)
def test_pattern_with_escaped_asterisk():
"""Glob patterns follow fnmatch rules - * is wildcard in fnmatch."""
# Note: fnmatch doesn't support escaping, so [*] matches literal *
criteria = FunctionFilterCriteria(include_patterns=["[*]"])
assert criteria.matches_include_patterns("*") is True # 2.78μs -> 1.64μs (69.6% faster)
assert criteria.matches_include_patterns("a") is False # 1.04μs -> 600ns (73.0% faster)
def test_repeated_pattern_in_list():
"""Duplicate patterns in list should work without issues."""
criteria = FunctionFilterCriteria(include_patterns=["test", "test"])
assert criteria.matches_include_patterns("test") is True # 2.47μs -> 1.46μs (69.7% faster)
assert criteria.matches_include_patterns("other") is False # 1.27μs -> 759ns (67.6% faster)
def test_pattern_with_dots():
"""Dots in pattern should match literally."""
criteria = FunctionFilterCriteria(include_patterns=["test.function"])
assert criteria.matches_include_patterns("test.function") is True # 2.45μs -> 1.47μs (67.1% faster)
assert criteria.matches_include_patterns("testfunction") is False # 985ns -> 607ns (62.3% faster)
assert criteria.matches_include_patterns("test_function") is False # 768ns -> 427ns (79.9% faster)
def test_pattern_with_hyphens():
"""Hyphens in pattern should match literally."""
criteria = FunctionFilterCriteria(include_patterns=["test-function"])
assert criteria.matches_include_patterns("test-function") is True # 2.50μs -> 1.49μs (67.6% faster)
assert criteria.matches_include_patterns("test_function") is False # 945ns -> 542ns (74.4% faster)
def test_single_pattern_with_multiple_wildcards_complex():
"""Complex pattern with alternating wildcards and literals."""
criteria = FunctionFilterCriteria(include_patterns=["*_test_*_func_*"])
assert criteria.matches_include_patterns("prefix_test_middle_func_suffix") is True # 3.20μs -> 2.20μs (45.7% faster)
assert criteria.matches_include_patterns("a_test_b_func_c") is True # 1.18μs -> 724ns (62.7% faster)
assert criteria.matches_include_patterns("test_func") is False # 914ns -> 492ns (85.8% faster)
def test_pattern_matching_is_anchored_at_start():
    """Patterns match the whole name: regex.match anchors the start and fnmatch.translate adds an end-of-string anchor."""
    criteria = FunctionFilterCriteria(include_patterns=["test"])
    assert criteria.matches_include_patterns("test") is True # 2.38μs -> 1.50μs (59.2% faster)
    assert criteria.matches_include_patterns("test_extra") is False # 987ns -> 679ns (45.4% faster)
    assert criteria.matches_include_patterns("prefix_test") is False # 670ns -> 452ns (48.2% faster)
def test_many_patterns_with_matching_name():
"""Performance test with many patterns - one matching."""
# Create 100 patterns where only one matches
patterns = [f"func_{i}" for i in range(100)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
# This should eventually find the match
assert criteria.matches_include_patterns("func_50") is True # 9.31μs -> 7.48μs (24.4% faster)
def test_many_patterns_no_match():
"""Performance test with many patterns - none matching."""
# Create 100 patterns, test with name that doesn't match any
patterns = [f"func_{i}" for i in range(100)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("other_func") is False # 12.9μs -> 10.9μs (18.7% faster)
def test_many_patterns_all_wildcards():
"""Performance test with many wildcard patterns."""
# Create 100 patterns with wildcards
patterns = [f"test_*_{i}" for i in range(100)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("test_middle_50") is True # 11.0μs -> 8.76μs (25.6% faster)
def test_large_function_name_list_matching():
"""Test matching against a single complex pattern with long name."""
# Very long name with repetition
long_name = "test_" + ("a_" * 200) + "function"
criteria = FunctionFilterCriteria(include_patterns=["test_*"])
assert criteria.matches_include_patterns(long_name) is True # 2.65μs -> 1.83μs (44.7% faster)
def test_large_function_name_list_no_match():
"""Test non-matching against a single complex pattern with long name."""
long_name = "other_" + ("b_" * 200) + "function"
criteria = FunctionFilterCriteria(include_patterns=["test_*"])
assert criteria.matches_include_patterns(long_name) is False # 1.92μs -> 1.29μs (49.5% faster)
def test_1000_exact_patterns_with_first_match():
"""Stress test: 1000 exact patterns, match is first."""
patterns = [f"function_{i}" for i in range(1000)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("function_0") is True # 4.13μs -> 2.14μs (92.8% faster)
def test_1000_exact_patterns_with_middle_match():
"""Stress test: 1000 exact patterns, match is in middle."""
patterns = [f"function_{i}" for i in range(1000)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("function_500") is True # 69.4μs -> 59.1μs (17.4% faster)
def test_1000_exact_patterns_with_last_match():
"""Stress test: 1000 exact patterns, match is last."""
patterns = [f"function_{i}" for i in range(1000)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("function_999") is True # 132μs -> 112μs (17.5% faster)
def test_1000_exact_patterns_with_no_match():
"""Stress test: 1000 exact patterns, no match."""
patterns = [f"function_{i}" for i in range(1000)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("function_9999") is False # 130μs -> 115μs (12.9% faster)
def test_1000_wildcard_patterns_all_matching():
    """Stress test: 1000 wildcard patterns; only the pattern at index 500 matches the name."""
    patterns = [f"test_*_{i}" for i in range(1000)]
    criteria = FunctionFilterCriteria(include_patterns=patterns)
    assert criteria.matches_include_patterns("test_name_500") is True # 80.2μs -> 70.0μs (14.7% faster)
def test_1000_wildcard_patterns_first_match():
"""Stress test: 1000 wildcard patterns, first matches."""
patterns = [f"test_*_{i}" for i in range(1000)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("test_something_0") is True # 4.32μs -> 2.39μs (80.5% faster)
def test_many_question_mark_patterns():
    """Stress test: patterns with many ? characters."""
    # range(1, 100, 10) yields patterns with 1, 11, 21, ..., 91 question marks.
    patterns = ["test_" + "?" * i for i in range(1, 100, 10)]
    criteria = FunctionFilterCriteria(include_patterns=patterns)
    assert criteria.matches_include_patterns("test_a") is True # 3.26μs -> 1.97μs (65.1% faster)
    # 'abc' is three characters and no pattern has exactly three '?', so this cannot match.
    assert criteria.matches_include_patterns("test_abc") is False # 2.55μs -> 1.94μs (31.8% faster)
    assert criteria.matches_include_patterns("test_") is False # 1.99μs -> 1.51μs (31.3% faster)
def test_alternating_pattern_types():
"""Stress test: mix of exact, wildcard, and question mark patterns."""
patterns = []
for i in range(100):
if i % 3 == 0:
patterns.append(f"exact_{i}")
elif i % 3 == 1:
patterns.append(f"wild_*_{i}")
else:
patterns.append(f"question_?_{i}")
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("exact_0") is True # 3.34μs -> 1.93μs (73.3% faster)
assert criteria.matches_include_patterns("wild_something_1") is True # 1.77μs -> 1.21μs (46.6% faster)
assert criteria.matches_include_patterns("question_x_2") is True # 1.39μs -> 875ns (58.4% faster)
assert criteria.matches_include_patterns("nomatch") is False # 12.8μs -> 11.0μs (15.9% faster)
def test_deeply_nested_brackets_pattern():
"""Test pattern with complex bracket expressions."""
patterns = ["[a-zA-Z0-9]*_test_*"]
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("abc123_test_function") is True # 3.40μs -> 2.21μs (53.7% faster)
assert criteria.matches_include_patterns("_test_function") is False # 1.01μs -> 644ns (56.5% faster)
def test_all_ascii_letters_in_patterns():
"""Test with patterns using all ASCII letters."""
patterns = ["".join(chr(i) for i in range(97, 123))] # a-z
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("abcdefghijklmnopqrstuvwxyz") is True # 2.78μs -> 1.63μs (70.0% faster)
assert criteria.matches_include_patterns("ABCDEFGHIJKLMNOPQRSTUVWXYZ") is False # 991ns -> 613ns (61.7% faster)
def test_performance_with_many_similar_patterns():
"""Stress test: many similar patterns that all start the same way."""
patterns = [f"test_similar_name_{i}" for i in range(500)]
criteria = FunctionFilterCriteria(include_patterns=patterns)
assert criteria.matches_include_patterns("test_similar_name_250") is True # 39.3μs -> 31.7μs (24.1% faster)
assert criteria.matches_include_patterns("test_similar_name_9999") is False # 61.5μs -> 55.1μs (11.6% faster)
def test_regex_compilation_caching():
"""Verify that regexes are compiled once and reused."""
# Create criteria with patterns
criteria = FunctionFilterCriteria(include_patterns=["test_*", "func_*"])
# Call matches_include_patterns multiple times
# This should use cached compiled regexes
for _ in range(100):
criteria.matches_include_patterns("test_function") # 88.0μs -> 43.7μs (101% faster)
# If this completes without error, caching worked
assert True
def test_post_init_called_automatically():
    """Verify __post_init__ is called and regexes are compiled."""
    criteria = FunctionFilterCriteria(include_patterns=["test_*"])
    # The _include_regexes should exist and have one entry
    assert hasattr(criteria, "_include_regexes") # 2.78μs -> 1.88μs (47.8% faster)
    assert len(criteria._include_regexes) == 1
    assert criteria.matches_include_patterns("test_function") is True
🔎 Click to see Concolic Coverage Tests
from codeflash.languages.base import FunctionFilterCriteria
def test_FunctionFilterCriteria_matches_include_patterns():
    FunctionFilterCriteria.matches_include_patterns(
        FunctionFilterCriteria(include_patterns=['?'], exclude_patterns=[], require_return=False, require_export=True, include_async=False, include_methods=False, min_lines=0, max_lines=0),
        '',
    )
def test_FunctionFilterCriteria_matches_include_patterns_2():
    FunctionFilterCriteria.matches_include_patterns(
        FunctionFilterCriteria(include_patterns=[], exclude_patterns=[], require_return=False, require_export=False, include_async=False, include_methods=False, min_lines=0, max_lines=0),
        '',
    )
To test or edit this optimization locally, run: git merge codeflash/optimize-pr1199-2026-03-13T03.57.10
| return any(regex.match(name) for regex in self._include_regexes) | |
| for regex in self._include_regexes: | |
| if regex.match(name): | |
| return True | |
| return False |
| if not self._exclude_regexes: | ||
| return False | ||
| return any(regex.match(name) for regex in self._exclude_regexes) |
⚡️Codeflash found 47% (0.47x) speedup for FunctionFilterCriteria.matches_exclude_patterns in codeflash/languages/base.py
⏱️ Runtime : 10.1 milliseconds → 6.87 milliseconds (best of 52 runs)
📝 Explanation and details
The optimization replaces any(regex.match(name) for regex in self._exclude_regexes) with an explicit for loop that returns True as soon as the first match is found. Both forms short-circuit; the saving comes from no longer creating a generator object and driving it through any() on every call. The line profiler attributes ~3,284 ns per hit to the original any() line versus ~333 ns per hit for the loop (roughly a 10× improvement). The any() line consumed 95% of the original runtime, whereas the loop accounts for only 70% of the reduced total. This method is called thousands of times during Java method discovery (via _should_include_method), so the 47% overall speedup compounds across large codebases.
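As a rough illustration of the two forms compared above, here is a minimal sketch. It is not the project's code: the pattern list and the names `REGEXES`, `match_with_any`, and `match_with_loop` are assumptions invented for this example. Both functions return the same result for every input; only the per-call overhead differs, and measured numbers will vary by machine and Python version.

```python
import fnmatch
import re
import timeit

# Hypothetical pattern set, compiled from globs via fnmatch.translate.
REGEXES = [re.compile(fnmatch.translate(f"prefix{i}*")) for i in range(100)]


def match_with_any(name: str) -> bool:
    # Creates a generator object on every call and drives it through any().
    return any(regex.match(name) for regex in REGEXES)


def match_with_loop(name: str) -> bool:
    # Plain loop: no generator object, returns on the first match.
    for regex in REGEXES:
        if regex.match(name):
            return True
    return False


if __name__ == "__main__":
    for name in ("prefix0_x", "prefix99_x", "no_match"):
        # The two forms must agree before their timings are worth comparing.
        assert match_with_any(name) == match_with_loop(name)
        t_any = timeit.timeit(lambda: match_with_any(name), number=2_000)
        t_loop = timeit.timeit(lambda: match_with_loop(name), number=2_000)
        print(f"{name}: any()={t_any:.4f}s loop={t_loop:.4f}s")
```

The relative gap is typically largest when a match occurs early, because the fixed cost of setting up the generator is then a larger share of the call.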
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 6498 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 2 Passed |
| 📊 Tests Coverage | 75.0% |
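The generated tests rely on a few behaviors: globs are translated with fnmatch.translate, compiled once in __post_init__, matched against the whole name, and an empty pattern list means "include everything" or "exclude nothing". A minimal sketch of a class with those properties might look like the following. `GlobFilterSketch` is a hypothetical stand-in written for this example, not the actual FunctionFilterCriteria, which has more fields and may differ in detail.

```python
import fnmatch
import re
from dataclasses import dataclass, field


@dataclass
class GlobFilterSketch:
    """Hypothetical stand-in for illustration; not the real FunctionFilterCriteria."""

    include_patterns: list = field(default_factory=list)
    exclude_patterns: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # Compile once at construction; later mutation of the lists is not picked up.
        self._include_regexes = [re.compile(fnmatch.translate(p)) for p in self.include_patterns]
        self._exclude_regexes = [re.compile(fnmatch.translate(p)) for p in self.exclude_patterns]

    def matches_include_patterns(self, name: str) -> bool:
        # No include patterns means every name is included.
        if not self._include_regexes:
            return True
        for regex in self._include_regexes:
            if regex.match(name):
                return True
        return False

    def matches_exclude_patterns(self, name: str) -> bool:
        # No exclude patterns means nothing is excluded.
        if not self._exclude_regexes:
            return False
        for regex in self._exclude_regexes:
            if regex.match(name):
                return True
        return False
```

Because the compiled regexes are stored separately from the pattern lists, appending to exclude_patterns after construction has no effect on matching, which is the behavior the tests assert.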
🌀 Click to see Generated Regression Tests
import pytest # used for our unit tests
from codeflash.languages.base import FunctionFilterCriteria
def test_no_exclude_patterns_returns_false():
# Create a criteria object with default settings (no exclude patterns).
criteria = FunctionFilterCriteria()
# With no compiled exclude regexes, matching any name should return False.
assert criteria.matches_exclude_patterns("anything") is False # 521ns -> 479ns (8.77% faster)
# An empty string should also not match when there are no exclude patterns.
assert criteria.matches_exclude_patterns("") is False # 232ns -> 243ns (4.53% slower)
def test_exact_pattern_matching():
# Exclude a literal name "foo".
criteria = FunctionFilterCriteria(exclude_patterns=["foo"])
# Exact name should match.
assert criteria.matches_exclude_patterns("foo") is True # 2.69μs -> 1.45μs (86.2% faster)
# A longer name that merely contains "foo" should not match an exact pattern.
assert criteria.matches_exclude_patterns("foobar") is False # 1.09μs -> 616ns (76.3% faster)
# A different name should not match.
assert criteria.matches_exclude_patterns("bar") is False # 675ns -> 430ns (57.0% faster)
def test_glob_star_matches_all_and_empty_name():
# Use the glob pattern '*' which should match any string, including empty.
criteria = FunctionFilterCriteria(exclude_patterns=["*"])
# Arbitrary string should match.
assert criteria.matches_exclude_patterns("anything-at-all") is True # 2.48μs -> 1.34μs (85.7% faster)
# Empty string should also match '*' in fnmatch semantics.
assert criteria.matches_exclude_patterns("") is True # 1.12μs -> 535ns (109% faster)
def test_question_mark_and_bracket_wildcards():
# Use '?' to match exactly one character and bracket expression for digits.
criteria = FunctionFilterCriteria(exclude_patterns=["file?.py", "data[0-9].csv"])
# 'file1.py' has exactly one char between 'file' and '.py' -> match.
assert criteria.matches_exclude_patterns("file1.py") is True # 2.62μs -> 1.54μs (70.0% faster)
# 'file12.py' has two chars -> should not match 'file?.py'.
assert criteria.matches_exclude_patterns("file12.py") is False # 1.23μs -> 766ns (59.9% faster)
# 'data7.csv' matches the bracket expression [0-9].
assert criteria.matches_exclude_patterns("data7.csv") is True # 1.32μs -> 739ns (78.5% faster)
# 'data10.csv' has two digits -> should not match '[0-9]'.
assert criteria.matches_exclude_patterns("data10.csv") is False # 880ns -> 490ns (79.6% faster)
def test_literal_special_characters_treated_as_glob_literals():
    # '.' and '+' are regex meta-characters but have no special meaning in glob syntax,
    # so patterns containing them should match those characters literally.
    criteria = FunctionFilterCriteria(exclude_patterns=["a.b", "c+d"])
    # Should match the literal strings containing '.' and '+' respectively.
    assert criteria.matches_exclude_patterns("a.b") is True # 2.35μs -> 1.37μs (71.4% faster)
    assert criteria.matches_exclude_patterns("c+d") is True # 1.27μs -> 707ns (80.3% faster)
    # Similar strings without the literal chars should not match.
    assert criteria.matches_exclude_patterns("aXb") is False # 947ns -> 599ns (58.1% faster)
    assert criteria.matches_exclude_patterns("cXb") is False # 853ns -> 524ns (62.8% faster)
def test_none_name_raises_type_error():
# The function expects a string; passing None should raise a TypeError from re.match.
criteria = FunctionFilterCriteria(exclude_patterns=["*"])
with pytest.raises(TypeError):
# Attempting to match None should raise because regex.match expects a string/bytes-like object.
criteria.matches_exclude_patterns(None) # 4.37μs -> 3.41μs (28.3% faster)
def test_changing_exclude_patterns_after_init_has_no_effect():
    # Demonstrate that exclude_patterns are compiled in __post_init__ and changing the list
    # afterward does not update the precompiled regexes.
    import fnmatch
    import re
    criteria = FunctionFilterCriteria(exclude_patterns=[])
    # Initially there are no exclude regexes, so no name matches.
    assert criteria.matches_exclude_patterns("foo") is False # 496ns -> 443ns (12.0% faster)
    # Mutate the public list after construction.
    criteria.exclude_patterns.append("foo")
    # Because _exclude_regexes were compiled at __post_init__, the new pattern is not compiled,
    # so matching should still return False.
    assert criteria.matches_exclude_patterns("foo") is False # 230ns -> 256ns (10.2% slower)
    # If we explicitly update the private compiled regexes (simulating reinitialization),
    # the result changes, which shows that matching depends only on the compiled state.
    criteria._exclude_regexes = [re.compile(fnmatch.translate("foo"))]
    assert criteria.matches_exclude_patterns("foo") is True # 2.36μs -> 1.28μs (85.3% faster)
def test_many_patterns_one_match_large_scale():
# Create a large list of exclude glob patterns (1000 patterns).
patterns = [f"prefix{i}*" for i in range(1000)]
# Instantiate the criteria which will compile all patterns.
criteria = FunctionFilterCriteria(exclude_patterns=patterns)
# A name that matches the last pattern should be found (tests scalability).
assert criteria.matches_exclude_patterns("prefix999_suffix") is True # 6.38μs -> 3.89μs (64.0% faster)
# A name that matches none of the patterns should not be excluded.
assert criteria.matches_exclude_patterns("no_prefix_here") is False # 121μs -> 106μs (14.7% faster)
def test_many_names_against_single_pattern_performance_and_correctness():
# Use a single exclude pattern and test it against 1000 different names.
criteria = FunctionFilterCriteria(exclude_patterns=["matchme*"])
matches = 0
# Generate 1000 names and count how many match the single pattern.
for i in range(1000):
name = f"matchme{i}" if i % 2 == 0 else f"nomatch{i}"
if criteria.matches_exclude_patterns(name):
matches += 1
# Exactly the even-indexed names should match (500 matches out of 1000).
assert matches == 500
def test_repeated_calls_idempotent_under_load():
# Ensure many repeated calls produce consistent results (idempotency / no stateful mutation).
criteria = FunctionFilterCriteria(exclude_patterns=["x*"])
# Call the method 1000 times and ensure it consistently returns True for a matching name.
for _ in range(1000):
assert criteria.matches_exclude_patterns("x123") is True # 837μs -> 388μs (116% faster)
# And consistently False for a non-matching name.
for _ in range(1000):
assert criteria.matches_exclude_patterns("y123") is False # 623μs -> 337μs (84.4% faster)import pytest
from codeflash.languages.base import FunctionFilterCriteria
class TestBasicFunctionality:
    """Test basic matching behavior with common use cases."""

    def test_no_exclude_patterns_returns_false(self):
        """When exclude_patterns is empty, should always return False."""
        criteria = FunctionFilterCriteria(exclude_patterns=[])
        assert criteria.matches_exclude_patterns("test_function") is False # 510ns -> 459ns (11.1% faster)
        assert criteria.matches_exclude_patterns("any_name") is False # 220ns -> 229ns (3.93% slower)
        assert criteria.matches_exclude_patterns("") is False # 166ns -> 166ns (0.000% faster)

    def test_exact_match_single_pattern(self):
        """Test exact string matching with a single exclude pattern."""
        criteria = FunctionFilterCriteria(exclude_patterns=["test_function"])
        assert criteria.matches_exclude_patterns("test_function") is True # 2.91μs -> 1.68μs (73.2% faster)
        assert criteria.matches_exclude_patterns("other_function") is False # 948ns -> 546ns (73.6% faster)

    def test_multiple_exclude_patterns_one_matches(self):
        """Test that function returns True if any pattern matches."""
        criteria = FunctionFilterCriteria(exclude_patterns=["foo", "bar", "baz"])
        assert criteria.matches_exclude_patterns("foo") is True # 2.67μs -> 1.51μs (76.6% faster)
        assert criteria.matches_exclude_patterns("bar") is True # 1.30μs -> 756ns (71.8% faster)
        assert criteria.matches_exclude_patterns("baz") is True # 1.22μs -> 736ns (66.3% faster)
        assert criteria.matches_exclude_patterns("qux") is False # 1.14μs -> 663ns (71.8% faster)

    def test_glob_pattern_asterisk_prefix(self):
        """Test glob pattern with asterisk prefix (matches suffix)."""
        criteria = FunctionFilterCriteria(exclude_patterns=["*_test"])
        assert criteria.matches_exclude_patterns("my_test") is True # 2.74μs -> 1.53μs (79.5% faster)
        assert criteria.matches_exclude_patterns("function_test") is True # 1.03μs -> 545ns (88.4% faster)
        assert criteria.matches_exclude_patterns("test") is False # 884ns -> 464ns (90.5% faster)
        assert criteria.matches_exclude_patterns("_test_function") is False # 831ns -> 527ns (57.7% faster)

    def test_glob_pattern_asterisk_suffix(self):
        """Test glob pattern with asterisk suffix (matches prefix)."""
        criteria = FunctionFilterCriteria(exclude_patterns=["test_*"])
        assert criteria.matches_exclude_patterns("test_foo") is True # 2.67μs -> 1.37μs (94.5% faster)
        assert criteria.matches_exclude_patterns("test_bar") is True # 976ns -> 489ns (99.6% faster)
        assert criteria.matches_exclude_patterns("test_") is True # 894ns -> 430ns (108% faster)
        assert criteria.matches_exclude_patterns("mytest_foo") is False # 841ns -> 439ns (91.6% faster)

    def test_glob_pattern_asterisk_both_sides(self):
        """Test glob pattern with asterisks on both sides."""
        criteria = FunctionFilterCriteria(exclude_patterns=["*test*"])
        assert criteria.matches_exclude_patterns("test") is True # 2.72μs -> 1.68μs (62.5% faster)
        assert criteria.matches_exclude_patterns("my_test_func") is True # 1.17μs -> 623ns (87.6% faster)
        assert criteria.matches_exclude_patterns("testcase") is True # 901ns -> 442ns (104% faster)
        assert criteria.matches_exclude_patterns("function") is False # 1.10μs -> 772ns (42.7% faster)

    def test_glob_pattern_question_mark(self):
        """Test glob pattern with question mark (matches single char)."""
        criteria = FunctionFilterCriteria(exclude_patterns=["test?"])
        assert criteria.matches_exclude_patterns("test1") is True # 2.44μs -> 1.35μs (80.2% faster)
        assert criteria.matches_exclude_patterns("testA") is True # 964ns -> 485ns (98.8% faster)
        assert criteria.matches_exclude_patterns("test") is False # 850ns -> 464ns (83.2% faster)
        assert criteria.matches_exclude_patterns("test12") is False # 725ns -> 359ns (102% faster)

    def test_glob_pattern_character_class(self):
        """Test glob pattern with character class."""
        criteria = FunctionFilterCriteria(exclude_patterns=["test[123]"])
        assert criteria.matches_exclude_patterns("test1") is True # 2.62μs -> 1.54μs (70.0% faster)
        assert criteria.matches_exclude_patterns("test2") is True # 967ns -> 461ns (110% faster)
        assert criteria.matches_exclude_patterns("test3") is True # 897ns -> 372ns (141% faster)
        assert criteria.matches_exclude_patterns("test4") is False # 826ns -> 460ns (79.6% faster)
        assert criteria.matches_exclude_patterns("testa") is False # 668ns -> 347ns (92.5% faster)

    def test_multiple_patterns_mixed_matching(self):
        """Test with multiple patterns where different ones match."""
        criteria = FunctionFilterCriteria(exclude_patterns=["*_internal", "test_*", "debug*"])
        assert criteria.matches_exclude_patterns("helper_internal") is True # 2.71μs -> 1.64μs (64.7% faster)
        assert criteria.matches_exclude_patterns("test_case") is True # 1.48μs -> 1.01μs (47.2% faster)
        assert criteria.matches_exclude_patterns("debug_mode") is True # 1.38μs -> 895ns (54.0% faster)
        assert criteria.matches_exclude_patterns("public_function") is False # 1.19μs -> 742ns (60.2% faster)
class TestEdgeCases:
    """Test behavior with edge cases and boundary conditions."""

    def test_empty_string_name(self):
        """Test matching empty string against patterns."""
        criteria = FunctionFilterCriteria(exclude_patterns=[""])
        assert criteria.matches_exclude_patterns("") is True # 2.31μs -> 1.31μs (75.9% faster)
        assert criteria.matches_exclude_patterns("any") is False # 927ns -> 487ns (90.3% faster)

    def test_empty_string_pattern(self):
        """Test empty string as exclude pattern."""
        criteria = FunctionFilterCriteria(exclude_patterns=[""])
        assert criteria.matches_exclude_patterns("") is True # 2.13μs -> 1.30μs (64.3% faster)
        # Empty pattern should not match non-empty strings
        assert criteria.matches_exclude_patterns("a") is False # 857ns -> 547ns (56.7% faster)

    def test_special_characters_in_name(self):
        """Test function names with special characters."""
        criteria = FunctionFilterCriteria(exclude_patterns=["test_*"])
        assert criteria.matches_exclude_patterns("test_@func") is True # 2.42μs -> 1.47μs (65.2% faster)
        assert criteria.matches_exclude_patterns("test_#name") is True # 1.08μs -> 549ns (96.4% faster)

    def test_underscore_pattern(self):
        """Test patterns with underscores."""
        criteria = FunctionFilterCriteria(exclude_patterns=["_*"])
        assert criteria.matches_exclude_patterns("_private") is True # 2.44μs -> 1.48μs (64.9% faster)
        assert criteria.matches_exclude_patterns("__dunder__") is True # 926ns -> 484ns (91.3% faster)
        assert criteria.matches_exclude_patterns("public") is False # 862ns -> 493ns (74.8% faster)

    def test_dunder_names(self):
        """Test Python dunder method names."""
        criteria = FunctionFilterCriteria(exclude_patterns=["__*__"])
        assert criteria.matches_exclude_patterns("__init__") is True # 2.53μs -> 1.57μs (60.8% faster)
        assert criteria.matches_exclude_patterns("__str__") is True # 985ns -> 514ns (91.6% faster)
        assert criteria.matches_exclude_patterns("_private") is False # 785ns -> 458ns (71.4% faster)

    def test_very_long_function_name(self):
        """Test with very long function name."""
        long_name = "a" * 1000
        criteria = FunctionFilterCriteria(exclude_patterns=["a*"])
        assert criteria.matches_exclude_patterns(long_name) is True # 2.50μs -> 1.42μs (76.3% faster)

    def test_very_long_pattern(self):
        """Test with very long exclusion pattern."""
        long_pattern = "test_" + "x" * 1000
        criteria = FunctionFilterCriteria(exclude_patterns=[long_pattern])
        assert criteria.matches_exclude_patterns(long_pattern) is True # 3.76μs -> 2.48μs (51.7% faster)
        assert criteria.matches_exclude_patterns("test_" + "x" * 999) is False # 1.02μs -> 524ns (95.0% faster)

    def test_pattern_with_dot(self):
        """Test patterns containing dots."""
        criteria = FunctionFilterCriteria(exclude_patterns=["*.test"])
        # Dots in fnmatch are literal, not regex wildcards
        assert criteria.matches_exclude_patterns("something.test") is True # 2.75μs -> 1.64μs (68.3% faster)
        assert criteria.matches_exclude_patterns("somethingtest") is False # 1.02μs -> 581ns (75.4% faster)

    def test_case_sensitivity(self):
        """Test that matching is case-sensitive."""
        criteria = FunctionFilterCriteria(exclude_patterns=["TestFunction"])
        assert criteria.matches_exclude_patterns("TestFunction") is True # 2.40μs -> 1.35μs (77.3% faster)
        assert criteria.matches_exclude_patterns("testfunction") is False # 958ns -> 550ns (74.2% faster)
        assert criteria.matches_exclude_patterns("TESTFUNCTION") is False # 711ns -> 380ns (87.1% faster)

    def test_pattern_with_brackets(self):
        """Test patterns with square brackets."""
        criteria = FunctionFilterCriteria(exclude_patterns=["func[0-9]"])
        assert criteria.matches_exclude_patterns("func1") is True # 2.61μs -> 1.56μs (67.4% faster)
        assert criteria.matches_exclude_patterns("func9") is True # 918ns -> 419ns (119% faster)
        assert criteria.matches_exclude_patterns("funca") is False # 830ns -> 451ns (84.0% faster)

    def test_single_asterisk_pattern(self):
        """Test single asterisk as pattern (matches any string)."""
        criteria = FunctionFilterCriteria(exclude_patterns=["*"])
        assert criteria.matches_exclude_patterns("anything") is True # 2.44μs -> 1.44μs (69.1% faster)
        assert criteria.matches_exclude_patterns("") is True # 993ns -> 503ns (97.4% faster)
        assert criteria.matches_exclude_patterns("123") is True # 832ns -> 414ns (101% faster)

    def test_pattern_with_hyphen(self):
        """Test patterns with hyphens."""
        criteria = FunctionFilterCriteria(exclude_patterns=["my-function*"])
        assert criteria.matches_exclude_patterns("my-function-test") is True # 2.97μs -> 1.69μs (75.7% faster)
        assert criteria.matches_exclude_patterns("my-function") is True # 1.03μs -> 536ns (93.1% faster)
        assert criteria.matches_exclude_patterns("myfunction") is False # 911ns -> 482ns (89.0% faster)

    def test_many_exclude_patterns(self):
        """Test with many exclude patterns (100+)."""
        patterns = [f"pattern_{i}" for i in range(150)]
        criteria = FunctionFilterCriteria(exclude_patterns=patterns)
        assert criteria.matches_exclude_patterns("pattern_0") is True # 3.37μs -> 1.78μs (89.4% faster)
        assert criteria.matches_exclude_patterns("pattern_75") is True # 11.7μs -> 10.1μs (15.4% faster)
        assert criteria.matches_exclude_patterns("pattern_149") is True # 20.6μs -> 18.2μs (13.3% faster)
        assert criteria.matches_exclude_patterns("pattern_150") is False # 20.2μs -> 18.1μs (11.8% faster)
        assert criteria.matches_exclude_patterns("other") is False # 18.4μs -> 16.2μs (13.9% faster)

    def test_overlapping_patterns(self):
        """Test with overlapping/redundant patterns."""
        criteria = FunctionFilterCriteria(exclude_patterns=["test*", "test_*", "test_func*"])
        assert criteria.matches_exclude_patterns("test_function") is True # 2.75μs -> 1.77μs (55.6% faster)
        assert criteria.matches_exclude_patterns("test") is True # 978ns -> 493ns (98.4% faster)
    def test_pattern_with_escaped_characters(self):
        """Test patterns that might have escaped special chars."""
        # fnmatch.translate will handle these appropriately
        criteria = FunctionFilterCriteria(exclude_patterns=["test\\*"])
        # In fnmatch, backslash is not an escape character, so it is matched literally;
        # the trailing '*' still acts as a wildcard, which the literal '*' in the name satisfies.
        assert criteria.matches_exclude_patterns("test\\*") is True # 2.64μs -> 1.49μs (77.5% faster)
class TestLargeScale:
    """Test performance with large datasets and many patterns."""

    def test_many_patterns_many_names(self):
        """Test matching many names against many patterns."""
        # Create 200 patterns
        patterns = [f"exclude_{i}" for i in range(200)]
        criteria = FunctionFilterCriteria(exclude_patterns=patterns)
        # Test many names, some matching
        for i in range(200):
            assert criteria.matches_exclude_patterns(f"exclude_{i}") is True # 2.75ms -> 2.35ms (17.0% faster)
        # Test names that don't match
        for i in range(200, 250):
            assert criteria.matches_exclude_patterns(f"include_{i}") is False # 1.21ms -> 1.05ms (14.9% faster)

    def test_wildcard_patterns_performance(self):
        """Test performance with wildcard patterns and many function names."""
        patterns = ["exclude_*", "test_*", "debug_*", "_*"]
        criteria = FunctionFilterCriteria(exclude_patterns=patterns)
        # Test many matching names
        for i in range(1000):
            assert criteria.matches_exclude_patterns(f"exclude_{i}") is True # 830μs -> 394μs (111% faster)
        for i in range(1000):
            assert criteria.matches_exclude_patterns(f"test_{i}") is True # 988μs -> 532μs (85.6% faster)

    def test_complex_glob_patterns_performance(self):
        """Test performance with complex glob patterns."""
        patterns = [
            "*_test", "test_*", "*_internal", "_*",
            "debug*", "*debug*", "deprecated*",
            "temp_*", "*_deprecated", "unused_*"
        ]
        criteria = FunctionFilterCriteria(exclude_patterns=patterns)
        # Test 500 names against 10 complex patterns
        for i in range(500):
            if i % 2 == 0:
                assert criteria.matches_exclude_patterns(f"test_func_{i}") is True
            else:
                assert criteria.matches_exclude_patterns(f"real_func_{i}") is False

    def test_many_patterns_with_different_prefixes(self):
        """Test with many patterns using different prefixes."""
        patterns = [f"prefix_{chr(65 + i % 26)}_*" for i in range(100)]
        criteria = FunctionFilterCriteria(exclude_patterns=patterns)
        # Test matching patterns
        for i in range(100):
            char = chr(65 + i % 26)
            assert criteria.matches_exclude_patterns(f"prefix_{char}_func_{i}") is True # 258μs -> 192μs (34.7% faster)
        # Test non-matching
        assert criteria.matches_exclude_patterns("nomatch_func") is False # 12.8μs -> 10.6μs (20.0% faster)

    def test_nested_glob_patterns_performance(self):
        """Test with deeply nested glob patterns."""
        patterns = ["a*", "*b", "a*b", "*a*b*", "a?b*", "*a?b*"]
        criteria = FunctionFilterCriteria(exclude_patterns=patterns)
        # Test 300 variations
        for i in range(300):
            result = criteria.matches_exclude_patterns(f"a_value_b_{i}") # 249μs -> 118μs (111% faster)
            # Should match due to "a*b" pattern
            assert result is True

    def test_all_single_char_patterns(self):
        """Test with patterns for all single characters."""
        # Create patterns for each letter and digit
        patterns = list("abcdefghijklmnopqrstuvwxyz") + list("0123456789")
        criteria = FunctionFilterCriteria(exclude_patterns=patterns)
        # Each single-char name should match
        for char in patterns:
            assert criteria.matches_exclude_patterns(char) is True # 113μs -> 84.0μs (35.0% faster)
        # Multi-char names starting with those chars won't match (exact match)
        for char in patterns[:10]:
            assert criteria.matches_exclude_patterns(char + "extra") is False # 50.1μs -> 41.3μs (21.4% faster)

    def test_wildcard_only_patterns_many_names(self):
        """Test single wildcard pattern against many names."""
        criteria = FunctionFilterCriteria(exclude_patterns=["*"])
        # All names should match single wildcard
        for i in range(1000):
            assert criteria.matches_exclude_patterns(f"func_{i}") is True # 820μs -> 386μs (112% faster)

    def test_incremental_pattern_matching(self):
        """Test that pattern matching remains consistent across many calls."""
        patterns = ["test_*", "debug_*", "*_internal", "_*"]
        criteria = FunctionFilterCriteria(exclude_patterns=patterns)
        test_names = [
            "test_function", "debug_mode", "helper_internal", "_private",
            "public_function", "my_function", "test_debug_case"
        ]
        # Run matching 100 times to ensure consistency
        for _ in range(100):
            assert criteria.matches_exclude_patterns("test_function") is True # 86.0μs -> 41.3μs (108% faster)
            assert criteria.matches_exclude_patterns("public_function") is False
class TestIntegration:
    """Test integration with dataclass features and initialization."""

    def test_post_init_compiles_regexes(self):
        """Verify that __post_init__ properly compiles regex patterns."""
        patterns = ["test_*", "*_internal"]
        criteria = FunctionFilterCriteria(exclude_patterns=patterns)
        # After post_init, _exclude_regexes should be populated
        assert len(criteria._exclude_regexes) == 2 # 2.53μs -> 1.47μs (72.3% faster)
        assert criteria.matches_exclude_patterns("test_func") is True

    def test_dataclass_default_factory_excludes(self):
        """Test that default exclude_patterns is empty list."""
        criteria = FunctionFilterCriteria()
        assert criteria.exclude_patterns == [] # 499ns -> 463ns (7.78% faster)
        assert criteria.matches_exclude_patterns("anything") is False

    def test_multiple_criteria_instances_independent(self):
        """Test that multiple FunctionFilterCriteria instances are independent."""
        criteria1 = FunctionFilterCriteria(exclude_patterns=["test_*"])
        criteria2 = FunctionFilterCriteria(exclude_patterns=["debug_*"])
        assert criteria1.matches_exclude_patterns("test_func") is True # 2.45μs -> 1.47μs (66.4% faster)
        assert criteria1.matches_exclude_patterns("debug_func") is False # 941ns -> 593ns (58.7% faster)
        assert criteria2.matches_exclude_patterns("test_func") is False # 671ns -> 351ns (91.2% faster)
        assert criteria2.matches_exclude_patterns("debug_func") is True # 960ns -> 500ns (92.0% faster)

    def test_initialization_with_other_parameters(self):
        """Test that matches_exclude_patterns works regardless of other parameters."""
        criteria = FunctionFilterCriteria(
            include_patterns=["include_*"],
            exclude_patterns=["exclude_*"],
            require_return=False,
            require_export=False,
            include_async=False,
            include_methods=False,
            min_lines=5,
            max_lines=100
        )
        assert criteria.matches_exclude_patterns("exclude_func") is True # 2.63μs -> 1.66μs (58.5% faster)
        assert criteria.matches_exclude_patterns("include_func") is False # 959ns -> 488ns (96.5% faster)
from codeflash.languages.base import FunctionFilterCriteria
def test_FunctionFilterCriteria_matches_exclude_patterns():
    FunctionFilterCriteria.matches_exclude_patterns(FunctionFilterCriteria(include_patterns=[], exclude_patterns=[''], require_return=False, require_export=False, include_async=True, include_methods=False, min_lines=0, max_lines=0), '')


def test_FunctionFilterCriteria_matches_exclude_patterns_2():
    FunctionFilterCriteria.matches_exclude_patterns(FunctionFilterCriteria(include_patterns=[], exclude_patterns=[], require_return=False, require_export=False, include_async=True, include_methods=False, min_lines=0, max_lines=0), '')
🔎 Click to see Concolic Coverage Tests
To test or edit this optimization locally, run: git merge codeflash/optimize-pr1199-2026-03-13T04.01.54
| if not self._exclude_regexes: | |
| return False | |
| return any(regex.match(name) for regex in self._exclude_regexes) | |
| for regex in self._exclude_regexes: | |
| if regex.match(name): | |
| return True | |
| return False |
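The suggested change above replaces an `any(...)` generator expression with an explicit loop over the precompiled regexes. A minimal sketch of both forms follows, assuming the patterns are translated with `fnmatch.translate` and compiled once in `__post_init__` (as the generated tests describe). `ExcludeFilter`, `matches_any`, and `matches_loop` are hypothetical names used only for illustration; this is not the actual `FunctionFilterCriteria` source.

```python
from __future__ import annotations

import fnmatch
import re
from dataclasses import dataclass, field


@dataclass
class ExcludeFilter:
    """Hypothetical stand-in for FunctionFilterCriteria; only exclude matching is modelled."""

    exclude_patterns: list[str] = field(default_factory=list)
    _exclude_regexes: list[re.Pattern[str]] = field(default_factory=list, init=False, repr=False)

    def __post_init__(self) -> None:
        # Translate each glob into a regex once, at construction time.
        self._exclude_regexes = [re.compile(fnmatch.translate(p)) for p in self.exclude_patterns]

    def matches_any(self, name: str) -> bool:
        # Generator-based form: creates a generator object on every call.
        return any(regex.match(name) for regex in self._exclude_regexes)

    def matches_loop(self, name: str) -> bool:
        # Explicit-loop form: same result, without the generator and any() call.
        for regex in self._exclude_regexes:
            if regex.match(name):
                return True
        return False
```

Both methods return the same answer for every input; the explicit loop only avoids per-call overhead. Because the regexes are compiled at construction, appending to `exclude_patterns` afterwards does not change the result of either method.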
⚡️ Codeflash found optimizations for this PR
📄 10% (0.10x) speedup for
|
| original_sqlite = get_run_tmp_file(Path("test_return_values_0.sqlite")) | ||
| candidate_sqlite = get_run_tmp_file(Path(f"test_return_values_{optimization_candidate_index}.sqlite")) |
⚡️Codeflash found 34% (0.34x) speedup for JavaFunctionOptimizer.compare_candidate_results in codeflash/languages/java/function_optimizer.py
⏱️ Runtime : 8.56 milliseconds → 6.39 milliseconds (best of 119 runs)
📝 Explanation and details
The optimization caches tmpdir_path after the first call to get_run_tmp_file instead of calling it twice per invocation, then constructs Path objects directly via division (tmpdir_path / "test_return_values_0.sqlite"). Line profiler shows get_run_tmp_file dropped from 15.2 ms (1949 hits) to 4.5 ms (505 hits), and compare_candidate_results total time fell from 33.6 ms to 21.9 ms, yielding a 34% runtime speedup with no behavioral changes.
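The caching idea described above can be sketched in isolation. The snippet below assumes a helper that lazily creates one temporary directory and stores it on the function object, which is how the generated tests describe `get_run_tmp_file`. `get_tmp_file`, `result_paths_two_calls`, and `result_paths_cached` are hypothetical stand-ins, not codeflash functions.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def get_tmp_file(file_path: Path) -> Path:
    """Hypothetical stand-in for get_run_tmp_file: one lazily created temp dir per process."""
    if not hasattr(get_tmp_file, "tmpdir"):
        # Keep the TemporaryDirectory object on the function so it stays alive.
        get_tmp_file.tmpdir = tempfile.TemporaryDirectory(prefix="sketch_")
        get_tmp_file.tmpdir_path = Path(get_tmp_file.tmpdir.name)
    return get_tmp_file.tmpdir_path / file_path


def result_paths_two_calls(index: int) -> tuple[Path, Path]:
    # Before: two helper calls, each repeating the hasattr check and wrapping the name in a Path.
    return (
        get_tmp_file(Path("test_return_values_0.sqlite")),
        get_tmp_file(Path(f"test_return_values_{index}.sqlite")),
    )


def result_paths_cached(index: int) -> tuple[Path, Path]:
    # After: make sure the directory is initialised once, then join file names directly.
    if not hasattr(get_tmp_file, "tmpdir_path"):
        get_tmp_file(Path("test_return_values_0.sqlite"))
    tmpdir_path = get_tmp_file.tmpdir_path
    return (
        tmpdir_path / "test_return_values_0.sqlite",
        tmpdir_path / f"test_return_values_{index}.sqlite",
    )
```

Both functions yield identical paths; the cached variant simply reads the stored directory once instead of going through the helper twice.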
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 518 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Click to see Generated Regression Tests
import os
import shutil
import tempfile
from pathlib import Path

import codeflash.verification.equivalence as equivalence_module
import pytest # used for our unit tests
# Import the real classes and functions from the project under test
from codeflash.code_utils.code_utils import get_run_tmp_file
from codeflash.languages.java.function_optimizer import JavaFunctionOptimizer
from codeflash.models.models import (OriginalCodeBaseline, TestDiff,
                                     TestDiffScope, TestResults)
from codeflash.verification.equivalence import compare_test_results


# Helper to create a "bare" JavaFunctionOptimizer instance without invoking its heavy __init__
# We use __new__ to allocate the instance and then set only the attributes needed by compare_candidate_results.
def make_optimizer_with_attrs(project_root: Path, language_support_obj=None) -> JavaFunctionOptimizer:
    # Create instance without running __init__
    opt = JavaFunctionOptimizer.__new__(JavaFunctionOptimizer)
    # The method under test only needs .project_root and .language_support attributes.
    opt.project_root = project_root
    opt.language_support = language_support_obj
    return opt
def test_compare_candidate_results_fallback_empty_results_returns_false_and_no_diffs(tmp_path: Path):
    """
    Basic test:
    When there are no temporary sqlite result files present, the function should
    fall back to the in-memory compare_test_results implementation. If both
    baseline and candidate TestResults are empty, compare_test_results should
    indicate they are not equivalent (False) and return an empty diff list.
    """
    # Prepare a small project_root for the optimizer instance
    project_root = tmp_path
    # Create a JavaFunctionOptimizer instance with minimal attributes.
    # language_support is not used in the non-sqlite path, so set to None.
    optimizer = make_optimizer_with_attrs(project_root=project_root, language_support_obj=None)
    # Construct baseline OriginalCodeBaseline and candidate TestResults with empty lists.
    # OriginalCodeBaseline requires several fields; provide minimal valid values.
    baseline = OriginalCodeBaseline(
        behavior_test_results=TestResults(test_results=[], test_result_idx={}),
        benchmarking_test_results=TestResults(test_results=[], test_result_idx={}),
        line_profile_results={},
        runtime=0,
        coverage_results=None,
    )
    candidate_results = TestResults(test_results=[], test_result_idx={})
    # Ensure no temp sqlite files exist (force a clean temp dir for get_run_tmp_file)
    # get_run_tmp_file uses an internal TemporaryDirectory stored on the function object.
    # The files it checks are:
    # - test_return_values_0.sqlite
    # - test_return_values_{optimization_candidate_index}.sqlite
    # We'll ensure neither exists.
    orig_sqlite = get_run_tmp_file(Path("test_return_values_0.sqlite"))
    cand_sqlite = get_run_tmp_file(Path("test_return_values_1.sqlite"))
    orig_sqlite.unlink(missing_ok=True)
    cand_sqlite.unlink(missing_ok=True)
    # Call the method under test. Since the files do not exist, it will call the
    # fallback compare_test_results (equivalence.compare_test_results) with the
    # provided TestResults objects.
    matched, diffs = optimizer.compare_candidate_results(baseline, candidate_results, optimization_candidate_index=1) # 16.7μs -> 13.5μs (23.8% faster)
    # Expectation: both TestResults are empty => not equivalent and no diffs
    assert matched is False, "Empty test results should not be considered equivalent"
    assert isinstance(diffs, list), "diffs should be a list"
    assert diffs == [], "Expected an empty list of diffs when both test results are empty"
def test_compare_candidate_results_sqlite_branch_calls_language_support_and_cleans_candidate(tmp_path: Path):
    """
    Edge test:
    When temporary sqlite files exist, JavaFunctionOptimizer.compare_candidate_results should
    call self.language_support.compare_test_results(original_sqlite, candidate_sqlite, project_root=...)
    and afterward remove the candidate sqlite file. We patch a real module function (not a Mock object)
    on the equivalence module and reuse the module object as the language_support to satisfy the call.
    """
    # Prepare a project_root and optimizer instance.
    project_root = tmp_path
    optimizer = make_optimizer_with_attrs(project_root=project_root, language_support_obj=None)
    # We'll monkeypatch an attribute on the equivalence_module (a real module object)
    # to act as the language_support implementation. This avoids creating mock classes
    # or SimpleNamespace objects, and uses a real module object as the attribute holder.
    saved_language_support_compare = getattr(equivalence_module, "compare_test_results", None)
    # Prepare two temp sqlite files using get_run_tmp_file (same mechanism used by the implementation)
    original_sqlite = get_run_tmp_file(Path("test_return_values_0.sqlite"))
    candidate_index = 42
    candidate_sqlite = get_run_tmp_file(Path(f"test_return_values_{candidate_index}.sqlite"))
    try:
        # Ensure both files exist on disk. Write minimal content so they are real files.
        original_sqlite.parent.mkdir(parents=True, exist_ok=True)
        original_sqlite.write_bytes(b"orig-sqlite")
        candidate_sqlite.write_bytes(b"candidate-sqlite")

        # Define a simple replacement function that matches the call signature used by compare_candidate_results
        # (original_sqlite: Path, candidate_sqlite: Path, project_root: Path | None)
        # Return a deterministic result that we can assert is propagated back.
        def patched_compare(original_path: Path, candidate_path: Path, project_root: Path | None = None):
            # Check that the function receives the expected paths
            assert original_path == original_sqlite
            assert candidate_path == candidate_sqlite
            # Return a True match and a single TestDiff item.
            td = TestDiff(scope=TestDiffScope.DID_PASS, original_pass=True, candidate_pass=True)
            return True, [td]

        # Monkeypatch the module's compare_test_results and set the optimizer's language_support
        equivalence_module.compare_test_results = patched_compare
        optimizer.language_support = equivalence_module # module has the patched function
        # Construct a minimal baseline and candidate to pass into compare_candidate_results.
        # They will be ignored because the sqlite files exist and the sqlite branch is taken.
        baseline = OriginalCodeBaseline(
            behavior_test_results=TestResults(test_results=[], test_result_idx={}),
            benchmarking_test_results=TestResults(test_results=[], test_result_idx={}),
            line_profile_results={},
            runtime=0,
            coverage_results=None,
        )
        candidate_results = TestResults(test_results=[], test_result_idx={})
        # Call the function under test.
        matched, diffs = optimizer.compare_candidate_results(
            baseline, candidate_results, optimization_candidate_index=candidate_index
        )
        # Validate that the patched function's returned values were propagated.
        assert matched is True, "The patched language_support.compare_test_results should determine match=True"
        assert isinstance(diffs, list) and len(diffs) == 1, "We expected a single TestDiff returned by patched function"
        # Candidate sqlite file should have been removed by compare_candidate_results
        assert not candidate_sqlite.exists(), "Candidate sqlite file should be unlinked (deleted) after comparison"
        # Original sqlite file should remain (implementation only unlinks candidate_sqlite)
        assert original_sqlite.exists(), "Original sqlite file should remain after comparison"
    finally:
        # Restore the original module function to avoid side effects on other tests
        if saved_language_support_compare is None:
            # delete our attribute
            try:
                del equivalence_module.compare_test_results
            except Exception:
                pass
        else:
            equivalence_module.compare_test_results = saved_language_support_compare
        # Cleanup any files if still present
        original_sqlite.unlink(missing_ok=True)
        candidate_sqlite.unlink(missing_ok=True)
def test_compare_candidate_results_many_iterations_sqlite_cleanup_and_invocations(tmp_path: Path):
"""
Large-scale test:
Repeatedly create candidate sqlite files for increasing optimization indices and call
compare_candidate_results to ensure the sqlite-branch remains deterministic and performs
cleanup reliably across many iterations.
We reuse the same patched compare function as in the edge test but do many iterations
(up to 1000) to exercise looped behavior and filesystem churn.
"""
# Choose a number of iterations up to 1000 (as requested). Keep reasonably fast for CI.
ITERATIONS = 500 # 500 is within requested bounds and is a substantial stress test
project_root = tmp_path
optimizer = make_optimizer_with_attrs(project_root=project_root, language_support_obj=None)
# Setup the original sqlite file that should exist for every iteration
original_sqlite = get_run_tmp_file(Path("test_return_values_0.sqlite"))
original_sqlite.parent.mkdir(parents=True, exist_ok=True)
original_sqlite.write_bytes(b"orig-sqlite")
# Patch the equivalence module compare_test_results similarly to the previous test
saved_compare = getattr(equivalence_module, "compare_test_results", None)
call_count = 0
def patched_compare_count(original_path: Path, candidate_path: Path, project_root: Path | None = None):
nonlocal call_count
# Basic sanity checks about inputs
assert original_path == original_sqlite
assert candidate_path.exists()
call_count += 1
# Always return False with no diffs (simulate a mismatch)
return False, []
try:
equivalence_module.compare_test_results = patched_compare_count
optimizer.language_support = equivalence_module
baseline = OriginalCodeBaseline(
behavior_test_results=TestResults(test_results=[], test_result_idx={}),
benchmarking_test_results=TestResults(test_results=[], test_result_idx={}),
line_profile_results={},
runtime=0,
coverage_results=None,
)
candidate_results = TestResults(test_results=[], test_result_idx={})
for i in range(ITERATIONS):
# Create candidate sqlite file for this iteration
candidate_sqlite = get_run_tmp_file(Path(f"test_return_values_{i}.sqlite"))
candidate_sqlite.write_bytes(b"candidate-sqlite")
# Call method; since files exist, patched_compare_count must be invoked
matched, diffs = optimizer.compare_candidate_results(
baseline, candidate_results, optimization_candidate_index=i
)
# Our patched function returns False and no diffs
assert matched is False
assert diffs == []
# Candidate file should have been removed after call
assert not candidate_sqlite.exists(), f"Candidate sqlite file for index {i} should have been removed"
# Ensure patched function was called ITERATIONS times
assert call_count == ITERATIONS
finally:
# Restore original compare function
if saved_compare is None:
try:
del equivalence_module.compare_test_results
except Exception:
pass
else:
equivalence_module.compare_test_results = saved_compare
    # Cleanup the original sqlite file
    original_sqlite.unlink(missing_ok=True)
To test or edit this optimization locally: git merge codeflash/optimize-pr1199-2026-03-13T06.49.27
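The generated test above saves and restores compare_test_results by hand in a try/finally block. As a hedged alternative, the same patch-and-restore could be sketched with unittest.mock.patch.object, which undoes the patch automatically when the block exits. The equivalence_module below is a hypothetical stand-in (a SimpleNamespace), and the ITERATIONS value is assumed; neither is taken from the codeflash code.

```python
import types
from unittest import mock

# Hypothetical stand-in for the real equivalence module (assumption for this sketch).
equivalence_module = types.SimpleNamespace(
    compare_test_results=lambda original_path, candidate_path, project_root=None: (True, [])
)
saved_compare = equivalence_module.compare_test_results
ITERATIONS = 3  # assumed value; the real test defines its own constant

# patch.object swaps the attribute for a mock that always reports a mismatch,
# and puts the original back when the with-block exits, even on failure.
with mock.patch.object(
    equivalence_module, "compare_test_results", return_value=(False, [])
) as fake_compare:
    results = [
        equivalence_module.compare_test_results(f"original_{i}", f"candidate_{i}")
        for i in range(ITERATIONS)
    ]
    calls_seen = fake_compare.call_count  # the mock counts invocations for us

# After the block, the attribute is the original function again.
restored = equivalence_module.compare_test_results is saved_compare
```

The mock's call_count replaces the hand-rolled nonlocal counter; the input assertions from patched_compare_count would need to be re-added via a side_effect function if they matter.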
| original_sqlite = get_run_tmp_file(Path("test_return_values_0.sqlite")) | |
| candidate_sqlite = get_run_tmp_file(Path(f"test_return_values_{optimization_candidate_index}.sqlite")) | |
| # Cache tmpdir_path to avoid repeated initialization checks | |
| if not hasattr(get_run_tmp_file, "tmpdir_path"): | |
| get_run_tmp_file(Path("test_return_values_0.sqlite")) | |
| tmpdir_path = get_run_tmp_file.tmpdir_path | |
| original_sqlite = tmpdir_path / "test_return_values_0.sqlite" | |
| candidate_sqlite = tmpdir_path / f"test_return_values_{optimization_candidate_index}.sqlite" |
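The suggested change above relies on get_run_tmp_file caching its directory as a tmpdir_path attribute on the function object. A minimal sketch of that pattern follows, assuming a simplified get_run_tmp_file; the real helper in codeflash may differ, and sqlite_paths is a hypothetical name used only for illustration.

```python
import tempfile
from pathlib import Path


def get_run_tmp_file(file_path: Path) -> Path:
    """Simplified stand-in (assumption): resolve file_path inside a per-run temp dir."""
    # Create the directory once and cache it as attributes on the function itself.
    if not hasattr(get_run_tmp_file, "tmpdir_path"):
        # Keep a reference to the TemporaryDirectory so it is not cleaned up early.
        get_run_tmp_file.tmpdir = tempfile.TemporaryDirectory(prefix="codeflash_")
        get_run_tmp_file.tmpdir_path = Path(get_run_tmp_file.tmpdir.name)
    return get_run_tmp_file.tmpdir_path / file_path


def sqlite_paths(optimization_candidate_index: int) -> tuple[Path, Path]:
    """Hypothetical helper mirroring the suggested change."""
    # Populate the cache if needed, then join paths directly instead of
    # calling get_run_tmp_file once per file.
    if not hasattr(get_run_tmp_file, "tmpdir_path"):
        get_run_tmp_file(Path("test_return_values_0.sqlite"))
    tmpdir_path = get_run_tmp_file.tmpdir_path
    original_sqlite = tmpdir_path / "test_return_values_0.sqlite"
    candidate_sqlite = tmpdir_path / f"test_return_values_{optimization_candidate_index}.sqlite"
    return original_sqlite, candidate_sqlite
```

Both approaches yield the same paths; the cached version only skips the repeated initialization check inside the helper.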
⚡️ Codeflash found optimizations for this PR📄 23% (0.23x) speedup for
|
⚡️ Codeflash found optimizations for this PR📄 11% (0.11x) speedup for
|
Merge Conflict Resolution: Full Validation Report
This PR resolves 7 merge conflicts between
1. CI Checks
All meaningful CI checks pass on this branch:
Results: 22/25 pass
2. Full Local Test Suite
uv run pytest tests/ -x --timeout=120
Result: 3558 passed, 56 skipped, 0 failures (294.49s)
This covers all unit tests, integration tests, language-specific tests (Python, JS/TS, Java), setup tests, and discovery tests. Zero import errors, zero regressions.
3. Import Verification for All Conflicted Modules
Each conflicted file involved import restructuring. Verified that all import paths resolve correctly:
# Python init (cmd_init.py modular imports)
from codeflash.cli_cmds.cmd_init import init_codeflash, collect_setup_info
# ✅ OK
# Java init (init_java.py lazy imports repointed to new modules)
from codeflash.cli_cmds.init_java import init_java_project, collect_java_setup_info, JavaSetupInfo
# ✅ OK
# JS/TS init (init_javascript.py — ProjectLanguage enum, detect_project_language)
from codeflash.cli_cmds.init_javascript import init_js_project, detect_project_language, ProjectLanguage
# ✅ OK
# testgen review/repair endpoints (aiservice.py)
from codeflash.api.aiservice import AiServiceClient
assert hasattr(AiServiceClient, 'review_generated_tests')
assert hasattr(AiServiceClient, 'repair_generated_tests')
# ✅ OK
# GitHub workflow Java support (ported from omni-java's cmd_init.py into main's github_workflow.py)
from codeflash.cli_cmds.github_workflow import install_github_actions, detect_project_language_for_workflow
# ✅ OK
All 5 import checks pass.
4. Java E2E: Fibonacci Optimization
cd code_to_optimize/java/
export CODEFLASH_CFAPI_SERVER="local"
export CODEFLASH_AIS_SERVER="local"
uv run codeflash --file src/main/java/com/example/Fibonacci.java --function fibonacci --verbose --no-pr
Result: PASS
5. Java E2E: Aerospike encodedLength
cd aerospike-client-java/
export CODEFLASH_CFAPI_SERVER="local"
export CODEFLASH_AIS_SERVER="local"
uv run codeflash --file client/src/com/aerospike/client/util/Utf8.java --function encodedLength --verbose
Result: PASS (pipeline correct; no optimization accepted because the speedup was too small)
6. Python E2E: BubbleSort Sorter
cd code_to_optimize/
export CODEFLASH_CFAPI_SERVER="local"
export CODEFLASH_AIS_SERVER="local"
uv run --no-project codeflash/main.py --file bubble_sort.py --function sorter --no-pr --verbose --tests-root tests --module-root .
Result: PARTIAL PASS (pipeline correct; AI test quality issue, not merge-related)
7. JavaScript E2E: Fibonacci (CommonJS)
cd code_to_optimize/js/code_to_optimize_js_cjs/
export CODEFLASH_CFAPI_SERVER="local"
export CODEFLASH_AIS_SERVER="local"
uv run --no-project codeflash/main.py --file fibonacci.js --function fibonacci --no-pr --verbose --yes
Result: PASS (full pipeline ran end-to-end)
Validation Evidence Summary
Conclusion: All 7 conflict resolutions validated. No regressions found. All three language pipelines (Python, Java, JavaScript) confirmed working end-to-end. This PR is ready for merge.