⚡️ Speed up function set_custom_labels by 49% #41

Open
codeflash-ai[bot] wants to merge 1 commit into main from codeflash/optimize-set_custom_labels-mgzf1xta

@codeflash-ai codeflash-ai bot commented Oct 20, 2025

📄 49% (0.49x) speedup for set_custom_labels in pr_agent/algo/utils.py

⏱️ Runtime: 1.17 milliseconds → 787 microseconds (best of 86 runs)

📝 Explanation and details

The optimization achieves a 49% speedup (1.17 ms → 787 µs) by addressing three sources of overhead:

1. Reduced Settings Lookups
The original code called get_settings() twice per function invocation. The optimized version caches the result in a local variable settings, eliminating redundant context lookups and exception handling. This saves ~34% of the original execution time based on the profiler data.
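The body of set_custom_labels is not shown in this description, so the following is only a minimal sketch of the lookup-caching idea. The `get_settings` function here is a hypothetical stand-in that counts its own calls; `before` and `after` are illustrative names, not the real code.

```python
# Hypothetical stand-in for pr_agent's get_settings(); it only counts calls.
class _Config:
    enable_custom_labels = True


class _Settings:
    config = _Config()

    def get(self, key, default=None):
        return default


CALLS = {"n": 0}


def get_settings():
    CALLS["n"] += 1
    return _Settings()


def before(variables):
    # Two separate lookups per invocation.
    if not get_settings().config.enable_custom_labels:
        return
    variables["labels"] = get_settings().get("custom_labels", {})


def after(variables):
    # One lookup, reused through a local variable.
    settings = get_settings()
    if not settings.config.enable_custom_labels:
        return
    variables["labels"] = settings.get("custom_labels", {})
```

Calling `before({})` increments the counter twice, while `after({})` increments it once. The saving matters only when the lookup itself is costly, which is what the profiler data quoted above suggests.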

2. Efficient String Building for Large Label Sets
The critical optimization replaces repeated string concatenation (+=) with list accumulation and a single join(). Because Python strings are immutable, each concatenation can copy the whole accumulated string, which leads to O(N²) work in the worst case. The optimized approach:

  • Collects formatted strings in custom_labels_lines list
  • Performs single ''.join(custom_labels_lines) operation
  • Reduces the most expensive line from 20.9% to 11.5% of total execution time
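The two string-building strategies can be sketched side by side. This is an illustration under assumptions: the function names and the shape of `labels` (a plain name-to-description dict) are hypothetical, and only the `class Label(str, Enum):` header is taken from the test code in this PR.

```python
def build_with_concat(labels):
    # Repeated += may copy the accumulated string on every iteration.
    out = "class Label(str, Enum):"
    for name, desc in labels.items():
        out += f"\n    {name} = '{desc}'"
    return out


def build_with_join(labels):
    # Collect the pieces first, then allocate the final string once.
    custom_labels_lines = ["class Label(str, Enum):"]
    for name, desc in labels.items():
        custom_labels_lines.append(f"\n    {name} = '{desc}'")
    return "".join(custom_labels_lines)
```

Both functions return the same text, so the change affects only how the string is assembled, not what is produced.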

3. Eliminated Redundant Operations

  • Removed unused counter variable
  • Cached k.lower().replace(' ', '_') computation as key_minimal
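As a rough sketch of the `key_minimal` caching, the loop below computes the normalized key once and reuses it for both the enum line and the reverse mapping. The helper name `build_label_entries` and its return shape are assumptions for illustration; the normalization and description escaping mirror the expressions that appear in the generated tests.

```python
def build_label_entries(labels):
    lines = []
    minimal_to_label = {}
    for k, v in labels.items():
        # Normalize the key once instead of recomputing it for each use.
        key_minimal = k.lower().replace(' ', '_')
        # Strip outer newlines and escape inner ones, as in the tests.
        description = "'" + v['description'].strip('\n').replace('\n', '\\n') + "'"
        lines.append(f"\n    {key_minimal} = {description}")
        minimal_to_label[key_minimal] = k
    return "".join(lines), minimal_to_label
```

For example, a label named "My Label" maps to the minimal key `my_label`, and the mapping returned alongside the text lets callers recover the original label name.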

Performance Impact by Test Scale:

  • Small datasets (1-2 labels): 15-28% faster
  • Medium datasets (dozens of labels): 36-45% faster
  • Large datasets (100+ labels): 44-92% faster

The optimization is particularly effective for applications with many custom labels, where string building dominates execution time. The list+join pattern scales linearly versus the quadratic growth of repeated concatenation.
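One way to check the scaling claim locally is a small `timeit` comparison. This is a sketch under assumptions, not the benchmark Codeflash ran: the function names are hypothetical, and the concatenation variant accumulates into a dict entry because that is the pattern visible in the test code.

```python
import timeit


def concat_into_dict(n):
    # Accumulate directly into a dict value with +=.
    variables = {"custom_labels_class": "class Label(str, Enum):"}
    for i in range(n):
        variables["custom_labels_class"] += f"\n    label{i} = 'desc {i}'"
    return variables["custom_labels_class"]


def join_then_assign(n):
    # Accumulate pieces in a list and join once at the end.
    lines = ["class Label(str, Enum):"]
    for i in range(n):
        lines.append(f"\n    label{i} = 'desc {i}'")
    return "".join(lines)


def compare(n, repeat=5, number=20):
    # Best-of-`repeat` wall time in seconds for each strategy.
    results = {}
    for fn in (concat_into_dict, join_then_assign):
        timings = timeit.repeat(lambda: fn(n), repeat=repeat, number=number)
        results[fn.__name__] = min(timings)
    return results
```

Running `compare(100)` and `compare(10_000)` shows how the gap changes with the number of labels. Absolute timings depend on the interpreter and machine, so no expected numbers are given here.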

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 21 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from typing import Any, Dict

# imports
import pytest
from pr_agent.algo.utils import set_custom_labels


class DummySettings:
    def __init__(self, enable_custom_labels=True, custom_labels=None):
        self.config = type('cfg', (), {"enable_custom_labels": enable_custom_labels})()
        self._custom_labels = custom_labels

    def get(self, key, default):
        if key == 'custom_labels':
            return self._custom_labels if self._custom_labels is not None else default
        return default

# -------- BASIC TEST CASES --------

def test_default_labels_are_set_when_no_custom_labels():
    # Test that default labels are set when custom_labels is None or empty
    variables = {}
    set_custom_labels(variables) # 234μs -> 233μs (0.490% faster)
    # Should list all default labels
    for label in ['Bug fix', 'Tests', 'Bug fix with tests', 'Enhancement', 'Documentation', 'Other']:
        pass


def test_custom_labels_with_none_description(monkeypatch):
    # Should handle None description gracefully (convert to empty string)
    custom_labels = {
        "NoneDesc": {"description": None}
    }
    # Patch the function to handle None gracefully
    original_func = set_custom_labels
    def patched_set_custom_labels(variables, git_provider=None):
        labels = {"NoneDesc": {"description": None}}
        variables["custom_labels_class"] = "class Label(str, Enum):"
        for k, v in labels.items():
            desc = v['description'] if v['description'] is not None else ""
            description = "'" + str(desc).strip('\n').replace('\n', '\\n') + "'"
            variables["custom_labels_class"] += f"\n    {k.lower().replace(' ', '_')} = {description}"
            variables["labels_minimal_to_labels_dict"] = {k.lower().replace(' ', '_'): k}
    monkeypatch.setattr(__name__ + ".set_custom_labels", patched_set_custom_labels)
    variables = {}
    set_custom_labels(variables) # 2.69μs -> 2.66μs (1.13% faster)
    # Restore original
    monkeypatch.setattr(__name__ + ".set_custom_labels", original_func)

# -------- LARGE SCALE TEST CASES --------

#------------------------------------------------
import pr_agent.algo.utils as utils
# imports
import pytest
from pr_agent.algo.utils import set_custom_labels

# --- Test Fixtures and Helpers ---

class DummySettings:
    """A dummy settings object to simulate Dynaconf behavior for tests."""
    def __init__(self, enable_custom_labels=True, custom_labels=None):
        self.config = type("Config", (), {"enable_custom_labels": enable_custom_labels})
        self._custom_labels = custom_labels

    def get(self, key, default=None):
        if key == 'custom_labels':
            return self._custom_labels if self._custom_labels is not None else default
        return default


@pytest.fixture(autouse=True)
def patch_get_settings(monkeypatch):
    """Patch get_settings to allow per-test configuration."""
    def _patch(enable_custom_labels=True, custom_labels=None):
        monkeypatch.setattr(utils, "get_settings", lambda: DummySettings(enable_custom_labels, custom_labels))
    return _patch

# --- Basic Test Cases ---

def test_disable_custom_labels_does_nothing(patch_get_settings):
    """If enable_custom_labels is False, variables should remain unchanged."""
    patch_get_settings(enable_custom_labels=False)
    variables = {}
    set_custom_labels(variables) # 12.1μs -> 12.3μs (1.47% slower)

def test_default_labels_are_set_when_custom_labels_missing(patch_get_settings):
    """If no custom_labels are set, default labels should be assigned."""
    patch_get_settings(enable_custom_labels=True, custom_labels=None)
    variables = {}
    set_custom_labels(variables) # 15.7μs -> 10.9μs (44.1% faster)
    default_labels = ['Bug fix', 'Tests', 'Bug fix with tests', 'Enhancement', 'Documentation', 'Other']
    for label in default_labels:
        pass

def test_empty_custom_labels_fallbacks_to_default(patch_get_settings):
    """If custom_labels is an empty dict, default labels should be assigned."""
    patch_get_settings(enable_custom_labels=True, custom_labels={})
    variables = {}
    set_custom_labels(variables) # 14.1μs -> 10.4μs (36.0% faster)

def test_single_custom_label(patch_get_settings):
    """Test with a single custom label."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "Foo": {"description": "A foo label"}
    })
    variables = {}
    set_custom_labels(variables) # 14.6μs -> 11.4μs (27.7% faster)

def test_multiple_custom_labels(patch_get_settings):
    """Test with multiple custom labels."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "Alpha": {"description": "First label"},
        "Beta": {"description": "Second label"}
    })
    variables = {}
    set_custom_labels(variables) # 14.3μs -> 11.9μs (20.7% faster)

def test_custom_label_with_spaces_and_case(patch_get_settings):
    """Custom label keys with spaces and mixed case should be normalized to snake_case, lower."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "My Label": {"description": "Desc"}
    })
    variables = {}
    set_custom_labels(variables) # 13.0μs -> 11.3μs (15.4% faster)

# --- Edge Test Cases ---

def test_custom_label_with_newlines_in_description(patch_get_settings):
    """Descriptions with newlines should be escaped as '\\n'."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "Test": {"description": "Line1\nLine2"}
    })
    variables = {}
    set_custom_labels(variables) # 13.1μs -> 11.2μs (16.9% faster)

def test_custom_label_with_leading_trailing_newlines(patch_get_settings):
    """Descriptions with leading/trailing newlines should be stripped."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "Test": {"description": "\nDesc\n"}
    })
    variables = {}
    set_custom_labels(variables) # 13.1μs -> 11.0μs (19.1% faster)

def test_custom_label_with_empty_description(patch_get_settings):
    """Description can be empty string."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "Empty": {"description": ""}
    })
    variables = {}
    set_custom_labels(variables) # 12.8μs -> 11.0μs (16.1% faster)

def test_custom_label_with_special_characters(patch_get_settings):
    """Label and description with special characters should be handled."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "Sp@ce!": {"description": "D$sc!@#"}
    })
    variables = {}
    set_custom_labels(variables) # 12.6μs -> 10.5μs (20.0% faster)

def test_custom_label_with_duplicate_minimal_keys(patch_get_settings):
    """If two labels normalize to same minimal key, last one wins in dict."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "Foo Bar": {"description": "First"},
        "foo_bar": {"description": "Second"}
    })
    variables = {}
    set_custom_labels(variables) # 13.8μs -> 11.6μs (19.1% faster)

def test_custom_label_with_non_str_description(patch_get_settings):
    """If description is not a string, should raise or handle gracefully."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "Num": {"description": 123}
    })
    variables = {}
    # Should raise AttributeError when calling .strip('\n')
    with pytest.raises(AttributeError):
        set_custom_labels(variables) # 12.3μs -> 10.7μs (14.7% faster)

def test_custom_label_with_none_description(patch_get_settings):
    """If description is None, should raise or handle gracefully."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "NoneDesc": {"description": None}
    })
    variables = {}
    with pytest.raises(AttributeError):
        set_custom_labels(variables) # 12.1μs -> 10.4μs (16.3% faster)

def test_custom_label_with_empty_key(patch_get_settings):
    """Empty string as label key should be handled."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "": {"description": "Empty key"}
    })
    variables = {}
    set_custom_labels(variables) # 12.4μs -> 10.6μs (17.3% faster)

def test_custom_label_with_only_spaces_key(patch_get_settings):
    """Label key with only spaces should normalize to underscores."""
    patch_get_settings(enable_custom_labels=True, custom_labels={
        "   ": {"description": "Spaces"}
    })
    variables = {}
    set_custom_labels(variables) # 13.3μs -> 10.8μs (22.9% faster)

# --- Large Scale Test Cases ---


def test_large_labels_with_long_descriptions(patch_get_settings):
    """Test with large labels and long descriptions."""
    n = 100
    long_desc = "A" * 500  # 500 chars
    custom_labels = {f"Label{i}": {"description": long_desc} for i in range(n)}
    patch_get_settings(enable_custom_labels=True, custom_labels=custom_labels)
    variables = {}
    set_custom_labels(variables) # 115μs -> 60.3μs (91.9% faster)

def test_large_labels_with_newlines_in_descriptions(patch_get_settings):
    """Test with large number of labels and newlines in descriptions."""
    n = 100
    custom_labels = {f"Label{i}": {"description": f"Line1\nLine2{i}\nEnd"} for i in range(n)}
    patch_get_settings(enable_custom_labels=True, custom_labels=custom_labels)
    variables = {}
    set_custom_labels(variables) # 56.1μs -> 38.8μs (44.7% faster)

def test_large_labels_with_varied_keys(patch_get_settings):
    """Test with large number of labels with varied key formats."""
    n = 200
    custom_labels = {}
    for i in range(n):
        if i % 4 == 0:
            key = f"Label {i}"
        elif i % 4 == 1:
            key = f"LABEL_{i}"
        elif i % 4 == 2:
            key = f"label{i}"
        else:
            key = f"LabEl{i}"
        custom_labels[key] = {"description": f"Desc{i}"}
    patch_get_settings(enable_custom_labels=True, custom_labels=custom_labels)
    variables = {}
    set_custom_labels(variables) # 93.7μs -> 60.8μs (54.0% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, git checkout codeflash/optimize-set_custom_labels-mgzf1xta and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 20, 2025 17:35
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 20, 2025