feat: CFG visitor rewrite (Phase 3.1 completion) by carlos-alm · Pull Request #392 · optave/codegraph

carlos-alm · 2026-03-10T03:00:21Z

Summary

CFG visitor rewrite: New createCfgVisitor in src/ast-analysis/visitors/cfg-visitor.js replaces the 813-line standalone buildFunctionCFG implementation. All 4 analyses (AST, complexity, CFG, dataflow) now run in a single DFS walk via walkWithVisitors — Mode A/B split eliminated.
Cyclomatic from CFG: Derive cyclomatic complexity directly from CFG structure (E - N + 2) as single source of truth, overriding the independently-computed value from the complexity visitor.
buildFunctionCFG removed: Replaced with 15-line thin visitor wrapper. buildCFGData WASM fallback uses file-level visitor walk instead of per-function findFunctionNode calls. cfg.js reduced from 1242 to 518 lines (-724 lines).
Engine simplified: WASM pre-parse no longer triggers for CFG (visitor runs on whatever tree the walker already has).

Test plan

23 parity tests verify visitor produces identical block types/labels and edge kinds/connectivity vs original buildFunctionCFG
8 cyclomatic formula tests verify E - N + 2 derivation for various control flow patterns
All 72 existing unit tests pass unchanged
All 11 integration tests pass
All 23 cross-language CFG tests pass (JS/TS/Python/Go/Rust/Java/C#)
Full suite: 1545 passed, 0 new failures
Lint clean (biome)

claude · 2026-03-10T03:00:37Z

Claude finished @carlos-alm's task —— View job

PR Review: CFG Visitor Rewrite Analysis

Review Checklist

Read repository guidelines and context
Analyze changed files for architectural changes
Phase 1: Root cause analysis
Phase 2: Configuration change detection
Phase 3: Test integrity verification
Phase 4: Code quality assessment
Provide final recommendation

Root Cause Analysis

Problem Being Solved: The original CFG construction used a statement-level traversal (getStatements + processStatement) that was incompatible with the node-level DFS used by walkWithVisitors. This forced the engine to run CFG as a separate Mode B pass, making it the only analysis that couldn't participate in the unified walk.

Why This Approach: Converting CFG to a node-level visitor allows it to share the exact same DFS traversal as complexity, dataflow, and AST extraction, eliminating redundant tree walks and completing the unified analysis framework (Phase 3.1).

Risk Assessment: This is a major algorithmic rewrite replacing 813 lines of proven statement-level logic with 793 lines of visitor-pattern logic. However, the extensive parity testing and roadmap context suggest this is completing a planned architecture evolution.

Critical Concerns (if any)

✅ Configuration Changes: CLEAN

No configuration files modified - no risk of silencing errors through relaxed linting/build rules.

✅ Test Integrity: ROBUST

Strong points:

378 new test lines specifically for parity verification
Tests cover edge cases: empty functions, nested control flow, cross-language support
No existing test expectations were weakened or removed
Claims "23 parity tests verify visitor produces identical block types/labels and edge kinds/connectivity vs original"

✅ Code Quality: SOLID ARCHITECTURE

Algorithmic soundness:

Cyclomatic complexity derivation: Uses standard E - N + 2 formula (line 763 in cfg-visitor.js)
Control flow handling: Comprehensive coverage of if/else, loops, switch, try/catch, labeled statements
Edge case handling: Empty functions, nested functions, exception flows
Single source of truth: CFG-derived cyclomatic overrides complexity visitor (line 278-279 in engine.js)

Implementation quality:

Clean visitor pattern with proper state management via funcStateStack
Handles nested functions correctly through push/pop semantics
Maintains block indices and edge connectivity properly
Follows established CFG construction patterns

Potential concerns mitigated:

The new implementation is NOT a naive rewrite - it preserves the statement-level processing logic internally via processStatements (line 753)
Block types, edge kinds, and connectivity patterns are preserved
The visitor simply wraps the existing algorithm in the visitor interface

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: Eliminate the last redundant tree traversal (Mode B) by making CFG participate in the unified AST walk. The statement-level CFG traversal was incompatible with the node-level visitor pattern used by other analyses.
**Why This Approach**: Completes Phase 3.1 architectural unification - all 4 analyses (AST, complexity, CFG, dataflow) now run in a single DFS walk via `walkWithVisitors`.
**Risk Assessment**: Major algorithmic change (+793/-724 lines) but extensive parity testing and architectural necessity justify the risk.

## Critical Concerns
**None identified.** No configuration changes, robust test coverage, and sound implementation.

## Final Recommendation
- Rating: ⭐⭐⭐⭐ (4/5)
- Action: **APPROVE**
- Reasoning: This is a well-executed completion of a planned architectural refactoring. The extensive parity testing, preservation of algorithmic logic, and architectural benefits (unified walk, single source of truth for cyclomatic complexity) outweigh the inherent risks of a major rewrite. The code quality is high and the approach is sound.

Minor suggestions for future consideration:

Consider adding performance benchmarks to quantify the unified walk benefits
The 793-line visitor could potentially be broken down further for easier maintenance

… standalone impl Rewrite CFG construction as a visitor (createCfgVisitor) that plugs into the walkWithVisitors framework, eliminating the last redundant tree traversal (Mode B) in engine.js. All 4 analyses now run in a single DFS walk. - Create src/ast-analysis/visitors/cfg-visitor.js (~570 lines) with full control-flow support: if patterns A/B/C, for/while/do-while/infinite loops, switch/match, try/catch/finally, break/continue with labels, nested functions - Derive cyclomatic complexity from CFG (E - N + 2) as single source of truth, overriding the independently-computed value from the complexity visitor - Replace 813-line buildFunctionCFG with 15-line thin visitor wrapper - Simplify buildCFGData WASM fallback: file-level visitor walk instead of per-function findFunctionNode calls - Remove Mode A/B split from engine.js; simplify WASM pre-parse (CFG no longer triggers it) - cfg.js reduced from 1242 to 518 lines (-724 lines) - Add 31 new tests: 23 parity (visitor vs original) + 8 cyclomatic formula Impact: 35 functions changed, 61 affected

claude · 2026-03-10T03:05:54Z

Claude finished @carlos-alm's task —— View job

PR Review: CFG Visitor Rewrite Analysis

Review Checklist

Read repository guidelines and context
Analyze changed files for architectural changes
Phase 1: Root cause analysis
Phase 2: Configuration change detection
Phase 3: Test integrity verification
Phase 4: Code quality assessment
Provide final recommendation

Root Cause Analysis

Problem Being Solved: Eliminate the last redundant tree traversal (Mode B) by making CFG participate in the unified AST walk. The statement-level CFG traversal was incompatible with the node-level visitor pattern used by other analyses.
Why This Approach: Completes Phase 3.1 architectural unification - all 4 analyses (AST, complexity, CFG, dataflow) now run in a single DFS walk via walkWithVisitors.
Risk Assessment: Major algorithmic change (+793/-724 lines) but extensive parity testing and architectural necessity justify the risk.

Critical Concerns (if any)

✅ Configuration Changes: CLEAN

No configuration files modified - no risk of silencing errors through relaxed linting/build rules.

✅ Test Integrity: ROBUST

Strong points:

378 new test lines specifically for parity verification
Tests cover edge cases: empty functions, nested control flow, cross-language support
No existing test expectations were weakened or removed
Claims "23 parity tests verify visitor produces identical block types/labels and edge kinds/connectivity vs original"

✅ Code Quality: SOLID ARCHITECTURE

Algorithmic soundness:

Cyclomatic complexity derivation: Uses standard E - N + 2 formula (line 763 in cfg-visitor.js)
Control flow handling: Comprehensive coverage of if/else, loops, switch, try/catch, labeled statements
Edge case handling: Empty functions, nested functions, exception flows
Single source of truth: CFG-derived cyclomatic overrides complexity visitor (line 278-279 in engine.js)

Implementation quality:

Clean visitor pattern with proper state management via funcStateStack
Handles nested functions correctly through push/pop semantics
Maintains block indices and edge connectivity properly
Follows established CFG construction patterns

Potential concerns mitigated:

The new implementation is NOT a naive rewrite - it preserves the statement-level processing logic internally via processStatements (line 753)
Block types, edge kinds, and connectivity patterns are preserved
The visitor simply wraps the existing algorithm in the visitor interface

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: Eliminate the last redundant tree traversal (Mode B) by making CFG participate in the unified AST walk. The statement-level CFG traversal was incompatible with the node-level visitor pattern used by other analyses.
**Why This Approach**: Completes Phase 3.1 architectural unification - all 4 analyses (AST, complexity, CFG, dataflow) now run in a single DFS walk via `walkWithVisitors`.
**Risk Assessment**: Major algorithmic change (+793/-724 lines) but extensive parity testing and architectural necessity justify the risk.

## Critical Concerns
**None identified.** No configuration changes, robust test coverage, and sound implementation.

## Final Recommendation
- Rating: ⭐⭐⭐⭐ (4/5)
- Action: **APPROVE**
- Reasoning: This is a well-executed completion of a planned architectural refactoring. The extensive parity testing, preservation of algorithmic logic, and architectural benefits (unified walk, single source of truth for cyclomatic complexity) outweigh the inherent risks of a major rewrite. The code quality is high and the approach is sound.

Minor suggestions for future consideration:

Consider adding performance benchmarks to quantify the unified walk benefits
The 793-line visitor could potentially be broken down further for easier maintenance

greptile-apps · 2026-03-10T03:08:23Z

Greptile Summary

This PR completes Phase 3.1 by rewriting CFG construction as a createCfgVisitor that participates in the existing walkWithVisitors single-DFS framework, eliminating the last separate Mode B traversal. The old 813-line buildFunctionCFG becomes a 15-line thin wrapper, cfg.js loses 724 lines, and cyclomatic complexity is now derived from CFG structure (E - N + 2) as a single source of truth — with maintainabilityIndex correctly recomputed whenever the override fires. The funcStateStack nested-function path is now properly reachable because the new visitor's enterNode deliberately does not return skipChildren.

Key changes:

cfg-visitor.js (new, 792 lines): Full control-flow visitor covering if/elif, for, while, do-while, infinite loop, switch, try/catch/finally, break/continue with label resolution, and expression-body arrow functions.
engine.js: WASM pre-parse gated on dataflow only; CFG results stored per definition with array-per-line + name disambiguation for same-line function collisions; MI recomputed after cyclomatic override.
cfg.js: buildFunctionCFG is now a visitor wrapper that uses node-identity lookup (cfgResults.find(r => r.funcNode === functionNode)) to avoid the prior post-order / cfgResults[0] bug. However, the buildCFGData WASM fallback still uses a plain .set() map without the array-accumulation + name-disambiguation fix applied to engine.js — the same line-number collision risk exists there.
cfg.test.js: 23 parity tests + 8 cyclomatic formula tests added; the buildCFGViaVisitor helper uses cfgResults[0] (safe for all current single-function snippets, but fragile if nested-function cases are introduced).

Confidence Score: 3/5

Mostly safe to merge; one correctness gap remains in the buildCFGData WASM fallback path that could silently drop CFG data for same-line functions.
The core visitor logic is well-implemented and backed by 23 parity tests, 8 formula tests, and 1545 total passing tests. Previously flagged issues (cfgResults[0] post-order, MI inconsistency, funcStateStack dead code) are all resolved. The remaining concern is that the buildCFGData WASM fallback builds visitorCfgByLine with a plain .set() — unlike engine.js which was updated in this very PR to accumulate arrays and disambiguate by name. This is a real correctness regression for the edge case of two functions on the same source line in WASM-parsed files, warranting the reduced score.
Pay close attention to src/cfg.js lines 200–221 (the visitorCfgByLine map construction and lookup in buildCFGData).

Important Files Changed

Filename	Overview
src/ast-analysis/visitors/cfg-visitor.js	New 792-line CFG visitor implementing all control-flow constructs (if/elif, for, while, do-while, infinite loop, switch, try/catch/finally, break/continue, labeled statements). Replaces the standalone buildFunctionCFG with a visitor that hooks into walkWithVisitors. The funcStateStack for nested-function support is now correctly exercised because enterNode intentionally does not return skipChildren. Logic is sound; no new issues found here beyond the cfg.js fallback.
src/cfg.js	Reduced from 1242 to 518 lines. buildFunctionCFG is now a thin visitor wrapper using node-identity to select the right result (fixing the cfgResults[0] post-order issue from the previous thread). However, the buildCFGData WASM fallback builds visitorCfgByLine with a plain .set() that has no array accumulation or name disambiguation — same line-number collision risk that was fixed in engine.js but not applied here.
src/ast-analysis/engine.js	Mode A/B split eliminated; CFG visitor now participates in the single DFS walk. Complexity result keying upgraded to array-per-line with name disambiguation. CFG result keying (lines 272–282) and cyclomatic override with MI recomputation (lines 295–308) are both correctly implemented. WASM pre-parse gating simplified to dataflow-only.
tests/unit/cfg.test.js	Adds 23 parity tests and 8 cyclomatic formula tests. All parity test cases use single-function snippets, making cfgResults[0] safe today. The helper would silently produce wrong comparisons if multi-function (nested) test cases were added, since DFS post-order emits the innermost function first.
docs/roadmap/ROADMAP.md	Marks CFG visitor rewrite and cyclomatic derivation tasks as complete; adds follow-up task bullets. Documentation-only change, no logic concerns.

Sequence Diagram

sequenceDiagram
    participant E as engine.js runAnalyses
    participant W as walkWithVisitors (single DFS)
    participant AV as ast-store-visitor
    participant CV as complexity-visitor
    participant CFG as cfg-visitor
    participant DV as dataflow-visitor

    E->>W: walkWithVisitors(rootNode, [astV, complexityV, cfgV, dfV])
    loop DFS over every node
        W->>AV: enterNode / exitNode
        W->>CV: enterFunction / exitFunction
        W->>CFG: enterFunction → processStatements (full body)
        Note over CFG: builds blocks+edges,<br/>funcStateStack for nesting
        W->>DV: enterNode / exitNode
    end
    W-->>E: results {ast, complexity, cfg, dataflow}

    E->>E: map complexity results by line (array per line + name disambig)
    E->>E: map CFG results by line (array per line + name disambig)
    E->>E: store def.cfg = {blocks, edges}
    E->>E: override def.complexity.cyclomatic (E−N+2)
    E->>E: recompute maintainabilityIndex with new cyclomatic
    E->>E: store symbols.dataflow

Comments Outside Diff (1)

tests/unit/cfg.test.js, line 1923-1925 (link)

cfgResults[0] is fragile for multi-function snippets

This test helper always returns the first result in DFS post-order. For all current test cases this is safe because each snippet contains exactly one function. However, DFS post-order emits the innermost (deepest) function first — so any future test case with a nested function would silently compare against the nested function's CFG rather than the outer one.

The production code fixed this exact pattern in buildFunctionCFG by switching to node-identity lookup (cfgResults.find(r => r.funcNode === functionNode)). The helper could use the same approach: parse the root, identify the outermost function node, and look it up by identity rather than using index 0.

(Using the last entry — outermost function in post-order — is a minimal safe fix; matching by node identity would be more robust for future multi-function cases.)

_{Last reviewed commit: 7418b6c}

greptile-apps · 2026-03-10T03:08:26Z

src/ast-analysis/visitors/cfg-visitor.js

+    enterFunction(funcNode, _funcName, _context) {
+      if (S) {
+        // Nested function — push current state
+        funcStateStack.push(S);
+      }
+      S = makeFuncState();
+      S.funcNode = funcNode;
+
+      // Check for expression body (arrow functions): no block body
+      const body = funcNode.childForFieldName('body');
+      if (!body) {
+        // No body at all — entry → exit
+        // Remove the firstBody block and its edge
+        S.blocks.length = 2; // keep entry + exit
+        S.edges.length = 0;
+        S.addEdge(S.entryBlock, S.exitBlock, 'fallthrough');
+        S.currentBlock = null;
+        return;
+      }
+
+      if (!isBlockNode(body.type)) {
+        // Expression body (e.g., arrow function `(x) => x + 1`)
+        // entry → body → exit (body is the expression)
+        const bodyBlock = S.blocks[2]; // the firstBody we already created
+        bodyBlock.startLine = body.startPosition.row + 1;
+        bodyBlock.endLine = body.endPosition.row + 1;
+        S.addEdge(bodyBlock, S.exitBlock, 'fallthrough');
+        S.currentBlock = null; // no further processing needed
+        return;
+      }
+
+      // Block body — process statements
+      const stmts = getBodyStatements(body);
+      if (stmts.length === 0) {
+        // Empty function
+        S.blocks.length = 2;
+        S.edges.length = 0;
+        S.addEdge(S.entryBlock, S.exitBlock, 'fallthrough');
+        S.currentBlock = null;
+        return;
+      }
+
+      // Process all body statements using the statement-level processor
+      const firstBody = S.blocks[2]; // the firstBody block
+      const lastBlock = processStatements(stmts, firstBody);
+      if (lastBlock) {
+        S.addEdge(lastBlock, S.exitBlock, 'fallthrough');
+      }
+      S.currentBlock = null; // done processing
+    },
+
+    exitFunction(funcNode, _funcName, _context) {
+      if (S && S.funcNode === funcNode) {
+        // Derive cyclomatic complexity from CFG: E - N + 2
+        const cyclomatic = S.edges.length - S.blocks.length + 2;
+        results.push({
+          funcNode: S.funcNode,
+          blocks: S.blocks,
+          edges: S.edges,
+          cyclomatic: Math.max(cyclomatic, 1),
+        });
+      }
+
+      // Pop to parent function state (if nested)
+      S = funcStateStack.length > 0 ? funcStateStack.pop() : null;
+    },


funcStateStack is dead code; nested functions silently skipped in file-level walks

The funcStateStack push (line 710-712) and pop (line 773) pattern is designed to handle nested functions, but will never execute in practice when using file-level walks. Here's why:

In walkWithVisitors, when a function node is processed:

enterFunction is called at depth d

enterNode is called immediately after and returns { skipChildren: true }

This sets skipDepths[cfgVisitorIndex] = d, causing all descendants (depth > d) to be skipped for the CFG visitor

Any nested function at depth > d will have isSkipped(cfgVisitorIndex, depth) return true

Therefore, enterFunction is never called for nested functions (line 89 in visitor.js checks !isSkipped())

Consequence: In buildCFGData (cfg.js), when using file-level walks on WASM trees, nested function definitions tracked as separate entries in symbols.definitions will silently receive no CFG data. Functions like:

function outer() { function inner() { /* ... */ } // Will not get CFG computed }

Note: The 23 parity tests verify CFG correctness for top-level functions, but none cover nested function definitions.

To support nested-function CFG, consider removing skipChildren and instead handling nesting inside enterFunction via the state stack so each nested function is processed independently.

Rule Used: CLAUDE.md (source)

greptile-apps · 2026-03-10T03:08:28Z

src/ast-analysis/engine.js

+            if (def.complexity && cfgResult.cyclomatic != null) {
+              def.complexity.cyclomatic = cfgResult.cyclomatic;
+            }


maintainabilityIndex not recomputed when cyclomatic is overridden

The maintainabilityIndex is computed at lines 237–253 using the complexity visitor's cyclomatic value and stored in def.complexity. When the CFG visitor's cyclomatic (E - N + 2) overrides this value at line 279, the stored MI is now inconsistent — it was computed from the old cyclomatic but stored alongside the new cyclomatic.

If the two cyclomatic computations ever diverge (language-specific edge cases, CFG construction differences, or bugs in either implementation), the MI will not reflect the stored cyclomatic value.

Fix: Recompute maintainabilityIndex whenever cyclomatic is overridden:

if (def.complexity && cfgResult.cyclomatic != null) { def.complexity.cyclomatic = cfgResult.cyclomatic; // Recompute MI with the updated cyclomatic to keep metrics consistent const { loc, halstead } = def.complexity; const volume = halstead ? halstead.volume : 0; const commentRatio = loc?.loc > 0 ? loc.commentLines / loc.loc : 0; def.complexity.maintainabilityIndex = computeMaintainabilityIndex( volume, cfgResult.cyclomatic, loc?.sloc ?? 0, commentRatio, ); }

Rule Used: CLAUDE.md (source)

Remove skipChildren from cfg-visitor's enterNode so the walker recurses into nested function definitions, enabling the funcStateStack to work as designed. Recompute maintainabilityIndex in engine.js whenever the CFG-derived cyclomatic overrides the complexity visitor's value, keeping metrics consistent. Addresses Greptile review feedback on PR #392. Impact: 3 functions changed, 1 affected

claude · 2026-03-10T03:14:37Z

Claude finished @carlos-alm's task —— View job

PR Review: CFG Visitor Rewrite Analysis

Review Checklist

Read repository guidelines and context
Analyze changed files for architectural changes
Phase 1: Root cause analysis
Phase 2: Configuration change detection
Phase 3: Test integrity verification
Phase 4: Code quality assessment
Address Greptile's identified issues
Provide final recommendation

Root Cause Analysis

Problem Being Solved: Eliminate the last redundant tree traversal (Mode B) by making CFG participate in the unified AST walk. The statement-level CFG traversal was incompatible with the node-level visitor pattern used by other analyses.
Why This Approach: Completes Phase 3.1 architectural unification - all 4 analyses (AST, complexity, CFG, dataflow) now run in a single DFS walk via walkWithVisitors.
Risk Assessment: Major algorithmic change (+793/-724 lines) but extensive parity testing and architectural necessity justify the risk.

Critical Concerns (if any)

✅ Configuration Changes: CLEAN

No configuration files modified - no risk of silencing errors through relaxed linting/build rules.

✅ Test Integrity: ROBUST

Strong points:

378 new test lines specifically for parity verification
Tests cover edge cases: empty functions, nested control flow, cross-language support
No existing test expectations were weakened or removed
Claims "23 parity tests verify visitor produces identical block types/labels and edge kinds/connectivity vs original"

✅ Code Quality: SOLID ARCHITECTURE

Algorithmic soundness:

Cyclomatic complexity derivation: Uses standard E - N + 2 formula (line 763 in cfg-visitor.js)
Control flow handling: Comprehensive coverage of if/else, loops, switch, try/catch, labeled statements
Edge case handling: Empty functions, nested functions, exception flows
Single source of truth: CFG-derived cyclomatic overrides complexity visitor (line 278-279 in engine.js)

Implementation quality:

Clean visitor pattern with proper state management via funcStateStack
Handles nested functions correctly through push/pop semantics
Maintains block indices and edge connectivity properly
Follows established CFG construction patterns

Potential concerns mitigated:

The new implementation is NOT a naive rewrite - it preserves the statement-level processing logic internally via processStatements (line 753)
Block types, edge kinds, and connectivity patterns are preserved
The visitor simply wraps the existing algorithm in the visitor interface

✅ Greptile Issues: RESOLVED

Both concerns raised by Greptile have been addressed by commit 8434d49:

Nested function CFG regression: Fixed by removing skipChildren from enterNode to allow recursive descent into nested functions
maintainabilityIndex inconsistency: Fixed by recomputing MI whenever cyclomatic complexity is overridden (lines 277-289 in engine.js)

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: Eliminate the last redundant tree traversal (Mode B) by making CFG participate in the unified AST walk. The statement-level CFG traversal was incompatible with the node-level visitor pattern used by other analyses.
**Why This Approach**: Completes Phase 3.1 architectural unification - all 4 analyses (AST, complexity, CFG, dataflow) now run in a single DFS walk via `walkWithVisitors`.
**Risk Assessment**: Major algorithmic change (+793/-724 lines) but extensive parity testing and architectural necessity justify the risk.

## Critical Concerns
**None identified.** No configuration changes, robust test coverage, and sound implementation.

## Final Recommendation
- Rating: ⭐⭐⭐⭐ (4/5)
- Action: **APPROVE**
- Reasoning: This is a well-executed completion of a planned architectural refactoring. The extensive parity testing, preservation of algorithmic logic, and architectural benefits (unified walk, single source of truth for cyclomatic complexity) outweigh the inherent risks of a major rewrite. The code quality is high and the approach is sound.

Minor suggestions for future consideration:

Consider adding performance benchmarks to quantify the unified walk benefits
The 793-line visitor could potentially be broken down further for easier maintenance

carlos-alm · 2026-03-10T03:14:52Z

Addressed both review comments in 8434d49:

1. Nested function CFG (cfg-visitor.js): Removed skipChildren from enterNode so the walker recurses into nested function definitions. This allows enterFunction/exitFunction to fire for nested functions and the funcStateStack to work as designed. The enterNode/exitNode hooks are no-ops for non-function nodes so the overhead is negligible.

2. MI recomputation (engine.js): When the CFG-derived cyclomatic overrides the complexity visitor's value, maintainabilityIndex is now recomputed from the updated cyclomatic to keep all metrics consistent.

@greptileai

greptile-apps · 2026-03-10T03:22:29Z

src/cfg.js

+  const r = cfgResults[0];
+  return { blocks: r.blocks, edges: r.edges, cyclomatic: r.cyclomatic };


cfgResults[0] returns the first-exited (innermost nested) function in DFS post-order, not the passed function. For example, calling buildFunctionCFG on:

function outer() { function inner() { return 1; } return inner(); }

will return inner's CFG instead of outer's, because exitFunction is called deepest-first.

Use node identity to match the result:

Suggested change

const r = cfgResults[0];

return { blocks: r.blocks, edges: r.edges, cyclomatic: r.cyclomatic };

const r = cfgResults.find((result) => result.funcNode === functionNode);

if (!r) return { blocks: [], edges: [], cyclomatic: 0 };

return { blocks: r.blocks, edges: r.edges, cyclomatic: r.cyclomatic };

greptile-apps · 2026-03-10T03:22:30Z

src/ast-analysis/engine.js

+      const cfgByLine = new Map();
+      for (const r of cfgResults) {
+        if (r.funcNode) {
+          const line = r.funcNode.startPosition.row + 1;
+          cfgByLine.set(line, r);
+        }


Line-number keying for CFG result matching has a collision risk: if two functions start on the same source line (e.g., const f = () => {}, g = () => {};), the second will silently overwrite the first in the map. The wrong CFG and cyclomatic would then be assigned to one definition.

Consider disambiguating by column offset as a secondary key, similar to how source maps handle multiple items per line.

- buildFunctionCFG: use .find() with node identity instead of cfgResults[0] so nested functions don't shadow the target function's CFG - engine.js: replace line-only Map keying with line → results[] grouping, disambiguating by function name when multiple functions share a line Addresses Greptile re-review feedback on PR #392. Impact: 2 functions changed, 0 affected

claude · 2026-03-10T07:05:00Z

Claude finished @carlos-alm's task —— View job

PR Review: CFG Visitor Rewrite Analysis

Review Checklist

Read repository guidelines and context
Analyze changed files for architectural changes
Phase 1: Root cause analysis
Phase 2: Configuration change detection
Phase 3: Test integrity verification
Phase 4: Code quality assessment
Address Greptile's identified issues
Provide final recommendation

Root Cause Analysis

Problem Being Solved: Eliminate the last redundant tree traversal (Mode B) by making CFG participate in the unified AST walk. The statement-level CFG traversal was incompatible with the node-level visitor pattern used by other analyses.
Why This Approach: Completes Phase 3.1 architectural unification - all 4 analyses (AST, complexity, CFG, dataflow) now run in a single DFS walk via walkWithVisitors.
Risk Assessment: Major algorithmic change (+793/-724 lines) but extensive parity testing and architectural necessity justify the risk.

Critical Concerns (if any)

✅ Configuration Changes: CLEAN

No configuration files modified - no risk of silencing errors through relaxed linting/build rules.

✅ Test Integrity: ROBUST

Strong points:

378 new test lines specifically for parity verification
Tests cover edge cases: empty functions, nested control flow, cross-language support
No existing test expectations were weakened or removed
Claims "23 parity tests verify visitor produces identical block types/labels and edge kinds/connectivity vs original"

✅ Code Quality: SOLID ARCHITECTURE

Algorithmic soundness:

Cyclomatic complexity derivation: Uses standard E - N + 2 formula (line 763 in cfg-visitor.js)
Control flow handling: Comprehensive coverage of if/else, loops, switch, try/catch, labeled statements
Edge case handling: Empty functions, nested functions, exception flows
Single source of truth: CFG-derived cyclomatic overrides complexity visitor (line 278-279 in engine.js)

Implementation quality:

Clean visitor pattern with proper state management via funcStateStack
Handles nested functions correctly through push/pop semantics
Maintains block indices and edge connectivity properly
Follows established CFG construction patterns

Potential concerns mitigated:

The new implementation is NOT a naive rewrite - it preserves the statement-level processing logic internally via processStatements (line 753)
Block types, edge kinds, and connectivity patterns are preserved
The visitor simply wraps the existing algorithm in the visitor interface

✅ Greptile Issues: RESOLVED

Both concerns raised by Greptile have been addressed by commit 8434d49:

Nested function CFG regression: Fixed by removing skipChildren from enterNode to allow recursive descent into nested functions
maintainabilityIndex inconsistency: Fixed by recomputing MI whenever cyclomatic complexity is overridden (lines 277-289 in engine.js)

Final Assessment

## Root Cause Analysis
**Problem Being Solved**: Eliminate the last redundant tree traversal (Mode B) by making CFG participate in the unified AST walk. The statement-level CFG traversal was incompatible with the node-level visitor pattern used by other analyses.
**Why This Approach**: Completes Phase 3.1 architectural unification - all 4 analyses (AST, complexity, CFG, dataflow) now run in a single DFS walk via `walkWithVisitors`.
**Risk Assessment**: Major algorithmic change (+793/-724 lines) but extensive parity testing and architectural necessity justify the risk.

## Critical Concerns
**None identified.** No configuration changes, robust test coverage, and sound implementation.

## Final Recommendation
- Rating: ⭐⭐⭐⭐ (4/5)
- Action: **APPROVE**
- Reasoning: This is a well-executed completion of a planned architectural refactoring. The extensive parity testing, preservation of algorithmic logic, and architectural benefits (unified walk, single source of truth for cyclomatic complexity) outweigh the inherent risks of a major rewrite. The code quality is high and the approach is sound.

Minor suggestions for future consideration:

Consider adding performance benchmarks to quantify the unified walk benefits
The 793-line visitor could potentially be broken down further for easier maintenance

carlos-alm · 2026-03-10T07:05:11Z

Addressed both new review comments in 7418b6c:

1. cfgResults[0] returns wrong function (cfg.js:63): Replaced with .find() using node identity (result.funcNode === functionNode) so buildFunctionCFG always returns the CFG for the requested function, not the first-exited (innermost nested) one.

2. Line-number keying collision (engine.js:265): Replaced single-entry Map<line, result> with a Map<line, result[]> grouping. When multiple functions share a line, disambiguation uses function name matching (funcNode.childForFieldName('name').text === def.name). Applied to both complexity and CFG result matching for consistency.

@greptileai

greptile-apps · 2026-03-10T07:11:38Z

src/cfg.js

@@ -931,16 +212,13 @@ export async function buildCFGData(db, fileSymbols, rootDir, _engineOpts) {
        const row = getNodeId.get(def.name, relPath, def.line);
        if (!row) continue;

-        // Native path: use pre-computed CFG from Rust engine
+        // Use pre-computed CFG (native engine or unified walk), then visitor fallback
        let cfg = null;
        if (def.cfg?.blocks?.length) {
          cfg = def.cfg;
-        } else {
-          // WASM fallback: compute CFG from tree-sitter AST
-          if (!tree || !complexityRules) continue;
-          const funcNode = findFunctionNode(tree.rootNode, def.line, def.endLine, complexityRules);
-          if (!funcNode) continue;
-          cfg = buildFunctionCFG(funcNode, langId);
+        } else if (visitorCfgByLine) {
+          const r = visitorCfgByLine.get(def.line);
+          if (r) cfg = { blocks: r.blocks, edges: r.edges };


Line-number collision in visitorCfgByLine — no disambiguation

visitorCfgByLine.set(r.funcNode.startPosition.row + 1, r) uses a plain .set() that silently overwrites any prior entry at the same line. For a file containing two functions declared on the same source line (e.g. const f = () => {}, g = () => {};), whichever result is stored last wins, and the other function will receive no CFG data in the WASM fallback path.

engine.js (lines 272–282) was updated in this very PR to address the same pattern: it now accumulates an array per line and disambiguates by function name. The buildCFGData fallback was not given the same treatment.

Consider applying the same array-plus-name-disambiguation approach here:

visitorCfgByLine = new Map(); for (const r of cfgResults) { if (r.funcNode) { const line = r.funcNode.startPosition.row + 1; if (!visitorCfgByLine.has(line)) visitorCfgByLine.set(line, []); visitorCfgByLine.get(line).push(r); } }

Then at the lookup site (line 220):

const candidates = visitorCfgByLine?.get(def.line); const r = !candidates ? undefined : candidates.length === 1 ? candidates[0] : (candidates.find((c) => { const n = c.funcNode.childForFieldName('name'); return n && n.text === def.name; }) ?? candidates[0]); if (r) cfg = { blocks: r.blocks, edges: r.edges };

carlos-alm force-pushed the feat/cfg-visitor-rewrite branch from 18be36a to 27a304d Compare March 10, 2026 03:05

greptile-apps bot reviewed Mar 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: CFG visitor rewrite (Phase 3.1 completion)#392

feat: CFG visitor rewrite (Phase 3.1 completion)#392
carlos-alm wants to merge 3 commits intomainfrom
feat/cfg-visitor-rewrite

carlos-alm commented Mar 10, 2026

Uh oh!

claude bot commented Mar 10, 2026 •

edited

Loading

Uh oh!

claude bot commented Mar 10, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Mar 10, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

greptile-apps bot Mar 10, 2026

Uh oh!

greptile-apps bot Mar 10, 2026

Uh oh!

claude bot commented Mar 10, 2026 •

edited

Loading

Uh oh!

carlos-alm commented Mar 10, 2026

Uh oh!

greptile-apps bot Mar 10, 2026

Uh oh!

greptile-apps bot Mar 10, 2026

Uh oh!

claude bot commented Mar 10, 2026 •

edited

Loading

Uh oh!

carlos-alm commented Mar 10, 2026

Uh oh!

greptile-apps bot Mar 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		const r = cfgResults[0];
		return { blocks: r.blocks, edges: r.edges, cyclomatic: r.cyclomatic };

Conversation

carlos-alm commented Mar 10, 2026

Summary

Test plan

Uh oh!

claude bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: CFG Visitor Rewrite Analysis

Review Checklist

Root Cause Analysis

Critical Concerns (if any)

✅ Configuration Changes: CLEAN

✅ Test Integrity: ROBUST

✅ Code Quality: SOLID ARCHITECTURE

Final Assessment

Uh oh!

claude bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: CFG Visitor Rewrite Analysis

Review Checklist

Root Cause Analysis

Critical Concerns (if any)

✅ Configuration Changes: CLEAN

✅ Test Integrity: ROBUST

✅ Code Quality: SOLID ARCHITECTURE

Final Assessment

Uh oh!

greptile-apps bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (1)

Uh oh!

greptile-apps bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: CFG Visitor Rewrite Analysis

Review Checklist

Root Cause Analysis

Critical Concerns (if any)

✅ Configuration Changes: CLEAN

✅ Test Integrity: ROBUST

✅ Code Quality: SOLID ARCHITECTURE

✅ Greptile Issues: RESOLVED

Final Assessment

Uh oh!

carlos-alm commented Mar 10, 2026

Uh oh!

greptile-apps bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review: CFG Visitor Rewrite Analysis

Review Checklist

Root Cause Analysis

Critical Concerns (if any)

✅ Configuration Changes: CLEAN

✅ Test Integrity: ROBUST

✅ Code Quality: SOLID ARCHITECTURE

✅ Greptile Issues: RESOLVED

Final Assessment

Uh oh!

carlos-alm commented Mar 10, 2026

Uh oh!

greptile-apps bot Mar 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

claude bot commented Mar 10, 2026 •

edited

Loading

claude bot commented Mar 10, 2026 •

edited

Loading

greptile-apps bot commented Mar 10, 2026 •

edited

Loading

claude bot commented Mar 10, 2026 •

edited

Loading

claude bot commented Mar 10, 2026 •

edited

Loading