Merged
2 changes: 2 additions & 0 deletions docs/roadmap/BACKLOG.md
@@ -144,6 +144,8 @@ These address fundamental limitations in the parsing and resolution pipeline tha
| 71 | Basic type inference for typed languages | Extract type annotations from TypeScript and Java AST nodes (variable declarations, function parameters, return types, generics) to resolve method calls through typed references. Currently `const x: Router = express.Router(); x.get(...)` produces no edge because `x.get` can't be resolved without knowing `x` is a `Router`. Tree-sitter already parses type annotations — we just don't use them for resolution. Start with declared types (no flow inference), which covers the majority of TS/Java code. | Resolution | Dramatically improves call graph completeness for TypeScript and Java — the two languages where developers annotate types explicitly and expect tooling to use them. Directly prevents hallucinated "no callers" results for methods called through typed variables | ✓ | ✓ | 5 | No | — |
| 72 | Interprocedural dataflow analysis | Extend the existing intraprocedural dataflow (ID 14) to propagate `flows_to`/`returns`/`mutates` edges across function boundaries. When function A calls B with argument X, and B's dataflow shows X flows to its return value, connect A's call site to the downstream consumers of B's return. Requires stitching per-function dataflow summaries at call edges — no new parsing, just graph traversal over existing `dataflow` + `edges` tables. Start with single-level propagation (caller↔callee), not transitive closure. | Analysis | Current dataflow stops at function boundaries, missing the most important flows — data passing through helper functions, middleware chains, and factory patterns. Single-function scope means `dataflow` can't answer "where does this user input end up?" across call boundaries. Cross-function propagation is the difference between toy dataflow and useful taint-like analysis | ✓ | ✓ | 5 | No | 14 |
| 73 | Improved dynamic call resolution | Upgrade the current "best-effort" dynamic dispatch resolution for Python, Ruby, and JavaScript. Three concrete improvements: **(a)** receiver-type tracking — when `x = SomeClass()` is followed by `x.method()`, resolve `method` to `SomeClass.method` using the assignment chain (leverages existing `ast_nodes` + `dataflow` tables); **(b)** common pattern recognition — resolve `EventEmitter.on('event', handler)` callback registration, `Promise.then/catch` chains, `Array.map/filter/reduce` with named function arguments, and decorator/annotation patterns; **(c)** confidence-tiered edges — mark dynamically-resolved edges with a confidence score (high for direct assignment, medium for pattern match, low for heuristic) so consumers can filter by reliability. | Resolution | In Python/Ruby/JS, 30-60% of real calls go through dynamic dispatch — method calls on variables, callbacks, event handlers, higher-order functions. The current best-effort resolution misses most of these, leaving massive gaps in the call graph for the languages where codegraph is most commonly used. Even partial improvement here has outsized impact on graph completeness | ✓ | ✓ | 5 | No | — |
| 81 | Track dynamic `import()` and re-exports as graph edges | Extract `import()` expressions as `dynamic-imports` edges in both WASM extraction paths (query-based and walk-based). Destructured names (`const { a } = await import(...)`) feed into `importedNames` for call resolution. **Partially done:** WASM JS/TS extraction works (PR #389). Remaining: **(a)** native Rust engine support — `crates/codegraph-core/src/extractors/javascript.rs` doesn't extract `import()` calls; **(b)** non-static paths (`import(\`./plugins/${name}.js\`)`, `import(variable)`) are skipped with a debug warning; **(c)** re-export consumer counting in `exports --unused` only checks `calls` edges, not `imports`/`dynamic-imports` — symbols consumed only via import edges show as zero-consumer false positives. | Resolution | Fixes false "zero consumers" reports for symbols consumed via dynamic imports. 95 `dynamic-imports` edges found in codegraph's own codebase — these were previously invisible to impact analysis, exports audit, and dead-export hooks | ✓ | ✓ | 5 | No | — |
| 82 | Extract names from `import().then()` callback patterns | `extractDynamicImportNames` only extracts destructured names from `const { a } = await import(...)` (walks up to `variable_declarator`). The `.then()` pattern — `import('./foo.js').then(({ a, b }) => ...)` — produces an edge with empty names because the destructured parameters live in the `.then()` callback, not a `variable_declarator`. Detect when an `import()` call's parent is a `member_expression` with `.then`, find the arrow/function callback in `.then()`'s arguments, and extract parameter names from its destructuring pattern. | Resolution | `.then()`-style dynamic imports are common in older codebases and lazy-loading patterns (React.lazy, Webpack code splitting). Without name extraction, these produce file-level edges only — no symbol-level `calls` edges, so the imported symbols still appear as zero-consumer false positives | ✓ | ✓ | 4 | No | 81 |
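
Item 81(c) describes a counting gap: `exports --unused` looks only at `calls` edges. The sketch below shows what a broader count could look like. The edge shape (`{ source, target, kind }`), the symbol-style target ids, and the name `countConsumers` are assumptions made for illustration; they are not taken from codegraph's schema or code.

```javascript
// Hypothetical edge shape: { source, target, kind }. Not codegraph's real schema.
const CONSUMER_KINDS = new Set(['calls', 'imports', 'dynamic-imports']);

// Count distinct sources that reach `targetId` through any edge kind in `kinds`.
function countConsumers(edges, targetId, kinds = CONSUMER_KINDS) {
  const consumers = new Set();
  for (const edge of edges) {
    if (edge.target === targetId && kinds.has(edge.kind)) {
      consumers.add(edge.source);
    }
  }
  return consumers.size;
}

// Illustrative data only.
const sampleEdges = [
  { source: 'cli.js', target: 'utils.js#parse', kind: 'dynamic-imports' },
  { source: 'app.js', target: 'utils.js#parse', kind: 'imports' },
  { source: 'app.js', target: 'utils.js#format', kind: 'calls' },
];

// Counting only `calls` edges reports no consumers for `parse`.
const callsOnly = countConsumers(sampleEdges, 'utils.js#parse', new Set(['calls']));
// Including import edges finds both consuming files.
const withImports = countConsumers(sampleEdges, 'utils.js#parse');
console.log(callsOnly, withImports);
```

With only `calls` edges counted, `parse` looks unused even though two files import it, which is the false positive the row describes.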

### Tier 1i — Search, navigation, and monitoring improvements

14 changes: 12 additions & 2 deletions src/builder.js
@@ -1041,7 +1041,13 @@ export async function buildGraph(rootDir, opts = {}) {
const resolvedPath = getResolved(path.join(rootDir, relPath), imp.source);
const targetRow = getNodeId.get(resolvedPath, 'file', resolvedPath, 0);
if (targetRow) {
-const edgeKind = imp.reexport ? 'reexports' : imp.typeOnly ? 'imports-type' : 'imports';
+const edgeKind = imp.reexport
+  ? 'reexports'
+  : imp.typeOnly
+    ? 'imports-type'
+    : imp.dynamicImport
+      ? 'dynamic-imports'
+      : 'imports';
allEdgeRows.push([fileNodeId, targetRow.id, edgeKind, 1.0, 0]);

if (!imp.reexport && isBarrelFile(resolvedPath)) {
@@ -1060,7 +1066,11 @@ export async function buildGraph(rootDir, opts = {}) {
allEdgeRows.push([
fileNodeId,
actualRow.id,
-edgeKind === 'imports-type' ? 'imports-type' : 'imports',
+edgeKind === 'imports-type'
+  ? 'imports-type'
+  : edgeKind === 'dynamic-imports'
+    ? 'dynamic-imports'
+    : 'imports',
0.9,
0,
]);
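
The two hunks in `src/builder.js` encode a precedence among the import flags. One way to read that logic is as two small functions, sketched here. The flag names (`reexport`, `typeOnly`, `dynamicImport`) and the edge kinds come from the diff; the function names `edgeKindFor` and `barrelEdgeKindFor` are hypothetical and do not exist in the PR.

```javascript
// Precedence as in the first hunk: reexport > typeOnly > dynamicImport > plain import.
function edgeKindFor(imp) {
  if (imp.reexport) return 'reexports';
  if (imp.typeOnly) return 'imports-type';
  if (imp.dynamicImport) return 'dynamic-imports';
  return 'imports';
}

// Kind for the extra edge pushed when the resolved file is a barrel (second hunk).
// Only 'imports-type' and 'dynamic-imports' are preserved; anything else becomes 'imports'.
function barrelEdgeKindFor(edgeKind) {
  if (edgeKind === 'imports-type') return 'imports-type';
  if (edgeKind === 'dynamic-imports') return 'dynamic-imports';
  return 'imports';
}

const dynamicKind = edgeKindFor({ dynamicImport: true });
const typeWins = edgeKindFor({ typeOnly: true, dynamicImport: true });
console.log(dynamicKind, typeWins, barrelEdgeKindFor(dynamicKind));
```

Because `typeOnly` is checked before `dynamicImport`, an entry carrying both flags would be recorded as `imports-type`. Whether the extractor can ever produce such an entry is not shown in the diff.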
135 changes: 130 additions & 5 deletions src/extractors/javascript.js
@@ -1,3 +1,4 @@
import { debug } from '../logger.js';
import { findChild, nodeEndLine } from './helpers.js';

/**
@@ -173,6 +174,9 @@ function extractSymbolsQuery(tree, query) {
// Extract top-level constants via targeted walk (query patterns don't cover these)
extractConstantsWalk(tree.rootNode, definitions);

// Extract dynamic import() calls via targeted walk (query patterns don't match `import` function type)
extractDynamicImportsWalk(tree.rootNode, imports);

return { definitions, calls, imports, classes, exports: exps };
}

@@ -224,6 +228,41 @@ function extractConstantsWalk(rootNode, definitions) {
}
}

/**
* Recursive walk to find dynamic import() calls.
* Query patterns match call_expression with identifier/member_expression/subscript_expression
* functions, but import() has function type `import` which none of those patterns cover.
*/
function extractDynamicImportsWalk(node, imports) {
if (node.type === 'call_expression') {
const fn = node.childForFieldName('function');
if (fn && fn.type === 'import') {
const args = node.childForFieldName('arguments') || findChild(node, 'arguments');
if (args) {
const strArg = findChild(args, 'string');
if (strArg) {
const modPath = strArg.text.replace(/['"]/g, '');
Comment on lines +242 to +244
Contributor

Template-literal and variable import paths silently skipped

Both extractDynamicImportsWalk (query path) and the walk path in extractSymbolsWalk look for a child node of type string to extract the module path. This means the following patterns produce no edge and no warning:

import(`./plugins/${name}.js`);   // template_string node, not string
const p = './utils.js';
const mod = await import(p);      // identifier node, not string

These are admittedly hard to resolve statically, but they account for a meaningful portion of real-world dynamic imports (especially lazy-loaded plugins). Leaving them silently uncovered can lead to the same "false zero-consumers" reports the PR was designed to fix.

A simple improvement would be to log a debug-level warning when an import() call is encountered but the path cannot be statically resolved, so users are at least aware of the gap.

Contributor Author

Already addressed in commit ae0155e: both the query path (extractDynamicImportsWalk) and the walk path (walkJavaScriptNode) emit a debug() warning when a dynamic import() has a non-static path (template literal or variable).
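
For readers following the thread, the static versus non-static distinction can be sketched in isolation. Plain objects stand in for tree-sitter nodes here, and `classifyImportArg` is a hypothetical name, not a function in the PR. The node types `string`, `template_string`, and `identifier` are the ones named in the review comment. This sketch strips only the surrounding quotes, a small variation on the `replace(/['"]/g, '')` call in the diff, which removes every quote character.

```javascript
// Plain objects stand in for tree-sitter nodes; only `type` and `text` are read.
function classifyImportArg(argNode) {
  if (argNode.type === 'string') {
    // Strip only the leading and trailing quote, leaving inner characters alone.
    return { isStatic: true, path: argNode.text.replace(/^['"]|['"]$/g, '') };
  }
  // template_string, identifier, and anything else cannot be resolved here.
  return { isStatic: false, reason: `non-static argument (${argNode.type})` };
}

const literalArg = classifyImportArg({ type: 'string', text: "'./utils.js'" });
const templateArg = classifyImportArg({ type: 'template_string', text: '`./plugins/${name}.js`' });
const variableArg = classifyImportArg({ type: 'identifier', text: 'p' });
console.log(literalArg, templateArg, variableArg);
```

A caller could push an import entry when `isStatic` is true and emit the debug message otherwise, which matches the behavior described in the reply above.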

const names = extractDynamicImportNames(node);
imports.push({
source: modPath,
names,
line: node.startPosition.row + 1,
dynamicImport: true,
});
} else {
debug(
`Skipping non-static dynamic import() at line ${node.startPosition.row + 1} (template literal or variable)`,
);
}
}
return; // no need to recurse into import() children
}
}
for (let i = 0; i < node.childCount; i++) {
extractDynamicImportsWalk(node.child(i), imports);
}
}

function handleCommonJSAssignment(left, right, node, imports) {
if (!left || !right) return;
const leftText = left.text;
@@ -455,11 +494,36 @@ function extractSymbolsWalk(tree) {
case 'call_expression': {
const fn = node.childForFieldName('function');
if (fn) {
-const callInfo = extractCallInfo(fn, node);
-if (callInfo) calls.push(callInfo);
-if (fn.type === 'member_expression') {
-  const cbDef = extractCallbackDefinition(node, fn);
-  if (cbDef) definitions.push(cbDef);
+// Dynamic import(): import('./foo.js') → extract as an import entry
+if (fn.type === 'import') {
+  const args = node.childForFieldName('arguments') || findChild(node, 'arguments');
+  if (args) {
+    const strArg = findChild(args, 'string');
+    if (strArg) {
+      const modPath = strArg.text.replace(/['"]/g, '');
+      // Extract destructured names from parent context:
+      // const { a, b } = await import('./foo.js')
+      // (standalone import('./foo.js').then(...) calls produce an edge with empty names)
+      const names = extractDynamicImportNames(node);
+      imports.push({
+        source: modPath,
+        names,
+        line: node.startPosition.row + 1,
+        dynamicImport: true,
+      });
+    } else {
+      debug(
+        `Skipping non-static dynamic import() at line ${node.startPosition.row + 1} (template literal or variable)`,
+      );
+    }
+  }
+} else {
+  const callInfo = extractCallInfo(fn, node);
+  if (callInfo) calls.push(callInfo);
+  if (fn.type === 'member_expression') {
+    const cbDef = extractCallbackDefinition(node, fn);
+    if (cbDef) definitions.push(cbDef);
+  }
}
}
break;
@@ -941,3 +1005,64 @@ function extractImportNames(node) {
scan(node);
return names;
}

/**
* Extract destructured names from a dynamic import() call expression.
*
* Handles:
* const { a, b } = await import('./foo.js') → ['a', 'b']
* const mod = await import('./foo.js') → ['mod']
* import('./foo.js') → [] (no names extractable)
*
* Walks up the AST from the call_expression to find the enclosing
* variable_declarator and reads the name/object_pattern.
*/
function extractDynamicImportNames(callNode) {
// Walk up: call_expression → await_expression → variable_declarator
let current = callNode.parent;
// Skip await_expression wrapper if present
if (current && current.type === 'await_expression') current = current.parent;
// We should now be at a variable_declarator (or not, if standalone import())
if (!current || current.type !== 'variable_declarator') return [];

const nameNode = current.childForFieldName('name');
if (!nameNode) return [];

// const { a, b } = await import(...) → object_pattern
if (nameNode.type === 'object_pattern') {
const names = [];
for (let i = 0; i < nameNode.childCount; i++) {
const child = nameNode.child(i);
if (child.type === 'shorthand_property_identifier_pattern') {
names.push(child.text);
} else if (child.type === 'pair_pattern') {
// { a: localName } → record the key (the exported name), which is what
// import resolution needs; the local alias is not tracked here
const key = child.childForFieldName('key');
if (key) names.push(key.text);
}
}
return names;
}

// const mod = await import(...) → identifier (namespace-like import)
if (nameNode.type === 'identifier') {
return [nameNode.text];
}

// const [a, b] = await import(...) → array_pattern (rare but possible)
if (nameNode.type === 'array_pattern') {
const names = [];
for (let i = 0; i < nameNode.childCount; i++) {
const child = nameNode.child(i);
if (child.type === 'identifier') names.push(child.text);
else if (child.type === 'rest_pattern') {
// child(0) of a rest_pattern is the anonymous '...' token, so take the first named child
const inner = child.namedChild(0);
if (inner && inner.type === 'identifier') names.push(inner.text);
}
}
return names;
}

return [];
}
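
The function above returns `[]` for `import('./foo.js').then(({ a, b }) => ...)`, which is the gap BACKLOG item 82 describes. A rough sketch of the approach outlined there follows. It runs against a hand-built mock tree, not real tree-sitter output: `mk` and `extractThenCallbackNames` are hypothetical names, and real trees may nest parameters differently (TypeScript grammars in particular), so treat this as an outline under those assumptions.

```javascript
// Build a tiny stand-in for a tree-sitter node (hypothetical helper, not a real API).
function mk(type, { text = '', fields = {}, children = [] } = {}) {
  const all = [...Object.values(fields), ...children];
  const n = {
    type,
    text,
    parent: null,
    children: all,
    childForFieldName: (name) => fields[name] || null,
  };
  for (const c of all) c.parent = n;
  return n;
}

// Given the import() call node, read names from a destructuring .then() callback.
function extractThenCallbackNames(importCall) {
  const member = importCall.parent;
  if (!member || member.type !== 'member_expression') return [];
  const prop = member.childForFieldName('property');
  if (!prop || prop.text !== 'then') return [];
  const thenCall = member.parent;
  if (!thenCall || thenCall.type !== 'call_expression') return [];
  const args = thenCall.childForFieldName('arguments');
  if (!args) return [];
  const cb = args.children.find(
    (c) => c.type === 'arrow_function' || c.type === 'function_expression',
  );
  if (!cb) return [];
  const params = cb.childForFieldName('parameters');
  const pattern = params && params.children.find((c) => c.type === 'object_pattern');
  if (!pattern) return [];
  const names = [];
  for (const child of pattern.children) {
    if (child.type === 'shorthand_property_identifier_pattern') {
      names.push(child.text);
    } else if (child.type === 'pair_pattern') {
      // Same convention as extractDynamicImportNames: keep the key, not the alias.
      const key = child.childForFieldName('key');
      if (key) names.push(key.text);
    }
  }
  return names;
}

// Mock tree for: import('./foo.js').then(({ a, b: localB }) => ...)
const importCall = mk('call_expression', {
  fields: { function: mk('import', { text: 'import' }) },
});
const pattern = mk('object_pattern', {
  children: [
    mk('shorthand_property_identifier_pattern', { text: 'a' }),
    mk('pair_pattern', {
      fields: {
        key: mk('property_identifier', { text: 'b' }),
        value: mk('identifier', { text: 'localB' }),
      },
    }),
  ],
});
const callback = mk('arrow_function', {
  fields: { parameters: mk('formal_parameters', { children: [pattern] }) },
});
const member = mk('member_expression', {
  fields: { object: importCall, property: mk('property_identifier', { text: 'then' }) },
});
mk('call_expression', {
  fields: { function: member, arguments: mk('arguments', { children: [callback] }) },
});

const thenNames = extractThenCallbackNames(importCall);
console.log(thenNames);
```

A standalone `import('./side-effect.js')` has no `member_expression` parent, so the sketch falls through to an empty list, the same result the current code gives.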
1 change: 1 addition & 0 deletions src/kinds.js
@@ -33,6 +33,7 @@ export const ALL_SYMBOL_KINDS = CORE_SYMBOL_KINDS;
export const CORE_EDGE_KINDS = [
'imports',
'imports-type',
'dynamic-imports',
'reexports',
'calls',
'extends',
11 changes: 11 additions & 0 deletions tests/engines/query-walk-parity.test.js
@@ -46,6 +46,7 @@ function normalize(symbols) {
...(i.reexport ? { reexport: true } : {}),
...(i.wildcardReexport ? { wildcardReexport: true } : {}),
...(i.typeOnly ? { typeOnly: true } : {}),
...(i.dynamicImport ? { dynamicImport: true } : {}),
}))
.sort((a, b) => a.line - b.line),
classes: (symbols.classes || [])
@@ -178,6 +179,16 @@ export class Server {
fn.call(null, arg);
obj.apply(undefined, args);
method.bind(ctx);
`,
},
{
name: 'dynamic import() expressions',
file: 'test.js',
code: `
const { readFile } = await import('fs/promises');
const { readFile: rf } = await import('node:fs/promises');
const mod = await import('./utils.js');
import('./side-effect.js');
`,
},
// TypeScript-specific
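
The parity test only catches a dropped `dynamicImport` flag because `normalize` now carries it through. The comparison can be sketched roughly as below. `normalizeImports` is a hypothetical stand-in for the import part of the test's `normalize()`; the `source`, `names`, and `line` fields are assumed from the entries the extractor pushes, and the sample entries are hand-written for illustration, not real extractor output.

```javascript
// Hypothetical stand-in for the import portion of the test's normalize().
function normalizeImports(imports) {
  return imports
    .map((i) => ({
      source: i.source,
      names: [...(i.names || [])].sort(),
      line: i.line,
      ...(i.dynamicImport ? { dynamicImport: true } : {}),
    }))
    .sort((a, b) => a.line - b.line);
}

// Two engines are "in parity" when their normalized import lists serialize identically.
function importsMatch(a, b) {
  return JSON.stringify(normalizeImports(a)) === JSON.stringify(normalizeImports(b));
}

// Illustrative entries in different orders.
const fromQuery = [
  { source: './utils.js', names: ['mod'], line: 4, dynamicImport: true },
  { source: 'fs/promises', names: ['readFile'], line: 2, dynamicImport: true },
];
const fromWalk = [
  { source: 'fs/promises', names: ['readFile'], line: 2, dynamicImport: true },
  { source: './utils.js', names: ['mod'], line: 4, dynamicImport: true },
];
// Same entries, but one engine forgot to set the flag.
const missingFlag = fromWalk.map(({ dynamicImport, ...rest }) => rest);

const sameResult = importsMatch(fromQuery, fromWalk);
const flagMismatch = importsMatch(fromQuery, missingFlag);
console.log(sameResult, flagMismatch);
```

Without the flag in the normalized shape, the second comparison would also pass, and a path that forgot to mark dynamic imports would go unnoticed.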