perf: custom loader (up to 70x+ improvements) by cmilesdev · Pull Request #3 · goforj/wire

cmilesdev · 2026-03-16T02:11:03Z

This PR overhauls Wire's loader, which has significant performance issues in several situations. Most of them are tied to go/packages semantics and go/types type-checking work, and both costs grow with the size of the codebase. For example, in a project where stock Google Wire takes 1.3s to run, 400-600ms of that was go list deps and the rest was recursively type checking all external and internal packages so that Wire can do its job. This PR replaces the loader, caches the heavy work, and invalidates the cache when necessary.

Safety

Safety and accuracy matter above all else. The existing Wire tests pass, along with a large set of newly added scenario-based tests.

Test it Yourself

You can test the new loader yourself by installing this branch:

go install github.com/goforj/wire/cmd/wire@cf52879

Revert to goforj/wire:

go install github.com/goforj/wire/cmd/wire@latest

Or to stock google/wire:

go install github.com/google/wire/cmd/wire@latest

Custom Loader

This custom loader is far more consistent than the previous cache attempts on goforj/wire. Those attempts tackled the problem further down the pipeline, when it needed to be addressed upstream in the loader. They helped, but you would lose the speedup under certain edit types. This implementation keeps warm runs consistently fast across most edit types. See the table below for measurements.

The table below shows speedups of up to 70x+ on warm runs. Cold runs are slower than stock (0.12x-0.20x), and on local-high the shape change, import change, and known import toggle cases land slightly below stock (0.87x-0.89x). These improvements scale with codebase size and complexity, so you can maintain a great development experience even in massive repositories.

Benchmarks

+---------------+-------+--------+----------+----------------------+----------+----------+---------+
|       profile | local | stdlib | external | change type          | stock    | current  | speedup |
+---------------+-------+--------+----------+----------------------+----------+----------+---------+
|         local | 41    | 191    | 1        | cold run             | 350.6ms  | 2976.0ms | 0.12x   |
|         local | 41    | 191    | 1        | unchanged rerun      | 339.6ms  | 9.4ms    | 36.18x  |
|         local | 41    | 191    | 1        | body-only local edit | 328.5ms  | 27.2ms   | 12.09x  |
|         local | 41    | 191    | 1        | shape change         | 325.9ms  | 146.7ms  | 2.22x   |
|         local | 41    | 191    | 1        | import change        | 327.6ms  | 147.6ms  | 2.22x   |
|         local | 41    | 191    | 1        | known import toggle  | 327.9ms  | 145.2ms  | 2.26x   |
|    local-high | 1016  | 191    | 1        | cold run             | 759.4ms  | 5421.0ms | 0.14x   |
|    local-high | 1016  | 191    | 1        | unchanged rerun      | 605.5ms  | 78.6ms   | 7.71x   |
|    local-high | 1016  | 191    | 1        | body-only local edit | 604.8ms  | 150.4ms  | 4.02x   |
|    local-high | 1016  | 191    | 1        | shape change         | 601.8ms  | 674.6ms  | 0.89x   |
|    local-high | 1016  | 191    | 1        | import change        | 602.4ms  | 688.7ms  | 0.87x   |
|    local-high | 1016  | 191    | 1        | known import toggle  | 601.7ms  | 675.6ms  | 0.89x   |
|  external-low | 42    | 243    | 342      | cold run             | 1490.8ms | 7499.4ms | 0.20x   |
|  external-low | 42    | 243    | 342      | unchanged rerun      | 1198.1ms | 16.2ms   | 73.98x  |
|  external-low | 42    | 243    | 342      | body-only local edit | 1113.6ms | 81.4ms   | 13.68x  |
|  external-low | 42    | 243    | 342      | shape change         | 1208.2ms | 405.4ms  | 2.98x   |
|  external-low | 42    | 243    | 342      | import change        | 1186.3ms | 421.1ms  | 2.82x   |
|  external-low | 42    | 243    | 342      | known import toggle  | 1056.9ms | 431.2ms  | 2.45x   |
| external-high | 117   | 243    | 342      | cold run             | 1448.5ms | 7643.8ms | 0.19x   |
| external-high | 117   | 243    | 342      | unchanged rerun      | 1132.0ms | 22.5ms   | 50.37x  |
| external-high | 117   | 243    | 342      | body-only local edit | 1167.2ms | 91.6ms   | 12.75x  |
| external-high | 117   | 243    | 342      | shape change         | 1224.0ms | 483.1ms  | 2.53x   |
| external-high | 117   | 243    | 342      | import change        | 1286.2ms | 467.1ms  | 2.75x   |
| external-high | 117   | 243    | 342      | known import toggle  | 1268.8ms | 468.5ms  | 2.71x   |
+---------------+-------+--------+----------+----------------------+----------+----------+---------+

stock: google/wire, not goforj/wire.
local, stdlib, external: number of local, stdlib, and external packages in the graph.
speedup: stock time divided by current time; values below 1x mean the current run was slower than stock.
cold run: first wire gen.
unchanged rerun: run wire gen again without changing any files.
body-only local edit: change only function body/content in a local Go file, without changing imports, types, or constructor signatures.
shape change: change local type/provider shape, like constructor params, fields, or return shape, while staying within the same general dependency graph.
import change: add or remove an import in a local package, which can change the discovered package graph and cached shape.
known import toggle: switch back to a previously seen import/shape state in the same repo, so the loader can potentially reuse an already-known cached graph.
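To make the difference between a body-only edit and a shape change concrete, here is a minimal sketch that hashes a file with its function bodies stripped. It is not taken from this PR: `shapeHash` and the sample sources are hypothetical, and the real loader may classify edits differently. Function literals inside variable initializers are not stripped in this simplified version.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
)

// shapeHash hashes a Go source file after dropping the bodies of top-level
// functions and methods. Hypothetical helper, not code from this PR.
func shapeHash(src string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0) // mode 0: comments are skipped
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	buf.WriteString(file.Name.Name + "\n")
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			fn.Body = nil // keep the signature, drop the implementation
		}
		// Print each declaration on its own so the output does not depend
		// on how many lines earlier function bodies occupied.
		if err := printer.Fprint(&buf, fset, decl); err != nil {
			return "", err
		}
		buf.WriteByte('\n')
	}
	sum := sha256.Sum256(buf.Bytes())
	return hex.EncodeToString(sum[:]), nil
}

const baseSrc = `package app

type Server struct{ Port int }

func NewServer(port int) *Server {
	return &Server{Port: port}
}
`

// Same signature as baseSrc, different body.
const bodyEditSrc = `package app

type Server struct{ Port int }

func NewServer(port int) *Server {
	s := &Server{}
	s.Port = port
	return s
}
`

// Same body as baseSrc, different constructor parameters.
const shapeEditSrc = `package app

type Server struct{ Port int }

func NewServer(port int, host string) *Server {
	return &Server{Port: port}
}
`

func main() {
	base, _ := shapeHash(baseSrc)
	body, _ := shapeHash(bodyEditSrc)
	shape, _ := shapeHash(shapeEditSrc)
	fmt.Println("body-only edit keeps shape:", base == body)
	fmt.Println("signature edit changes shape:", base != shape)
}
```

A cache keyed on a hash like this would survive body-only edits and be invalidated by constructor signature changes, which matches the distinction the table draws between those two rows.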

Implementation Details

This PR replaces the primary go/packages path with a custom loader that is much more intentional about what work it does and when it does it.

Discover the package graph for the requested roots
Load typed package state from that discovered graph
Cache both discovery state and typed artifacts aggressively
Fall back cleanly when the custom path cannot safely satisfy the request
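The "fall back cleanly" step can be sketched as a small wrapper around two loaders. This is an illustration under assumptions: `Package`, `Loader`, `errUnsupported`, `customLoad`, and `stockLoad` are hypothetical names, and the rule that the toy custom backend refuses `...` patterns is arbitrary and chosen only for the demo.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Package is a stand-in for typed package state (hypothetical).
type Package struct {
	Path    string
	Backend string // which loader produced this package
}

// Loader loads packages for the requested root patterns.
type Loader func(roots []string) ([]Package, error)

// errUnsupported signals that the custom path cannot safely satisfy a request.
var errUnsupported = errors.New("custom loader cannot safely satisfy request")

// withFallback tries primary first and uses fallback only when primary
// reports errUnsupported. Any other error is returned unchanged.
func withFallback(primary, fallback Loader) Loader {
	return func(roots []string) ([]Package, error) {
		pkgs, err := primary(roots)
		if err == nil {
			return pkgs, nil
		}
		if errors.Is(err, errUnsupported) {
			return fallback(roots)
		}
		return nil, err
	}
}

// customLoad is a toy custom backend. Refusing "..." patterns is an
// arbitrary rule for this demo, not the PR's actual policy.
func customLoad(roots []string) ([]Package, error) {
	pkgs := make([]Package, 0, len(roots))
	for _, r := range roots {
		if strings.Contains(r, "...") {
			return nil, fmt.Errorf("root %q: %w", r, errUnsupported)
		}
		pkgs = append(pkgs, Package{Path: r, Backend: "custom"})
	}
	return pkgs, nil
}

// stockLoad is a toy stand-in for the original go/packages path.
func stockLoad(roots []string) ([]Package, error) {
	pkgs := make([]Package, 0, len(roots))
	for _, r := range roots {
		pkgs = append(pkgs, Package{Path: r, Backend: "stock"})
	}
	return pkgs, nil
}

func main() {
	load := withFallback(customLoad, stockLoad)
	for _, roots := range [][]string{{"./cmd/app"}, {"./..."}} {
		pkgs, err := load(roots)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(roots, "->", pkgs[0].Backend)
	}
}
```

Using a sentinel error keeps the decision explicit: only requests the custom path declines are retried, while genuine failures still surface to the caller.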

There are two different kinds of work:

Discovery work tells us what packages, files, and imports are involved
Typed work is the expensive parse, typecheck, and package stitching work

By separating those layers we can skip far more repeated work than before.
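As a rough illustration of why discovery can be much cheaper than typed work, the standard go/parser package can stop after the import block instead of parsing and type checking whole files. The helper name `discoverImports` and the sample source are assumptions for this sketch; it does not show how internal/loader actually performs discovery.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// discoverImports parses only the package clause and import declarations
// of a file and returns the imported paths. Hypothetical helper.
func discoverImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	// parser.ImportsOnly stops parsing after the import declarations, so
	// function bodies and type declarations are never visited.
	file, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range file.Imports {
		path, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, path)
	}
	return paths, nil
}

const sampleSrc = `package app

import (
	"fmt"
	"net/http"
)

func Handler(w http.ResponseWriter, r *http.Request) { fmt.Fprintln(w, "ok") }
`

func main() {
	imports, err := discoverImports("handler.go", sampleSrc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(imports)
}
```

The typed layer would then run the full parse and go/types check only for packages whose discovery result or contents actually changed.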

Loader

Most of the new implementation lives in internal/loader.

custom.go is the main custom loading backend
discovery.go and discovery_cache.go build and reuse the discovered graph
artifact_cache.go stores typed package artifacts for reuse between runs
fallback.go preserves a safe fallback path when needed
timing.go exposes timings so we can see where time is actually going

Caching

Caching is now a first class part of generation instead of a thin optimization layered on afterward.

Discovery cache stores the discovered package graph
Loader artifacts store typed package summaries for local and external packages
Cache continues to support parser level reuse
Output cache remains available for generated output reuse

This is what enables the very fast unchanged rerun and body-only local edit paths in the benchmark table.

External packages are heavily reused after the first run
Local packages only pay for the work required by the current edit shape
Previously seen states can often reuse old cache state instead of rediscovering everything
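The last point, reusing previously seen states, can be sketched as a cache that keeps every graph it has discovered, keyed by a fingerprint of the import set. Everything named here (`fingerprint`, `Graph`, `GraphCache`) is hypothetical, and a real key would likely cover more than import paths.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// fingerprint hashes a set of import paths independent of their order.
func fingerprint(imports []string) string {
	sorted := append([]string(nil), imports...) // copy so the caller's slice is untouched
	sort.Strings(sorted)
	sum := sha256.Sum256([]byte(strings.Join(sorted, "\n")))
	return hex.EncodeToString(sum[:])
}

// Graph is a stand-in for a discovered package graph (hypothetical).
type Graph struct {
	Packages []string
}

// GraphCache keeps every graph it has seen, keyed by fingerprint, so that
// switching back to an earlier import state is a cache hit.
type GraphCache struct {
	entries map[string]Graph
	Misses  int
}

// NewGraphCache returns an empty cache.
func NewGraphCache() *GraphCache {
	return &GraphCache{entries: make(map[string]Graph)}
}

// Get returns the cached graph for imports, calling discover only on a miss.
func (c *GraphCache) Get(imports []string, discover func([]string) Graph) Graph {
	key := fingerprint(imports)
	if g, ok := c.entries[key]; ok {
		return g
	}
	c.Misses++
	g := discover(imports)
	c.entries[key] = g
	return g
}

func main() {
	cache := NewGraphCache()
	discover := func(imports []string) Graph {
		return Graph{Packages: imports} // stand-in for expensive discovery
	}

	withHTTP := []string{"fmt", "net/http"}
	withoutHTTP := []string{"fmt"}

	cache.Get(withHTTP, discover)    // first state: miss
	cache.Get(withoutHTTP, discover) // import removed: miss
	cache.Get(withHTTP, discover)    // import restored: hit
	fmt.Println("discovery runs:", cache.Misses)
}
```

Because old entries are kept rather than overwritten, toggling an import off and on again pays the discovery cost only once per distinct state.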

Parser / Wire Integration

The parser and provider set handling needed to be updated to operate cleanly with cached semantic state.

internal/wire/parse.go now reconstructs provider information from semantic artifacts when safe
internal/wire/output_cache.go was updated around the new loader flow
internal/wire/wire.go now drives generation through the new loading backend
internal/wire/load_debug.go and internal/wire/loader_timing_bridge.go make the new path observable via -timings flag

One important detail here is that the system still prefers falling back cleanly over trusting cached reconstruction when it is not safe to do so.

CLI

The command surface was updated to route through the same generation model instead of splitting behavior between commands.

wire gen, wire check, wire diff, and wire show now run through the new backend
wire watch stays on the same core generation path rather than inventing a separate implementation
wire cache was expanded so cache inspection and clearing are easier
cmd/wire/main.go now wires cache and loader behavior more explicitly

Colorization: errors are now colored red and successes green, which makes output easier to read in between watcher tooling spam. Multiline errors are now presented in a more user-friendly way.

Compatibility / Safety

This is meant to be a conservative loader change, not a semantic rewrite of Wire. Everything works the same, apart from some of the tweaks goforj/wire has already made, such as introducing wire cache clear and wire serve.

Existing generation entrypoints stay intact
Fallback behavior exists when the custom path cannot safely proceed
Test coverage was added heavily around loader behavior, parser behavior, and command integration

That shows up in:

internal/loader/loader_test.go
internal/wire/wire_test.go
internal/wire/parse_coverage_test.go
cmd/wire/main_test.go

Benchmarks

The benchmark harness was also expanded so that it measures concrete developer workflows instead of only raw repo scale (see the table above).

scripts/import-benchmarks.sh now prints both scale and scenario tables
internal/wire/import_bench_test.go now measures edit types like unchanged reruns, body edits, shape edits, and import toggles
scenario runs distinguish cold runs from warmed runs
benchmark output now reports real graph composition:
- local packages
- stdlib packages
- external packages

There are now a few benchmark profiles:

local for a modest local graph
local-high for a very large local graph
external for a graph with a much heavier external dependency surface (split into external-low and external-high in the table above)

zzzz465 · 2026-03-27T10:08:56Z

I tried this version and it works fine in my local environment: wiring time dropped from 20s to roughly 1s.
However, a couple of issues block using it in a CI environment.

mtime changes when the cache is stored and restored in CI, so the cache is always considered stale.
discovery-cache is hardcoded to UserCacheDir(). This can be worked around by caching the whole cache dir, but it would be good to have granular control over the cache directory.
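One possible direction for both points, sketched under assumptions: derive staleness from file contents rather than mtime, and let an override replace the default cache location. None of this exists in the PR as far as the thread shows. `resolveCacheDir`, `contentKey`, the `wire` subdirectory name, and the `WIRE_CACHE_DIR` variable are all hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// resolveCacheDir returns override when it is set, otherwise a "wire"
// directory under os.UserCacheDir(). The override source (flag or
// environment variable) and the subdirectory name are hypothetical.
func resolveCacheDir(override string) (string, error) {
	if override != "" {
		return override, nil
	}
	base, err := os.UserCacheDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(base, "wire"), nil
}

// contentKey derives a cache key from file contents only, so restoring a
// file with a different mtime does not change the key.
func contentKey(path string) (string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]), nil
}

// keySurvivesTouch writes a temp file, bumps its mtime, and reports
// whether the content key stayed the same.
func keySurvivesTouch() (bool, error) {
	dir, err := os.MkdirTemp("", "wire-cache-demo")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "provider.go")
	if err := os.WriteFile(path, []byte("package app\n"), 0o644); err != nil {
		return false, err
	}
	before, err := contentKey(path)
	if err != nil {
		return false, err
	}
	later := time.Now().Add(time.Hour)
	if err := os.Chtimes(path, later, later); err != nil {
		return false, err
	}
	after, err := contentKey(path)
	if err != nil {
		return false, err
	}
	return before == after, nil
}

func main() {
	// WIRE_CACHE_DIR is a made-up variable name used only for this sketch.
	dir, err := resolveCacheDir(os.Getenv("WIRE_CACHE_DIR"))
	fmt.Println(dir, err)

	same, err := keySurvivesTouch()
	fmt.Println("key unchanged after mtime bump:", same, err)
}
```

Hashing contents costs a read per file, so a real implementation might still use mtime and size as a fast first check and hash only when they differ.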

cmilesdev added 30 commits March 13, 2026 21:20

feat: incremental loading

e6a9903

feat(incremental): reuse unchanged local packages in fast path and harden fallback behavior

e3f07cb

perf(incremental): trim cold bootstrap work and keep warm shape changes fast

2eb5400

perf(incremental): load deps conditionally

ad2d561

chore(incremental): clear session cache

578b24f

fix(cli): improve wire error coloring and solve error labeling

83806b9

feat(incremental): harden loader and scenario tooling

c6e4f4e

feat: custom loader initial

d3486f5

feat: external loader caching

e7cfc63

chore: remove local caching strat

9379659

feat: local caching from wire perspective

3bc75c4

feat: go dep cache

a8c2a02

feat(loader): cache unchanged root output

26f591b

chore: bench tweaks

a357a38

chore: re-implement cache

d6b36b1

fix: provider discovery

84087dd

chore: benchmark update

e968dcc

fix: ci

ba95466

fix: ci

4b312b5

fix: windows tmpdir issue

433ffc7

fix: windows bench executable path

d941179

fix(loader): strengthen artifact keys for replaced external modules

88c8634

test(loader): harden cache invalidation and discovery parity coverage

09073ee

fix(loader): treat replaced workspace deps as local and harden runtests

fea5e0a

fix(loader): make cache-hardening tests and runtests portable

568d2e0

fix(loader): use valid file GOPROXY URLs in proxy-based tests

114d174

fix(loader): format file GOPROXY URLs correctly on windows

53890d7

fix(loader): normalize test path comparisons across platforms

efa0214

refactor: remove unused loader and wire helpers

541acdf

refactor: dedupe command and custom loader helpers

40ab144

cmilesdev added 28 commits March 16, 2026 19:19

refactor: share named struct type resolution

f9d735f

refactor: share semantic type name lookup

4477c75

refactor: share semantic package member lookup

2c0446d

refactor: share semantic error wrapping

848371f

refactor: share provider set finalization

3944cbb

refactor: add isolated output cache gate

12858a2

refactor: make provider set fallback policy explicit

8dbd590

refactor: share custom loader root loading path

9ef6b11

refactor: share custom loader metadata root graph

4386089

refactor: isolate semantic provider set support rule

b099db8

refactor: fold back weak cleanup abstractions

cf4bb80

refactor: narrow loader semantic artifact coupling

c323a0f

refactor: unify semantic provider set support rules

7bf31e8

fix: restore local loader artifact safety gate

a7fc49c

refactor: disable semantic reconstruction by default

df1f6f6

refactor: remove semantic reconstruction path

3927014

refactor: remove semantic cache layer

4af9edc

refactor: add import benchmark profile filter

bf14bb5

refactor: trim redundant discovery cache metadata

bf4a02d

refactor: remove redundant discovery cache cloning

788798d

refactor: split external benchmark profiles

0921fc2

refactor: share custom typed load pipeline

e52201b

refactor: centralize custom metadata loading

2dc4ac4

refactor: share loader fallback reason policy

6f09664

refactor: add targeted local profile benchmark

d003c9f

refactor: add one-shot import profile harness

a112565

perf: reuse root discovery for generate loads

ee3ffc9

style: format loader discovery changes

cf52879

cmilesdev mentioned this pull request Mar 18, 2026

support passing receiver method as provider #4

Open

cmilesdev wants to merge 79 commits into main from cmilesdev/custom-loader
