feat(cpp): performance benchmarks and binary size tracking #519

Open
itomek wants to merge 5 commits into main from
358-c-framework-performance-benchmarks-and-binary-size-tracking


itomek commented Mar 16, 2026

Closes #358

Summary

  • Adds benchmark_cpp.yml workflow measuring binary size, startup time, loop latency, and memory footprint across Linux and Windows
  • Baselines are cached on pushes to main; every PR run compares current results against the last baseline with per-metric thresholds (10% for binary size, 15% for everything else)
  • The regression report is printed inline to the CI log; results are uploaded as downloadable artifacts
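
For reviewers, the comparison step can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the function name `compareToBaseline`, the `Finding` struct, and the metric key `binary_size_bytes` are hypothetical, and the real `compareAndReport` may be structured differently. It assumes higher values are worse for every metric, applies the 10% / 15% thresholds, and flags baseline metrics that are absent from the current run as MISSING.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration; not taken from the PR's source.
enum class Status { Ok, Regression, Missing };

struct Finding {
    std::string metric;
    Status status;
    double changePct;  // percent change relative to baseline; 0 for Missing
};

// Allowed increase in percent: 10 for binary size, 15 for every other metric.
// "binary_size_bytes" is an assumed metric key.
inline double thresholdPct(const std::string& metric) {
    return metric == "binary_size_bytes" ? 10.0 : 15.0;
}

// Walks the baseline so that a metric missing from the current run is
// reported instead of silently passing.
inline std::vector<Finding> compareToBaseline(
    const std::map<std::string, double>& baseline,
    const std::map<std::string, double>& current) {
    std::vector<Finding> findings;
    for (const auto& entry : baseline) {
        const std::string& name = entry.first;
        const double base = entry.second;
        const auto it = current.find(name);
        if (it == current.end()) {
            findings.push_back({name, Status::Missing, 0.0});
            continue;
        }
        // Guard against a zero baseline to avoid dividing by zero.
        const double changePct =
            base > 0.0 ? (it->second - base) / base * 100.0 : 0.0;
        const Status status = changePct > thresholdPct(name)
                                  ? Status::Regression
                                  : Status::Ok;
        findings.push_back({name, status, changePct});
    }
    return findings;
}
```

A caller would exit non-zero when any finding is `Regression` or `Missing`, which matches the "exit 1 on detected regression" behavior listed in the test plan.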

Fixes applied (from CI code review)

  • Cache path mismatch: restore path now matches save path; the restored file is renamed to benchmark-baseline.json before the fresh run overwrites benchmark-results.json
  • CI trigger: benchmark_cpp.yml and build_cpp.yml added to both push and pull_request paths in build_cpp.yml
  • Windows header order: <windows.h> is now included before <psapi.h> in bench_utils.h
  • Benchmark job timeout: timeout-minutes: 15 added
  • Double-init guard: initCalled_ bool prevents duplicate tool registration in BenchAgent
  • Missing metric detection: compareAndReport now flags baseline metrics absent from the current run as MISSING (catches crashed benchmarks that previously produced a false-positive clean pass)
  • Timer guard: elapsedUs() returns 0 if stop() was never called
  • CMakeLists option order: GAIA_BUILD_BENCHMARKS moved to the top-level options block alongside the other GAIA_BUILD_* options
  • .gitignore: cpp/benchmark-*.json excluded (ephemeral CI artifacts)
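
To illustrate the timer guard, here is a minimal sketch of a timer whose `elapsedUs()` returns 0 until `stop()` has been called. Only `elapsedUs()` and `stop()` are named in this PR; the `start()` method, the `stopped_` flag, and the use of `std::chrono::steady_clock` are assumptions made for the example, and the actual `Timer` in the benchmark utilities may differ.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical Timer for illustration; not the class from this PR.
class Timer {
public:
    void start() {
        begin_ = Clock::now();
        stopped_ = false;  // a new measurement is in progress
    }

    void stop() {
        end_ = Clock::now();
        stopped_ = true;
    }

    // Returns 0 when stop() was never called, instead of computing a
    // duration from an end time that was never set.
    std::int64_t elapsedUs() const {
        if (!stopped_) {
            return 0;
        }
        return std::chrono::duration_cast<std::chrono::microseconds>(
                   end_ - begin_)
            .count();
    }

private:
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    Clock::time_point begin_{};
    Clock::time_point end_{};
    bool stopped_ = false;
};
```

Without the flag, a benchmark that threw or returned early between `start()` and `stop()` could report a meaningless duration derived from a default-constructed time point.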

Test plan

  • Built locally (clean, no new warnings)
  • Ran benchmarks end-to-end — JSON output verified
  • Ran compare mode — regression table printed, exit 1 on detected regression
  • 216/217 unit tests pass (1 pre-existing failure unrelated to this PR — LemonadeClientTest.DefaultConstruction fails due to GAIA_CPP_BASE_URL env var set in local environment)
  • Lint passed (uv run python util/lint.py --all)

- Add benchmark_cpp.yml workflow: binary size, startup time, loop latency, memory footprint
- Cache baseline on main pushes; compare on every PR with per-metric thresholds (10% binary, 15% other)
- Fix cache path mismatch: restore and rename before fresh run overwrites results
- Fix build_cpp.yml push/PR paths to include both workflow files
- Add timeout-minutes: 15 to benchmark job
- Add initCalled_ guard in BenchAgent to prevent duplicate tool registration
- Add MISSING metric detection in compareAndReport (catches crashed benchmarks)
- Add Timer::elapsedUs() guard against unmatched stop()
- Move GAIA_BUILD_BENCHMARKS option to top-level options block in CMakeLists.txt
- Ignore cpp/benchmark-*.json in .gitignore (ephemeral CI artifacts)
itomek linked an issue Mar 16, 2026 that may be closed by this pull request
github-actions bot added the devops (DevOps/infrastructure changes) and cpp labels Mar 16, 2026
itomek self-assigned this Mar 16, 2026
Adds a reusable /finalize-implementation skill that runs tests, lint,
CI review simulation, and sub-agent code/architecture reviews in a loop
(max 5 iterations), then commits and creates a draft PR.

Co-Authored-By: Tomasz Waszczyk <tomasz@waszczyk.com>
itomek marked this pull request as ready for review March 18, 2026 17:14
itomek requested a review from kovtcharov-amd as a code owner March 18, 2026 17:14
Labels

cpp, devops (DevOps/infrastructure changes)

Development

Successfully merging this pull request may close these issues.

C++ Framework: Performance benchmarks and binary size tracking

2 participants