feat(cpp): performance benchmarks and binary size tracking #519
Open
- Add benchmark_cpp.yml workflow: binary size, startup time, loop latency, memory footprint
- Cache baseline on main pushes; compare on every PR with per-metric thresholds (10% binary, 15% other)
- Fix cache path mismatch: restore and rename before fresh run overwrites results
- Fix build_cpp.yml push/PR paths to include both workflow files
- Add timeout-minutes: 15 to benchmark job
- Add initCalled_ guard in BenchAgent to prevent duplicate tool registration
- Add MISSING metric detection in compareAndReport (catches crashed benchmarks)
- Add Timer::elapsedUs() guard against unmatched stop()
- Move GAIA_BUILD_BENCHMARKS option to top-level options block in CMakeLists.txt
- Ignore cpp/benchmark-*.json in .gitignore (ephemeral CI artifacts)
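The threshold comparison and MISSING detection described above could look roughly like the sketch below. This is not the PR's actual `compareAndReport`: the signature, the `MetricReport` struct, the `hasFailure` helper, the status strings other than `MISSING`, and the rule that metric names starting with `binary_size` get the 10% limit are all assumptions made for illustration.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical report row; the PR's real report format is not shown here.
struct MetricReport {
    std::string name;
    std::string status;  // "OK", "REGRESSION", or "MISSING"
    double changePct;    // relative change vs. baseline, in percent
};

// Assumed naming convention: metrics prefixed "binary_size" use the 10% limit,
// every other metric uses the 15% limit.
double thresholdPct(const std::string& metric) {
    return metric.rfind("binary_size", 0) == 0 ? 10.0 : 15.0;
}

// Walk the baseline so that a metric absent from the current run is reported
// as MISSING instead of being silently skipped (e.g. after a crashed benchmark).
std::vector<MetricReport> compareAndReport(
    const std::map<std::string, double>& baseline,
    const std::map<std::string, double>& current) {
    std::vector<MetricReport> out;
    for (const auto& kv : baseline) {
        const std::string& name = kv.first;
        const double base = kv.second;
        const auto it = current.find(name);
        if (it == current.end()) {
            out.push_back(MetricReport{name, "MISSING", 0.0});
            continue;
        }
        // A zero baseline cannot yield a relative change; treat it as no change.
        const double pct = base > 0.0 ? (it->second - base) / base * 100.0 : 0.0;
        const bool regressed = pct > thresholdPct(name);
        out.push_back(MetricReport{name, regressed ? "REGRESSION" : "OK", pct});
    }
    return out;
}

// The run fails when any metric regressed or went missing.
bool hasFailure(const std::vector<MetricReport>& reports) {
    for (const auto& r : reports) {
        if (r.status != "OK") return true;
    }
    return false;
}
```

Iterating over the baseline rather than the current results is what makes a crashed benchmark visible: its metrics are still in the baseline, so their absence from the current run becomes a failure rather than a clean pass.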
Adds a reusable /finalize-implementation skill that runs tests, lint, CI review simulation, and sub-agent code/architecture reviews in a loop (max 5 iterations), then commits and creates a draft PR. Co-Authored-By: Tomasz Waszczyk <tomasz@waszczyk.com>
Closes #358
Summary
- `benchmark_cpp.yml` workflow measuring binary size, startup time, loop latency, and memory footprint across Linux and Windows
- Baseline cached on pushes to `main`; every PR run compares current results against the last baseline with per-metric thresholds (10% for binary size, 15% for everything else)

Fixes applied (from CI code review)
- Cache path mismatch: the cached baseline is restored and renamed to `benchmark-baseline.json` before the fresh run overwrites `benchmark-results.json`
- `benchmark_cpp.yml` and `build_cpp.yml` added to both `push` and `pull_request` paths in `build_cpp.yml`
- `<windows.h>` included before `<psapi.h>` in `bench_utils.h`
- `timeout-minutes: 15` added to the benchmark job
- `initCalled_` bool prevents duplicate tool registration in `BenchAgent`
- `compareAndReport` now flags baseline metrics absent from the current run as `MISSING` (catches crashed benchmarks that previously produced a false-positive clean pass)
- `elapsedUs()` returns 0 if `stop()` was never called
- `GAIA_BUILD_BENCHMARKS` moved to the top-level options block alongside the other `GAIA_BUILD_*` options
- `.gitignore`: `cpp/benchmark-*.json` excluded (ephemeral CI artifacts)

Test plan
- `LemonadeClientTest.DefaultConstruction` fails due to the `GAIA_CPP_BASE_URL` env var being set in the local environment
- Lint: `uv run python util/lint.py --all`
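For reviewers, the two small guards listed under "Fixes applied" can be sketched as follows. The class names `Timer` and `BenchAgent` and the members `elapsedUs()`, `stop()`, and `initCalled_` come from this PR; everything else (the `start()` and `init()` methods, the `toolCount()` accessor, the `stopped_` flag, the tool name, and the use of `std::chrono::steady_clock`) is assumed for illustration and may differ from the actual code.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

class Timer {
    using Clock = std::chrono::steady_clock;  // assumed clock choice

public:
    void start() {
        begin_ = Clock::now();
        stopped_ = false;
    }

    void stop() {
        end_ = Clock::now();
        stopped_ = true;
    }

    // Returns 0 when stop() has not been called since the last start(),
    // so an uninitialized or stale end_ never leaks into a result.
    std::int64_t elapsedUs() const {
        if (!stopped_) return 0;
        return std::chrono::duration_cast<std::chrono::microseconds>(end_ - begin_).count();
    }

private:
    Clock::time_point begin_{};
    Clock::time_point end_{};
    bool stopped_ = false;
};

class BenchAgent {
public:
    // Safe to call repeatedly: only the first call registers tools.
    void init() {
        if (initCalled_) return;
        initCalled_ = true;
        tools_.push_back("echo");  // hypothetical tool name
    }

    std::size_t toolCount() const { return tools_.size(); }

private:
    bool initCalled_ = false;
    std::vector<std::string> tools_;
};
```

Both guards turn a misuse (reading a timer that never stopped, initializing an agent twice) into a harmless no-op instead of a skewed measurement.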