fix: Auto-generate leaderboard after benchmark — wire VibeReporter in…#28
Merged
…to save_report() (#22)

## Summary

Fixes #22: the leaderboard markdown was never generated automatically after a benchmark run. Users had to manually run `core/reporter.py` as a separate step, causing results and leaderboard to go out of sync.

## Changes

**`vibebench.py`**

- `from core.reporter import VibeReporter` moved to top-level imports
- `save_report()` now calls `VibeReporter(report_name).generate_markdown()` immediately after saving the JSON file
- Wrapped in `try/except` so a reporter failure never crashes the benchmark; a warning is printed with the manual fallback command instead

## Before / After

**Before:**

```
✅ Benchmark Complete. Report saved: vibebench_multimodel_20260319_1234.json
# (user must manually run: python core/reporter.py)
```

**After:**

```
✅ Benchmark Complete. Report saved: vibebench_multimodel_20260319_1234.json
✅ Professional Leaderboard generated: VibeBench_Leaderboard.md
```

## Notes

This PR is stacked on #24 (schema fixes) and #23 (--verbose flag). All three fixes are included in this `vibebench.py`.

Closes #22
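
## Sketch of the wiring

For reviewers who want to see the shape of the change without opening the diff, here is a minimal sketch of the pattern described under Changes: save the JSON first, then attempt the leaderboard inside `try/except` so a reporter failure only prints a warning. The actual `save_report()` signature and the internals of `VibeReporter` are not shown in this PR description, so `StubReporter`, the `reporter_cls` parameter, the `results` shape, and the boolean return value are all hypothetical stand-ins, not the code in `vibebench.py`.

```python
import json
from pathlib import Path


class StubReporter:
    """Hypothetical stand-in for core.reporter.VibeReporter (real internals unknown)."""

    def __init__(self, report_name):
        self.report_name = Path(report_name)

    def generate_markdown(self):
        # Assumed input shape: a flat {model_name: score} mapping.
        data = json.loads(self.report_name.read_text(encoding="utf-8"))
        rows = sorted(data.items(), key=lambda kv: kv[1], reverse=True)
        lines = ["| Model | Score |", "| --- | --- |"]
        lines += [f"| {name} | {score} |" for name, score in rows]
        out = self.report_name.with_name("VibeBench_Leaderboard.md")
        out.write_text("\n".join(lines) + "\n", encoding="utf-8")
        return str(out)


def save_report(results, report_name, reporter_cls=StubReporter):
    """Save results as JSON, then try to build the leaderboard.

    Returns True if the leaderboard was generated, False if the reporter
    failed. A reporter failure is reported as a warning and never raised.
    """
    with open(report_name, "w", encoding="utf-8") as fh:
        json.dump(results, fh, indent=2)
    print(f"Benchmark Complete. Report saved: {report_name}")

    try:
        leaderboard = reporter_cls(report_name).generate_markdown()
    except Exception as exc:  # deliberately broad: the benchmark must not crash here
        print(f"Warning: leaderboard generation failed ({exc}).")
        print("Run it manually: python core/reporter.py")
        return False

    print(f"Leaderboard generated: {leaderboard}")
    return True
```

The JSON write sits outside the `try` block on purpose: a failure to save results should still surface as an error, while a failure in the optional leaderboard step should not.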