Fix to test case (problem_103) and formatting issues in sample_examples#71
barabbs wants to merge 3 commits into trishullab:main
Pull request overview
This PR addresses two reported issues by correcting a failing/incorrect test case in the Lean human-eval suite and normalizing section-tag formatting in sample example Lean files (used by the repo’s extraction/parsing tooling).
Changes:
- Fix the expected output for the `(185, 546)` test case in `human_eval/problem_103.lean`.
- Add missing `generated_spec_body` section markers in `sample_examples/problem_3.lean` and `sample_examples/problem_5.lean`.
- Remove an extra `by` token in `sample_examples/problem_1.lean` to match the surrounding sectioned-proof style.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/lean4/sample_examples/problem_5.lean | Splits generated_spec vs generated_spec_body with explicit section markers. |
| src/lean4/sample_examples/problem_3.lean | Splits generated_spec vs generated_spec_body with explicit section markers. |
| src/lean4/sample_examples/problem_1.lean | Removes extra by after spec_isomorphism signature to standardize formatting. |
| src/lean4/human_eval/problem_103.lean | Updates one test expected output for the average/binary conversion task. |
      -- #test implementation 964 977 = some "0b1111001010"
      -- #test implementation 996 997 = some "0b1111100100"
    - -- #test implementation 185 546 = some "0b101101110"
    + -- #test implementation 185 546 = some "0b101101101"
The updated expected output matches the current implementation/problem_spec behavior (it uses xs.sum / xs.length, i.e., integer division/floor). However, the problem docstring still says "Round the answer to the nearest integer", which conflicts with the floor-rounding used by all existing tests (e.g., (964, 977) → 970.5 → 970). Please update the docstring to reflect floor rounding (or change the spec/implementation to do nearest-integer rounding and update the tests accordingly).
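The arithmetic behind this comment can be checked directly in Lean. The snippet below is only a sketch: it uses plain `Nat` literals rather than the repository's `implementation` or `problem_spec`, and the `+ 1` round-half-up formula in the last line is an assumption about what "nearest integer" would mean, not something taken from the repo.

```lean
-- Nat division truncates, so the average 365.5 floors to 365.
example : (185 + 546) / 2 = 365 := by decide

-- The new expected string "0b101101101" is 365; the old "0b101101110" is 366.
example : 0b101101101 = 365 := by decide
example : 0b101101110 = 366 := by decide

-- The existing (964, 977) test also follows floor: 970.5 becomes 970.
example : (964 + 977) / 2 = 970 := by decide
example : 0b1111001010 = 970 := by decide

-- Hypothetical round-half-up variant (assumption): it would give 366 instead.
example : (185 + 546 + 1) / 2 = 366 := by decide
```

Under floor division the corrected test value holds, while the old value only matches if the average is rounded up, which is why the docstring and the tests need to agree on one rounding rule.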
Hi @amit9oct,
It's me again :)
Opening this PR to address issues #69 and #70