
fix: scope scoring bug in prompt_score + add 10 unit tests #129

Open
TerminalGravity wants to merge 2 commits into main from fix/prompt-score-tests-and-scope-bug



@TerminalGravity
Collaborator

What

  • Bug fix: scorePrompt gave 25/25 scope to any prompt >100 chars, regardless of whether scope was actually bounded. A rambling vague prompt got perfect scope just for being long.
  • Fix: Long prompts now get 20/25 with feedback suggesting explicit scope bounds (only, just, this).
  • Tests: 10 new unit tests for scorePrompt covering all 4 scoring dimensions, grade boundaries, edge cases.
  • Export: scorePrompt is now exported for direct testing.
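The scope change described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the helper name `scoreScope`, the `ScopeResult` shape, the order of the checks, and the fallback score for short unbounded prompts are all assumptions. Only the 100-character threshold, the 25 and 20 point values, and the bounding words (only, just, this) come from the description.

```typescript
// Hypothetical sketch of the scope dimension; names and structure are assumptions.
interface ScopeResult {
  points: number; // out of 25
  feedback: string | null;
}

// Bounding keywords named in the PR description.
const BOUND_WORDS = /\b(only|just|this)\b/i;

function scoreScope(prompt: string): ScopeResult {
  // Assumed: an explicit bound earns full marks regardless of length.
  if (BOUND_WORDS.test(prompt)) {
    return { points: 25, feedback: null };
  }
  if (prompt.length > 100) {
    // Before the fix, this branch returned 25 for length alone.
    return {
      points: 20,
      feedback: "Add explicit scope bounds (only, just, this).",
    };
  }
  // Assumed placeholder value for short prompts with no bounds.
  return { points: 10, feedback: "State what is in and out of scope." };
}
```

Under these assumptions, a long prompt with no bounding word scores 20 and receives feedback, while a prompt containing one of the bounding words scores 25.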

Tests

All 53 tests pass (10 new + 43 existing).

- Fix: long prompts no longer auto-score 25/25 on scope (was rewarding
  verbosity regardless of whether scope was actually bounded)
- Long prompts now get 20/25 with feedback to add explicit bounds
- Export scorePrompt for testability
- Add 10 tests covering all scoring dimensions, grade boundaries, and
  edge cases (vague prompts, questions, broad vs bounded scope)

Tests cover:
- Specificity scoring (file paths, backtick identifiers, generic mentions)
- Scope scoring (bounded vs broad keywords)
- Actionability scoring (specific vs vague verbs)
- Done condition scoring (outcome words, questions)
- Grade assignment (A+ threshold)
- Feedback messages (praise for perfect prompts)
- Score consistency (total = sum of sub-scores)
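The score-consistency check in the list above might look something like this sketch. The `PromptScore` shape and its field names are hypothetical; the PR only states that there are four scoring dimensions and that the total should equal the sum of the sub-scores.

```typescript
// Hypothetical result shape; field names are assumptions, not taken from the PR.
interface PromptScore {
  specificity: number;
  scope: number;
  actionability: number;
  doneCondition: number;
  total: number;
}

// Returns true when the reported total equals the sum of the four sub-scores.
function isConsistent(score: PromptScore): boolean {
  const sum =
    score.specificity + score.scope + score.actionability + score.doneCondition;
  return score.total === sum;
}

// Illustrative values only, not output from the real scorer.
const sample: PromptScore = {
  specificity: 20,
  scope: 20,
  actionability: 25,
  doneCondition: 15,
  total: 80,
};
console.log(isConsistent(sample)); // true
```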
Collaborator Author

@TerminalGravity TerminalGravity left a comment


Nice catch on the scope scoring: giving 25/25 just for prompt length was wrong. The fix is clean and the test coverage is solid. One thought: it might be worth adding a test for a prompt that is both long and explicitly bounded, to confirm those still get full marks. Not blocking, though.
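The suggested test could be sketched along these lines. The `ScopeScorer` signature and the stand-in scorer are hypothetical, since the real `scorePrompt` return shape is not shown in this PR; the check would need to be adapted to whatever the exported function actually returns.

```typescript
// Hypothetical signature: a scorer that returns scope points out of 25.
type ScopeScorer = (prompt: string) => number;

// Builds a prompt longer than 100 characters that contains an explicit bound ("Only").
function longBoundedPrompt(): string {
  const first =
    "Only rename the helper in `src/utils.ts` and leave every other file untouched. ";
  const second =
    "Keep the public API, the tests, and the documentation exactly as they are today.";
  return first + second;
}

// Throws if a long, explicitly bounded prompt does not get full scope marks.
function checkLongBoundedPrompt(scoreScope: ScopeScorer): void {
  const prompt = longBoundedPrompt();
  if (prompt.length <= 100) {
    throw new Error("prompt must exceed 100 characters");
  }
  const points = scoreScope(prompt);
  if (points !== 25) {
    throw new Error(`expected 25/25 scope, got ${points}`);
  }
}

// Stand-in scorer for illustration only; swap in the real scope score.
const standInScorer: ScopeScorer = (p) =>
  /\b(only|just|this)\b/i.test(p) ? 25 : 20;

checkLongBoundedPrompt(standInScorer);
```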
