fix: strengthen answer analysis in nfr-design and infrastructure-design #155

Open
Kalindi-Dev wants to merge 14 commits into main from fix/aidlc-rules-recommendation

Contributor

@Kalindi-Dev Kalindi-Dev commented Mar 30, 2026

Summary

TEST_PR_DO_NOT_MERGE
The overconfidence prevention fix on main updated the Step 3 question-generation directives in nfr-design.md and infrastructure-design.md but left their Step 5 answer analysis with the original weak language. This PR completes the fix by aligning Step 5 with the mandatory ambiguity detection pattern already used in functional-design.md and nfr-requirements.md.

Changes

Before (both files, Step 5):

- Wait for user to complete all [Answer]: tags
- Review for vague or ambiguous responses
- Add follow-up questions if needed

After (matches functional-design.md and nfr-requirements.md):

- Wait for user to complete all [Answer]: tags
- **MANDATORY**: Carefully review ALL responses for vague or ambiguous answers
- **CRITICAL**: Add follow-up questions for ANY unclear responses - do not proceed with ambiguity
- Look for responses like "depends", "maybe", "not sure", "mix of", "somewhere between"
- Create clarification questions file if ANY ambiguities are detected
- **Do not proceed until ALL ambiguities are resolved**
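The strengthened Step 5 check can be illustrated with a short sketch. This is not part of the rule files or of this PR; it only shows one way the listed phrases could be detected. The function name, the one-line `[Answer]:` layout, and the choice to treat an empty answer as unresolved are all assumptions made for illustration.

```python
import re

# Phrases quoted from the Step 5 rule text. Plain substring matching is an
# assumption; the rules leave the actual judgement to the model.
VAGUE_MARKERS = ("depends", "maybe", "not sure", "mix of", "somewhere between")


def find_ambiguous_answers(text: str) -> list[tuple[int, str]]:
    """Return (line_number, answer) pairs for empty or vague [Answer]: lines."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = re.match(r"\s*\[Answer\]:\s*(.*)", line)
        if match is None:
            continue  # not an answer line
        answer = match.group(1).strip()
        lowered = answer.lower()
        # Hypothetical policy: an unfilled tag counts as unresolved too.
        if not answer or any(marker in lowered for marker in VAGUE_MARKERS):
            flagged.append((number, answer))
    return flagged
```

Under these assumptions, any non-empty result would mean a clarification questions file is needed before the stage can proceed.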

Why this matters

Without this fix, the LLM asks thorough questions (Step 3 is fixed) but then accepts vague answers without pushback (Step 5 is weak). The overconfidence-prevention.md guide explicitly lists "Proceeding with vague or ambiguous user responses" as a red flag and mandates: "Don't proceed until ALL unclear responses are clarified."

Files changed

| File | What changed |
| --- | --- |
| aidlc-rules/aws-aidlc-rule-details/construction/nfr-design.md | Step 5 answer analysis strengthened |
| aidlc-rules/aws-aidlc-rule-details/construction/infrastructure-design.md | Step 5 answer analysis strengthened |

Test plan

  • Verify Step 5 in both files matches the pattern in functional-design.md (lines 59-65) and nfr-requirements.md (lines 46-52)
  • Confirm no other construction stage files have the weak Step 5 pattern
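
The second check could be scripted along these lines. This is a rough sketch, not something included in the PR: the helper names are hypothetical, and it assumes the weak pattern can be recognised by the two "Before" lines quoted above appearing verbatim in a Markdown file.

```python
from pathlib import Path

# The two weak Step 5 lines quoted in the "Before" snippet of this PR.
WEAK_LINES = (
    "Review for vague or ambiguous responses",
    "Add follow-up questions if needed",
)


def has_weak_step5(text: str) -> bool:
    """True if the text still contains either of the old, weak directives."""
    return any(weak in text for weak in WEAK_LINES)


def files_with_weak_step5(root: str) -> list[str]:
    """Return names of .md files under root that still use the weak wording."""
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        if has_weak_step5(path.read_text(encoding="utf-8")):
            hits.append(path.name)
    return hits
```

Calling `files_with_weak_step5("aidlc-rules/aws-aidlc-rule-details/construction")` from the repository root should return an empty list once every construction stage file has been aligned.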

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

scottschreckengaust and others added 7 commits March 30, 2026 14:45
PR builds now require the 'codebuild' label and changes under
aidlc-rules/ to trigger. Push to main, tags, and workflow_dispatch
remain unconditional.

…ment

Uses the `attribution.pr` setting so Claude Code automatically appends
the required contributor statement to all PR descriptions. Adds a
gitignore negation for .claude/settings.json so shared project settings
are committed while other .claude/ files remain ignored.

- Add .claude/settings.json to repo tree diagram
- Update Pipeline 2 mermaid diagram with PR label-gate flow
- Update CodeBuild workflow triggers table and add label gate detail
- Add label-gated CI row to Security Posture table

Three stage files (nfr-design, infrastructure-design, units-generation)
still contained the "Skip entire categories if not applicable" directive
that overconfidence-prevention.md identified as the root cause of
insufficient question-asking. The fix was applied to functional-design
and nfr-requirements but missed these three files, creating contradictory
instructions within the same workflow.

Aligns all three files with the corrected approach: default to asking
questions when there is any ambiguity, evaluate all question categories,
and strengthen answer analysis to catch vague responses.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Kalindi-Dev Kalindi-Dev requested review from a team as code owners March 30, 2026 19:53
@Kalindi-Dev Kalindi-Dev changed the title Fix/aidlc rules recommendation fix: complete overconfidence prevention fix in remaining rule files Mar 30, 2026
The overconfidence prevention fix was independently applied on main
with stronger wording (added MANDATORY paragraph). Accept main's
versions for all three rule files and the admin guide updates
(label-reminder/label-cleanup job docs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Kalindi-Dev Kalindi-Dev changed the title fix: complete overconfidence prevention fix in remaining rule files fix: strengthen answer analysis in nfr-design and infrastructure-design Mar 30, 2026
@scottschreckengaust scottschreckengaust added the codebuild label (A label to signal a request for the "CodeBuild" workflow) Mar 30, 2026
@scottschreckengaust scottschreckengaust added the rules label and removed the codebuild label (A label to signal a request for the "CodeBuild" workflow) Mar 31, 2026
@awslabs awslabs deleted a comment from github-actions bot Apr 2, 2026
Contributor

github-actions bot commented Apr 2, 2026

A. Executive Summary

Latest release: PR #155

High-level snapshot comparing the latest release against the golden baseline (the reference evaluation used as the quality target).

| Metric | What it measures |
| --- | --- |
| Unit tests passed | Number of generated unit tests that pass. Higher means the rules produce broader, more complete test suites. |
| Contract tests | API compliance checks against the OpenAPI spec (passed/total). 88/88 = full compliance. |
| Lint findings | Static analysis warnings in generated code. Lower is better; 0 means clean code. |
| Qualitative score | AI-graded quality of generated documentation on a 0–1 scale (higher is better). |
| Execution time | Wall-clock time for the full evaluation run. Lower means faster generation. |
| Total tokens | Total LLM tokens consumed (input + output). Lower means more cost-efficient. |

| Metric | Golden | Latest (PR #155) | vs Golden | Trend |
| --- | --- | --- | --- | --- |
| Unit tests passed | 180 | 0 | -180 | █▆▆▄▅▅▁ |
| Contract tests | 88/88 | 0/88 | -88 | ███▇██▁ |
| Lint findings | 0 | 1 | | ▁▁▁▁▁▁█ |
| Qualitative score | 0.854 | 0.877 | +0.022 | ▁▆▆▂▆█▄ |
| Execution time | 23.8m | 34.9m | +11.1m | ▁▂▁▂▁▁█ |
| Total tokens | 18.39M | 0 | -18.39M | ▅▇▅▆▆█▁ |

Full trend report available in the workflow artifacts.

Labels

do-not-merge (Halts a pull request from merging)
rules

2 participants