Summary
Group a flat list of DiscoveredItems into DraftFeature clusters. This is the core intelligence of Phase 2. Uses file path proximity, naming conventions, and API path prefixes — no ML or LLM required.
Depends on: #124, and at least one miner (#127–#132)
New file
src/specleft/discovery/grouping.py
from specleft.discovery.models import (
DiscoveredItem, DraftFeature, DraftScenario, ItemKind,
TestFunctionMeta, ApiRouteMeta, GitCommitMeta,
)
def group_items(items: list[DiscoveredItem]) -> list[DraftFeature]: ...
Grouping strategy (applied in priority order)
1. File-path grouping (primary)
Items whose file_path shares a common directory segment form an initial group. Group key = the most specific shared directory name.
tests/auth/test_login.py
tests/auth/test_logout.py → group key: "auth"
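A minimal sketch of the file-path rule, assuming "most specific shared directory" can be approximated by the immediate parent directory of each file. The function name `group_by_directory` and the `"root"` fallback key are hypothetical and not part of the spec; the sketch works on plain path strings rather than DiscoveredItem objects.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_directory(file_paths: list[str]) -> dict[str, list[str]]:
    """Group paths by the name of their immediate parent directory.

    Assumption: the deepest directory containing a file stands in for the
    "most specific shared directory" described in the issue.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for path in file_paths:
        # PurePosixPath("tests/auth/test_login.py").parent.name == "auth"
        parent = PurePosixPath(path).parent.name
        # Top-level files have an empty parent name; "root" is a
        # hypothetical placeholder key for them.
        groups[parent or "root"].append(path)
    return dict(groups)


groups = group_by_directory([
    "tests/auth/test_login.py",
    "tests/auth/test_logout.py",
    "tests/billing/test_invoice.py",
])
```

Here both auth test files end up under the key "auth", and the billing file forms its own single-item group.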
2. API path prefix grouping
Route items (kind=API_ROUTE) — use item.typed_meta() to get ApiRouteMeta and extract path. Group by first path segment.
GET /users/{id}
POST /users → group key: "users"
DELETE /users/{id}
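The route rule could be sketched roughly as follows. This version takes plain (method, path) tuples instead of DiscoveredItem objects, and the helper names `first_path_segment` and `group_routes` are illustrative assumptions, as is the `"root"` key for the bare `/` path.

```python
from collections import defaultdict


def first_path_segment(route_path: str) -> str:
    """Return the first non-empty segment of a route path."""
    segments = [s for s in route_path.split("/") if s]
    # "root" is a hypothetical key for the bare "/" route.
    return segments[0] if segments else "root"


def group_routes(
    routes: list[tuple[str, str]],
) -> dict[str, list[tuple[str, str]]]:
    """Group (method, path) pairs by the first path segment."""
    groups: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for method, path in routes:
        groups[first_path_segment(path)].append((method, path))
    return dict(groups)


route_groups = group_routes([
    ("GET", "/users/{id}"),
    ("POST", "/users"),
    ("DELETE", "/users/{id}"),
    ("POST", "/payments"),
])
```

One caveat with taking only the first segment: routes mounted under a shared prefix such as /api/v1 would all collapse into one group, so a real implementation may need to skip such prefixes. The spec does not say how to handle that case.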
3. Name-prefix grouping (fallback)
Items whose name shares a common prefix after stripping leading markers such as test_, "test ", and "it ".
test_payment_success
test_payment_declined → group key: "payment"
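As a rough illustration of the fallback, the sketch below strips the leading marker and uses the first remaining word as the group key. Treating "common prefix" as "first word" is an assumption, and `name_key` and `group_by_name` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Leading markers to strip; the issue lists these and ends with "etc.",
# so a real implementation may need more.
_TEST_PREFIX = re.compile(r"^(?:test_|test |it )", re.IGNORECASE)


def name_key(name: str) -> str:
    """Return the first word of a name once the test marker is removed."""
    stripped = _TEST_PREFIX.sub("", name)
    tokens = [t for t in re.split(r"[_\s]+", stripped) if t]
    return tokens[0].lower() if tokens else ""


def group_by_name(names: list[str]) -> dict[str, list[str]]:
    """Group names by their first word after marker stripping."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


name_groups = group_by_name([
    "test_payment_success",
    "test_payment_declined",
    "test_refund_issued",
])
```

With this input, the two payment tests share the key "payment" and the refund test gets its own group.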
4. Git history cross-reference
Git items (kind=GIT_COMMIT) — use item.typed_meta() to get GitCommitMeta and extract file_prefixes. Merge into the existing group whose file paths overlap most. Unmatched git items form their own group only if they have >=3 commits pointing to the same prefix.
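One way the merge step might look, assuming overlap is measured as the number of a group's file paths that start with any of the commit's file_prefixes. Both helpers (`best_group_for_commit`, `prefixes_worth_own_group`) are hypothetical, operate on plain strings, and leave open what happens to unmatched commits below the threshold.

```python
from collections import Counter
from typing import Optional


def best_group_for_commit(
    file_prefixes: list[str], groups: dict[str, list[str]]
) -> Optional[str]:
    """Return the key of the group whose file paths overlap most, or None."""
    best_key: Optional[str] = None
    best_overlap = 0
    for key, paths in groups.items():
        # Count the group's paths covered by at least one commit prefix.
        overlap = sum(
            1 for path in paths
            if any(path.startswith(prefix) for prefix in file_prefixes)
        )
        if overlap > best_overlap:
            best_key, best_overlap = key, overlap
    return best_key


def prefixes_worth_own_group(
    unmatched: list[list[str]], min_commits: int = 3
) -> set[str]:
    """Return prefixes referenced by at least min_commits unmatched commits."""
    # Each inner list holds the file_prefixes of one unmatched commit;
    # set() avoids counting a prefix twice for the same commit.
    counts = Counter(p for prefixes in unmatched for p in set(prefixes))
    return {prefix for prefix, n in counts.items() if n >= min_commits}


existing = {
    "auth": ["src/auth/login.py", "src/auth/logout.py"],
    "users": ["src/users/api.py"],
}
matched = best_group_for_commit(["src/auth/"], existing)
standalone = prefixes_worth_own_group(
    [["docs/"], ["docs/"], ["docs/"], ["scripts/"]]
)
```

In this example the commit touching src/auth/ merges into "auth", and only docs/ reaches the three-commit threshold for a standalone group.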
Typed metadata access
The grouping algorithm should use item.typed_meta() for type-safe access to metadata fields. This avoids dict["key"] lookups and lets a static type checker verify field access before the code runs:
# Instead of:
path = item.metadata["path"] # KeyError risk
prefixes = item.metadata["file_prefixes"] # KeyError risk
# Use:
meta = item.typed_meta()
if isinstance(meta, ApiRouteMeta):
path = meta.path # type-safe
elif isinstance(meta, GitCommitMeta):
    prefixes = meta.file_prefixes # type-safe
Group naming
Slugify the group key. Expand common abbreviations before slugifying:
- auth → authentication
- mgmt → management
- cfg / config → configuration
- notif → notifications
- msg → messaging
DraftFeature.name = title-cased expanded label (e.g. "User Authentication").
DraftFeature.feature_id = slugified (e.g. "user-authentication").
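The naming rules might be sketched as below. The abbreviation table comes from the list above; the word-splitting rule (split on any non-alphanumeric run) and the helper names `expand_key`, `feature_name`, and `feature_id` are assumptions for illustration.

```python
import re

# Abbreviations listed in the issue.
ABBREVIATIONS = {
    "auth": "authentication",
    "mgmt": "management",
    "cfg": "configuration",
    "config": "configuration",
    "notif": "notifications",
    "msg": "messaging",
}


def expand_key(group_key: str) -> list[str]:
    """Split a group key into lowercase words and expand abbreviations."""
    words = [w for w in re.split(r"[^a-z0-9]+", group_key.lower()) if w]
    return [ABBREVIATIONS.get(word, word) for word in words]


def feature_name(group_key: str) -> str:
    """Title-cased expanded label, e.g. for DraftFeature.name."""
    return " ".join(word.capitalize() for word in expand_key(group_key))


def feature_id(group_key: str) -> str:
    """Hyphen-joined slug, e.g. for DraftFeature.feature_id."""
    return "-".join(expand_key(group_key))


name = feature_name("user_auth")
slug = feature_id("user_auth")
```

For the key "user_auth" this yields "User Authentication" and "user-authentication". Expansion is applied per whole word, so a word like "author" is left alone.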
Confidence scoring
| Signal | Bonus |
|---|---|
| Items from >=2 different miners | +0.2 |
| Group has >=1 docstring item | +0.1 |
| Group has >=1 git item corroborating | +0.1 |
| Base score | 0.5 |
| Maximum | 1.0 |
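The table can be read as an additive score, sketched here under the assumption that each bonus applies at most once per group. The function name `confidence` and its parameters are hypothetical; a real implementation would derive them from the items in the group.

```python
def confidence(miner_count: int, has_docstring: bool, has_git: bool) -> float:
    """Compute a group's confidence from the signals in the table."""
    score = 0.5  # base score
    if miner_count >= 2:
        score += 0.2  # items from at least two different miners
    if has_docstring:
        score += 0.1  # at least one docstring item
    if has_git:
        score += 0.1  # at least one corroborating git item
    # Round away floating-point noise, then cap at the stated maximum.
    return min(round(score, 2), 1.0)


lowest = confidence(miner_count=1, has_docstring=False, has_git=False)
highest = confidence(miner_count=3, has_docstring=True, has_git=True)
```

With only the three bonuses listed, the highest reachable score is 0.9, so the 1.0 cap only matters if further signals are added later.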
Acceptance criteria
- 10 items from tests/auth/ all land in a single group
- API routes GET /payments/* and POST /payments form a separate group from users
- No items dropped: every DiscoveredItem appears in exactly one DraftFeature
- Git items distributed to nearest matching group via GitCommitMeta.file_prefixes, not siloed
- Grouping uses item.typed_meta() for type-safe metadata access (no raw metadata["key"])
- Minimum group size: 1 item (single-item groups are valid)
- Tests in tests/discovery/test_grouping.py using synthetic DiscoveredItem fixtures
- Update scenarios and tests in features/feature-spec-discovery.md to cover the functionality introduced by this issue