Description
Summary
Parse recent git commit history to infer feature groupings from commit messages and changed file paths. Language-agnostic — uses subprocess + git log, not tree-sitter.
New file
src/specleft/discovery/miners/shared/git_history.py
import uuid

from specleft.discovery.context import MinerContext
from specleft.discovery.models import (
    SupportedLanguage,
    MinerResult,
    DiscoveredItem,
    ItemKind,
    GitCommitMeta,
    MinerErrorKind,
)


class GitHistoryMiner:
    miner_id = uuid.UUID("f1c93075-4e3c-44b8-bef6-9c0bc25b6c42")
    name = "git_history"
    languages = frozenset()  # language-agnostic; always runs

    def mine(self, ctx: MinerContext) -> MinerResult: ...

Git log command
git -C {ctx.root} log --no-merges \
    --format="%H%n%s%n%b%n---END---" \
    --name-only -n {ctx.config.max_git_commits}

Note: MAX_COMMITS is now read from ctx.config.max_git_commits (default: 200, configurable via [tool.specleft.discovery].max_git_commits in pyproject.toml).
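As a rough illustration, the command above could be invoked from Python along these lines. This is a minimal sketch, not the required implementation: the helper names `build_git_log_args` and `run_git_log` are hypothetical, and `root` / `max_commits` stand in for `ctx.root` and `ctx.config.max_git_commits`.

```python
import subprocess
from pathlib import Path

# Format string taken from the command in this issue.
COMMIT_FORMAT = "%H%n%s%n%b%n---END---"


def build_git_log_args(root: Path, max_commits: int) -> list[str]:
    """Build the argv list for the git log call (hypothetical helper)."""
    return [
        "git", "-C", str(root), "log", "--no-merges",
        f"--format={COMMIT_FORMAT}",
        "--name-only",
        "-n", str(max_commits),
    ]


def run_git_log(root: Path, max_commits: int = 200) -> str:
    """Run git log and return its stdout; raises if git fails or is missing."""
    completed = subprocess.run(
        build_git_log_args(root, max_commits),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

Passing an argv list instead of a shell string avoids quoting problems with the `%` placeholders and with paths that contain spaces.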
Parsing
- Split output on ---END--- separator
- Per commit: extract short hash (7 chars), subject, body, list of changed files
- Skip commits whose subject matches conventional commit noise prefixes: chore:, ci:, build:, docs:, style:, test:
- Produce one DiscoveredItem per remaining commit that has >=1 changed source file
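The parsing steps above can be sketched as follows. One detail worth watching: git log prints the --name-only file list after the formatted commit text, so the files that follow each ---END--- marker belong to the commit before the marker. The sketch handles this by treating the first line that looks like a full 40-character hash as the start of the next commit. The names `ParsedCommit`, `parse_git_log`, and `keep_commit` are hypothetical, and "changed source file" is interpreted loosely here as any changed file.

```python
import re
from dataclasses import dataclass, field

SEPARATOR = "---END---"
NOISE_PREFIXES = ("chore:", "ci:", "build:", "docs:", "style:", "test:")
# Assumption: SHA-1 repository, so %H yields 40 lowercase hex characters.
FULL_HASH = re.compile(r"^[0-9a-f]{40}$")


@dataclass
class ParsedCommit:
    short_hash: str
    subject: str
    body: str
    changed_files: list[str] = field(default_factory=list)


def parse_git_log(output: str) -> list[ParsedCommit]:
    commits: list[ParsedCommit] = []
    for chunk in output.split(SEPARATOR):
        lines = chunk.splitlines()
        # Index of the first hash-like line: where the next commit header starts.
        start = next(
            (i for i, line in enumerate(lines) if FULL_HASH.match(line)),
            len(lines),
        )
        # Non-blank lines before that header are files of the previous commit.
        if commits:
            commits[-1].changed_files.extend(
                line for line in lines[:start] if line.strip()
            )
        header = lines[start:]
        if header:
            subject = header[1] if len(header) > 1 else ""
            body = "\n".join(header[2:]).strip()
            commits.append(ParsedCommit(header[0][:7], subject, body))
    return commits


def keep_commit(commit: ParsedCommit) -> bool:
    """Drop noise-prefixed subjects and commits without changed files."""
    is_noise = commit.subject.startswith(NOISE_PREFIXES)
    return not is_noise and bool(commit.changed_files)
```

The literal prefix check does not match scoped subjects such as "chore(deps): ..."; whether those should also be skipped is left open by this issue.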
Typed metadata
Each item's metadata dict must conform to GitCommitMeta:
GitCommitMeta(
    commit_hash = "a7b21db",
    subject = "feat: add login endpoint",
    body = "Implements JWT-based authentication...",
    changed_files = ["src/auth/login.py", "tests/test_login.py"],
    conventional_type = "feat",
    file_prefixes = ["src/auth", "tests"],
)

name: commit subject line
file_path: None (git items span multiple files)
language: None (language-agnostic)
confidence: 0.5 (git history is a weak intent signal)
Note: languages = frozenset() means this miner always runs regardless of detected languages. The pipeline treats empty frozenset as "language-agnostic".
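The two derived fields, conventional_type and file_prefixes, could be computed roughly as below. This is a sketch under assumptions: the issue does not define the derivation rules, so the regex and the "parent directory of each changed file" rule are inferred from the GitCommitMeta example above, and both helper names are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Assumption: type is lowercase letters, optionally followed by "(scope)" and "!".
CONVENTIONAL = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?!?:")


def conventional_type(subject: str) -> Optional[str]:
    """Return the conventional commit type of a subject, or None if absent."""
    match = CONVENTIONAL.match(subject)
    return match.group("type") if match else None


def file_prefixes(changed_files: list[str]) -> list[str]:
    """Return unique parent directories in first-seen order."""
    seen: dict[str, None] = {}
    for path in changed_files:
        # git reports paths with forward slashes, hence PurePosixPath.
        seen.setdefault(str(PurePosixPath(path).parent), None)
    return list(seen)
```

With this rule a file at the repository root yields the prefix ".", which an implementation may prefer to drop or rename.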
Error handling
If git is not on PATH or ctx.root is not a git repository:
MinerResult(
    miner_id=self.miner_id,
    miner_name=self.name,
    items=[],
    error="not a git repository",
    error_kind=MinerErrorKind.NOT_INSTALLED,
    duration_ms=0,
)

Do not raise.
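One way to satisfy the "do not raise" rule is to catch the two failure modes at the subprocess boundary and hand back an error string for the caller to place in MinerResult. The sketch below is illustrative only: `try_git_log` and its `git_executable` parameter are hypothetical, and the log arguments are shortened for brevity.

```python
import subprocess
from typing import Optional


def try_git_log(
    root: str,
    max_commits: int = 200,
    git_executable: str = "git",  # hypothetical knob, handy for tests
) -> tuple[Optional[str], Optional[str]]:
    """Return (stdout, None) on success or (None, error message) on failure."""
    args = [git_executable, "-C", root, "log", "--no-merges", "-n", str(max_commits)]
    try:
        completed = subprocess.run(args, capture_output=True, text=True, check=True)
    except FileNotFoundError:
        # The executable was not found on PATH.
        return None, "not a git repository"
    except subprocess.CalledProcessError:
        # git exited non-zero, e.g. root is not inside a git repository.
        return None, "not a git repository"
    return completed.stdout, None
```

When the second element is not None, mine() would return the MinerResult shown above with items=[] and error_kind=MinerErrorKind.NOT_INSTALLED instead of propagating the exception.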
Acceptance criteria
- Running on the specleft repo returns total_items > 0
- Merge commits excluded (--no-merges)
- Uses ctx.config.max_git_commits, not a hardcoded constant
- Each item's metadata validates against GitCommitMeta
- conventional_type="feat" parsed from "feat: add login endpoint"
- Commits with subject "chore: update lockfile" are skipped
- All items have language=None
- Non-git directory returns MinerResult with error + error_kind=NOT_INSTALLED, no exception
- Tests in tests/discovery/miners/test_git_history.py using a tmp_path git repo fixture
- Update scenarios and tests in features/feature-spec-discovery.md to cover the functionality introduced by this issue