GitHub - wittekin/millstone

Coding agents produce dramatically better results when they plan before they code, and when their output is reviewed by a second agent — ideally from a different model provider. The catch: manually running that cycle (design → review → revise → approve → plan → review → revise → implement → review → revise → commit) across multiple agents is extremely time-consuming.

millstone automates that end to end. It wraps any combination of coding CLIs (Claude Code, Codex, Gemini, OpenCode) in a deterministic build-review loop: one agent authors, a second reviews, feedback cycles until the reviewer approves, then the change is committed. The same loop governs designs, plans, and code — with optional autonomous outer loops that discover opportunities, generate designs, and break them into tasks without human prompting.

Documentation | Getting Started | Meta Invoke | Contributing | Changelog

Quick Start

Before installing millstone, install and authenticate at least one supported coding agent CLI. See Supported Agents.

# 1) Install
pipx install millstone

# 2) Move into the repo you want to run on
cd /path/to/your/project

# 3) Recommended: give your coding agent an operator prompt
# @docs/prompts/execute.md  (run a tasklist)
# @docs/prompts/design.md   (design + plan a new feature)

Common pattern: hand millstone a short list of features and let it design, plan, and implement each one. Write them as a roadmap:

<!-- docs/roadmap.md -->
- [ ] Add a logout button to the header
- [ ] Show toast notifications on form errors
- [ ] Rate-limit the /api/search endpoint

Then run:

# Local roadmap file
millstone --cycle --roadmap docs/roadmap.md

For each goal, millstone designs a solution, breaks it into atomic tasks, implements them through a build-review loop, and commits. The local roadmap path reads goals directly from the file; the remote path runs analysis to discover and select from provider-backed opportunities. Approval gates pause between stages for human review; add --no-approve for fully autonomous operation.

Other starting points:

# One task now (no setup required)
millstone --task "add retry logic to API client"

# Design, plan, and execute one objective end-to-end
millstone --deliver "Add retry logic to API client"

# Full autonomous loop — analyze codebase for improvements, then implement
millstone --cycle

# Existing local backlog file -> migrate once, then execute
millstone --migrate-tasklist backlog.md && millstone

# New app / fresh repo
millstone --init
millstone --deliver "Build a CLI app for release note generation"

millstone reads from .millstone/tasklist.md by default.

Highlights

Deterministic inner loop: Builder -> Sanity -> Reviewer -> Sanity -> Fix -> Commit.
Autonomous outer loops: analyze, design, plan, cycle — every authoring step is write/review gated.
--max-cycles governs both inner build-review iterations and outer-loop authoring loops.
Parallel execution via git worktree — run multiple tasks concurrently with isolated checkouts and a serialized merge queue.
Primary operating mode is coding-agent-invoked execution (docs/prompts/execute.md).
Built-in evaluation flow with result capture and regression comparison.
Multi-provider CLI routing per role (claude, codex, gemini, opencode).
Stateful runs with logs, evals, and recovery under .millstone/.

Usage Patterns

Goal	Command
Coding agent mediated execution (recommended)	Give your coding agent `docs/prompts/execute.md`
Execute next tasks from tasklist	`millstone`
Limit to one task	`millstone -n 1`
Run custom one-off task	`millstone --task "..."`
Migrate an existing local backlog to tasklist format	`millstone --migrate-tasklist backlog.md`
Design, plan, and execute one scoped objective	`millstone --deliver "..."`
Claude Code as author, Codex as reviewer, one task, max of 6 write/review cycles	`millstone --cli claude --cli-reviewer codex -n 1 --max-cycles 6`
Run 4 tasks in parallel (worktree mode)	`millstone --worktrees --concurrency 4`
Dry-run prompt flow without invoking agents	`millstone --dry-run`
Scan codebase for opportunities	`millstone --analyze`
Generate a design doc	`millstone --design "Add caching layer"`
Turn design into atomic tasks	`millstone --plan .millstone/designs/foo.md`
Analyze through planning, stop before execute	`millstone --analyze --through plan`
Design, plan, and execute from text	`millstone --design "Add caching" --through execute`
Plan and execute from existing design	`millstone --plan .millstone/designs/foo.md --through execute`
Execute roadmap goals without analyze	`millstone --cycle --roadmap docs/roadmap.md`
Run autonomous cycle end-to-end	`millstone --cycle`
Resume an interrupted run	`millstone --continue`

How It Works

Inner loop (delivery):

Builder -> Sanity Check -> Reviewer -> Sanity Check -> Fix Loop -> Commit

Outer loop (self-direction) — a composable pipeline of typed stages:

Analyze -> Design -> Plan -> [Inner Loop] -> Eval -> (repeat)

Each stage transforms one artifact type into another (Opportunity → Design → Worklist). --through controls where the pipeline stops: --analyze --through plan runs analysis, design, and planning but skips execution. --cycle resolves which pipeline to build based on pending tasks, roadmap goals, or analysis results.
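As a rough illustration of how a starting flag and --through select a contiguous slice of the pipeline, here is a minimal Python sketch. The stage names mirror the flags above; the function name stages_to_run and the rule that omitting --through runs only the starting stage are assumptions made for illustration, not millstone's actual implementation.

```python
from typing import List, Optional

# Stage order taken from the pipeline description above; "execute" stands in
# for the inner build-review loop.
STAGES = ["analyze", "design", "plan", "execute"]


def stages_to_run(start: str, through: Optional[str] = None) -> List[str]:
    """Return the stages from `start` up to and including `through`.

    Assumption: when `through` is omitted, only the starting stage runs.
    """
    if start not in STAGES:
        raise ValueError(f"unknown stage: {start}")
    through = through or start
    if through not in STAGES:
        raise ValueError(f"unknown stage: {through}")
    first, last = STAGES.index(start), STAGES.index(through)
    if last < first:
        raise ValueError("--through must not precede the starting stage")
    return STAGES[first:last + 1]


# Mirrors `millstone --analyze --through plan`
print(stages_to_run("analyze", "plan"))
```

Under these assumptions, stages_to_run("design", "execute") corresponds to the `--design "..." --through execute` row in the usage table.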

Every authoring step (analyze, design, plan) is write/review gated: a reviewer agent checks the output and requests revisions until it approves or --max-cycles is exhausted.
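The write/review gate can be pictured as a bounded loop. The sketch below is a simplified model, not millstone's code: the Review type, the author and reviewer callables, and the choice to return None when cycles run out are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    approved: bool
    feedback: str = ""


def write_review_loop(
    author: Callable[[Optional[str]], str],
    reviewer: Callable[[str], Review],
    max_cycles: int = 3,
) -> Optional[str]:
    """Alternate author and reviewer until approval or `max_cycles` is spent.

    Returns the approved draft, or None if no draft was approved in time.
    """
    feedback: Optional[str] = None
    for _ in range(max_cycles):
        draft = author(feedback)      # first pass gets no feedback
        review = reviewer(draft)
        if review.approved:
            return draft
        feedback = review.feedback    # feed the critique into the next draft
    return None
```

In this model each cycle is one author pass followed by one review, so max_cycles = 3 allows at most three drafts.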

Installation Options

# PyPI (recommended when release is available)
pipx install millstone

# GitHub latest
pipx install git+https://github.com/wittekin/millstone.git

# Contributor install
pip install -e .

Optional extras:

# Quote the extras so shells such as zsh do not treat the brackets as a glob
pip install -e ".[test]"      # pytest + coverage
pip install -e ".[quality]"   # ruff + mypy
pip install -e ".[security]"  # pip-audit
pip install -e ".[release]"   # build + twine

Minimal Tasklist Format

# Tasklist

- [ ] First task to implement
- [ ] Second task
- [x] Already completed task

millstone executes the first unchecked - [ ] task.
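To make the selection rule concrete, here is a small Python sketch that finds the first unchecked item in a tasklist of this shape. The regular expression and the function name next_unchecked_task are illustrative assumptions; millstone's own parser may accept more variations.

```python
import re
from typing import Optional

# Matches "- [ ] text" (unchecked) and "- [x] text" (checked) list items.
TASK_LINE = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def next_unchecked_task(tasklist_text: str) -> Optional[str]:
    """Return the text of the first unchecked task, or None if all are done."""
    for line in tasklist_text.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group(1) == " ":
            return match.group(2).strip()
    return None


sample = """# Tasklist

- [x] Already completed task
- [ ] First task to implement
- [ ] Second task
"""
print(next_unchecked_task(sample))
```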

Configuration Snapshot

Create .millstone/config.toml in the target repo:

max_cycles = 3
max_tasks = 5
tasklist = ".millstone/tasklist.md"

cli = "claude"
cli_builder = "codex"
cli_reviewer = "claude"

eval_on_commit = false
approve_opportunities = true
approve_designs = true
approve_plans = true

Multi-maintainer setup

By default, artifact files (tasklist, designs, opportunities) are written under .millstone/ and are gitignored — suitable for single-maintainer or local-only workflows.

To commit artifacts to the repo and share them with teammates, opt in per artifact type:

commit_tasklist = true       # stores at docs/tasklist.md
commit_designs = true        # stores at designs/
commit_opportunities = true  # stores at opportunities.md

For full multi-maintainer collaboration, use an external artifact provider (Jira, Linear, or GitHub Issues) instead of file-backed defaults.

Tasklist filter contract

All tasklist providers (Jira, Linear, GitHub Issues) respect a provider-agnostic [tasklist_filter] section in .millstone/config.toml:

[tasklist_filter]
labels    = ["sprint-1"]        # AND – task must carry ALL listed labels
assignees = ["alice", "bob"]    # OR  – task assigned to ANY of these users
statuses  = ["Todo", "In Progress"]  # OR  – task in ANY of these statuses

Omit any key (or leave the list empty) to skip filtering on that dimension. The filter is applied when the outer loop fetches the next task from the remote provider. An explicit filter key inside [tasklist_provider_options] takes precedence over this section.
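The AND/OR semantics described above can be sketched in a few lines of Python. The Task type and the matches function are hypothetical stand-ins for whatever a provider returns; only the matching rules (all labels, any assignee, any status, empty list skips the dimension) come from the text.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class Task:
    title: str
    labels: Set[str] = field(default_factory=set)
    assignee: Optional[str] = None
    status: Optional[str] = None


def matches(
    task: Task,
    labels: Iterable[str] = (),
    assignees: Iterable[str] = (),
    statuses: Iterable[str] = (),
) -> bool:
    """Apply the filter; an empty dimension is skipped."""
    labels, assignees, statuses = set(labels), set(assignees), set(statuses)
    if labels and not labels <= task.labels:          # AND: must carry all
        return False
    if assignees and task.assignee not in assignees:  # OR: any listed user
        return False
    if statuses and task.status not in statuses:      # OR: any listed status
        return False
    return True
```

For example, a task labeled sprint-1 and bug, assigned to alice with status Todo, passes the filter shown in the config above, while adding a second required label it lacks would exclude it.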

Scoping remote backlogs

When using a remote tasklist provider (Jira, Linear, or GitHub Issues), the default scope is the full open-issue set for the configured project/team/repo. Use [millstone.tasklist_filter] to restrict millstone to a specific subset without modifying provider options.

When to use local tasklist vs remote filters

Situation	Recommendation
Personal project or solo maintainer	Local `.millstone/tasklist.md`
Team with shared backlog in Jira/Linear/GitHub	Remote provider + `[millstone.tasklist_filter]`
Ad-hoc spike or one-off work	`millstone --task "..."`
Sprint-scoped automation on a shared board	Remote provider + label/cycle/milestone filter

Quick examples by backend

Jira — current sprint label:

[tasklist_provider_options]
type = "jira"
project = "PROJ"

[millstone.tasklist_filter]
label = "sprint-1"
assignee = "john.doe"

Linear — active cycle for a team:

[tasklist_provider_options]
type = "linear"
team_id = "<uuid>"

[millstone.tasklist_filter]
cycles = ["Cycle 5"]
label  = "millstone"

GitHub Issues — label + milestone:

[tasklist_provider_options]
type  = "github"
owner = "myorg"
repo  = "myrepo"

[millstone.tasklist_filter]
label     = "sprint-1"
milestone = "v1.2"

See full filter option reference in the per-backend docs under docs/providers/.

See full config and CLI options with:

millstone --help

Project Signals

Canonical loop ontology: docs/architecture/ontology.md
Scope and safety boundaries: docs/architecture/scope.md
Parallel execution with worktrees: docs/worktrees.md
CLI providers: docs/cli-providers/
Artifact providers: docs/providers/
Release checklist: docs/maintainer/release_checklist.md

Build and Release Workflows

This repository ships with CI, quality, docs, release, security, CodeQL, dependency review, and weekly maintenance workflows in .github/workflows/.

Tag release flow:

git tag -a vX.Y.Z -m "Release vX.Y.Z"
git push origin vX.Y.Z

Star History

Planned after initial public release and first community adoption.

Working Directory

millstone creates .millstone/ in your repo, containing:

runs/ - Timestamped logs of each run
evals/ - JSON eval results for comparison
cycles/ - Logs of autonomous cycle decisions
state.json - Saved state for --continue (inner-loop halts and outer-loop stage checkpoints)
config.toml - Per-repo configuration
STOP.md - Created by a sanity check to halt the run

This directory is auto-added to .gitignore.

Safety Checks

Mechanical:

No changes detected -> Warn (proceeds to review)
Too many lines changed -> Halt for human review
Sensitive files (.env, credentials) -> Halt for human review
New test failures (with --eval-on-commit) -> Halt

Judgment (via LLM):

Builder output is gibberish -> Create STOP.md -> Halt
Reviewer feedback is nonsensical -> Create STOP.md -> Halt
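The mechanical checks listed above amount to a simple classification of a change set. The following Python sketch shows one way that decision could look; the threshold, the filename patterns, and the function name mechanical_check are hypothetical values chosen for illustration, and the test-failure check is left out because it depends on eval results.

```python
from fnmatch import fnmatchcase
from typing import Dict

# Hypothetical values; millstone's real limits and patterns may differ.
MAX_LINES_CHANGED = 500
SENSITIVE_PATTERNS = (".env", "*.env", "*credentials*")


def mechanical_check(changed_files: Dict[str, int]) -> str:
    """Classify a change set as "ok", "warn", or "halt".

    `changed_files` maps each changed path to its number of changed lines.
    """
    if not changed_files:
        return "warn"  # nothing changed: warn, but still proceed to review
    for path in changed_files:
        name = path.rsplit("/", 1)[-1]
        if any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS):
            return "halt"  # sensitive file touched: needs human review
    if sum(changed_files.values()) > MAX_LINES_CHANGED:
        return "halt"  # change is too large: needs human review
    return "ok"
```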

Exit Codes

0 - Success
1 - Halted (needs human intervention)
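When millstone is driven from a script, the two exit codes are enough to decide whether a human needs to step in. The wrapper below is a minimal sketch; describe_exit and run_millstone are hypothetical helper names, and the handling of codes other than 0 and 1 is an assumption, since the source documents only those two.

```python
import subprocess
from typing import List


def describe_exit(code: int) -> str:
    """Translate a millstone exit code into a short status string."""
    if code == 0:
        return "success"
    if code == 1:
        return "halted: needs human intervention"
    return f"unexpected exit code {code}"  # not documented in the README


def run_millstone(args: List[str]) -> str:
    """Run millstone with the given arguments and describe the outcome."""
    result = subprocess.run(["millstone", *args])
    return describe_exit(result.returncode)
```

For example, run_millstone(["-n", "1"]) would execute a single task and report whether the run finished or halted.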

Expected Runtime

Depending on cycles, tasks, and your agent provider / model, millstone can run for minutes or hours.

Requirements

Python 3.10+
claude CLI installed and authenticated (default), or
codex CLI installed and authenticated (if using --cli codex), or
gemini CLI installed and authenticated (if using --cli gemini), or
opencode CLI installed and authenticated (if using --cli opencode)

Open Source Project Files

License: LICENSE
Contributing guide: CONTRIBUTING.md
Code of conduct: CODE_OF_CONDUCT.md
Security policy: SECURITY.md
Changelog: CHANGELOG.md
