A production-ready, multi-agent AI coding system that autonomously plans, implements, tests, and commits code changes with intelligent context awareness, human approval gates, and full checkpoint/resume capabilities.
- Intelligent Intent Routing - Distinguishes between coding tasks, status queries, and research requests
- Repository Context Awareness - Analyzes tech stack, test frameworks, and code conventions before coding
- Human Approval Gates - Requires approval before commits and risky operations
- Checkpoint & Resume - Never lose progress; resume from any interruption
- Safe Code Search - Search repository patterns without shell execution
- Context-Aware Agents - Planner and coders know your project structure and conventions
┌─────────────────────────────────────────────────────────────┐
│                       USER INTERFACES                       │
│                   Telegram Bot · Web Chat                   │
└────────────────────────┬────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────┐
│                  INTELLIGENT ORCHESTRATOR                   │
│                                                             │
│  ┌──────────┐   ┌───────────────┐   ┌──────────┐            │
│  │  Router  │ → │ Context Loader│ → │ Planner  │            │
│  │ (Intent) │   │(Repo Analysis)│   │(GPT-4o-m)│            │
│  └──────────┘   └───────────────┘   └──────────┘            │
│     ├─ status   → Status Node                               │
│     ├─ research → Research Node (read-only)                 │
│     ├─ resume   → Resume from Checkpoint                    │
│     └─ code     → (workflow below)                          │
│                                                             │
│  ┌──────────┐  ┌─────────┐  ┌───────────┐  ┌────────────┐   │
│  │ Planner  │─→│Coder 1  │─→│Peer Review│─→│  Planner   │   │
│  │(GPT-4o-m)│  │(configu-│  │by Coder 2 │  │   Review   │   │
│  │          │  │rable)   │  │           │  │ (final ok) │   │
│  └──────────┘  └─────────┘  └───────────┘  └─────┬──────┘   │
│       │                                          │          │
│       │        ┌─────────┐  ┌───────────┐        │          │
│       │        │Coder 2  │─→│Peer Review│────────┘          │
│       │        │(configu-│  │by Coder 1 │                   │
│       │        │rable)   │  │           │                   │
│       │        └─────────┘  └───────────┘                   │
│       │                                                     │
│  ┌────▼─────┐  ┌─────────────┐  ┌──────────────────────┐    │
│  │  Tester  │─→│ Human Gate  │─→│ Commit & Checkpoint  │    │
│  │ Tools+LLM│  │ (Approval)  │  │                      │    │
│  └──────────┘  └─────────────┘  └──────────────────────┘    │
└────────────────────────┬────────────────────────────────────┘
                         │
┌────────────────────────▼────────────────────────────────────┐
│                         SAFE TOOLS                          │
│ Filesystem (sandboxed) · Shell (blocklist) · Git (curated)  │
│ Search (no shell) · Build/Test · Context Analysis           │
└─────────────────────────────────────────────────────────────┘
Daedalus now understands different types of requests:
- Code Tasks - Full workflow with planning, implementation, testing, and approval
- Status Queries - Quick answers without modifying files ("What's the status?")
- Research Requests - Read-only analysis ("Find all uses of X in the codebase")
- Resume Commands - Continue from last checkpoint after interruption
The router automatically classifies your intent and routes to the appropriate handler.
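As a rough illustration of the routing idea, the sketch below classifies a task string by keyword. It is a minimal sketch, not Daedalus's router: the function name `classify_intent` and the keyword table are assumptions, and the real router may rely on an LLM rather than keyword matching.

```python
import re

# Illustrative keyword table (an assumption, not the project's actual rules).
# Order matters: the first intent with a matching keyword wins.
INTENT_KEYWORDS = {
    "resume": ("continue", "resume"),
    "status": ("status",),
    "research": ("find", "search", "analyze"),
}


def classify_intent(task: str) -> str:
    """Return 'resume', 'status', 'research', or 'code' for a task string."""
    words = set(re.findall(r"[a-z']+", task.lower()))
    for intent, keywords in INTENT_KEYWORDS.items():
        if words.intersection(keywords):
            return intent
    # Anything unrecognized falls through to the full coding workflow.
    return "code"


if __name__ == "__main__":
    for example in ("What's the status?", "Find all uses of X", "continue"):
        print(example, "->", classify_intent(example))
```

Defaulting to `code` keeps unknown requests on the path with the most safeguards (planning, testing, human approval).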
Before planning any code changes, Daedalus analyzes your repository:
- Tech Stack Detection - Language, framework, package manager
- Test Framework Discovery - pytest, jest, unittest, etc. with correct commands
- Code Conventions - Linting tools, formatting rules, line length limits
- CI/CD Configuration - GitHub Actions, GitLab CI, Jenkins awareness
- Project Structure - Entry points, architecture patterns, dependencies
This context is injected into all agent prompts, ensuring they use the correct tools and follow your conventions.
Never auto-commit without review. Daedalus pauses before:
- Every commit - Always requires approval
- Large changes - Diffs over 400 lines
- File deletions - Any file removal operation
- CI/CD changes - Modifications to workflow configs
You receive a diff preview and can approve or reject before any changes are committed.
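The trigger rules listed above could be evaluated along these lines. This is a hedged sketch: `ChangeSet`, `approval_triggers`, the trigger labels, and the CI path markers are hypothetical names chosen for illustration; only the 400-line threshold and the four trigger kinds come from the list above.

```python
from dataclasses import dataclass, field
from typing import List

LARGE_DIFF_LINES = 400  # threshold quoted in the list above

# Assumed CI/CD locations; a real check would likely cover more systems.
CI_MARKERS = (".github/workflows/", ".gitlab-ci.yml", "Jenkinsfile")


@dataclass
class ChangeSet:
    """Hypothetical summary of a pending change."""
    lines_changed: int
    deleted_files: List[str] = field(default_factory=list)
    touched_files: List[str] = field(default_factory=list)


def approval_triggers(change: ChangeSet) -> List[str]:
    """Return the reasons this change must wait for a human."""
    triggers = ["commit"]  # every commit requires approval
    if change.lines_changed > LARGE_DIFF_LINES:
        triggers.append("large_diff")
    if change.deleted_files:
        triggers.append("file_deletion")
    if any(marker in path for path in change.touched_files for marker in CI_MARKERS):
        triggers.append("ci_change")
    return triggers
```

Because `commit` is always present, the returned list is never empty, which matches the rule that nothing is committed without review.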
Full crash recovery and task resumption:
- Checkpoints saved after planning, coding, testing, and commits
- Resume from last checkpoint with a simple "continue" command
- Handles interruptions, crashes, and manual stops gracefully
- State stored in the `.daedalus/checkpoints/` directory
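A checkpoint store of this kind can be sketched in a few lines. The helper names `save_checkpoint` and `load_checkpoint` and the shape of the state dictionary are assumptions; only the `.daedalus/checkpoints/latest.json` location comes from this README. The write goes through a temporary file so that a crash mid-write cannot leave a truncated checkpoint.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_checkpoint(root: Path, state: dict) -> Path:
    """Atomically write `state` to <root>/.daedalus/checkpoints/latest.json."""
    ckpt_dir = root / ".daedalus" / "checkpoints"
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    target = ckpt_dir / "latest.json"
    # Write to a temp file in the same directory, then swap it into place.
    fd, tmp_name = tempfile.mkstemp(dir=ckpt_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    os.replace(tmp_name, target)
    return target


def load_checkpoint(root: Path) -> Optional[dict]:
    """Return the latest saved state, or None if no checkpoint exists."""
    target = root / ".daedalus" / "checkpoints" / "latest.json"
    if not target.exists():
        return None
    return json.loads(target.read_text(encoding="utf-8"))
```

`os.replace` is atomic when source and destination are on the same filesystem, which is why the temporary file is created inside the checkpoint directory.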
Search your codebase without shell commands:
# Agents can use: search_in_repo("pattern", "*.py", max_hits=50)
# Returns: path:line: matched_text
- Pure Python implementation (no shell execution)
- Skips .git, node_modules, build artifacts automatically
- Configurable file patterns and result limits
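For readers curious how a shell-free search might work, here is a minimal sketch built on `os.walk` and `re`. It is not the project's `app/tools/search.py`: the `root` parameter, the exact skip list, and the treatment of the pattern as a regular expression are assumptions.

```python
import fnmatch
import os
import re

# Assumed skip list; the README names .git, node_modules, and build artifacts.
SKIP_DIRS = {".git", "node_modules", "build", "dist", "__pycache__"}


def search_in_repo(pattern, file_glob="*", max_hits=50, root="."):
    """Return up to `max_hits` strings of the form 'path:line: matched_text'."""
    regex = re.compile(pattern)
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(fnmatch.filter(filenames, file_glob)):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if regex.search(line):
                            rel = os.path.relpath(path, root)
                            hits.append(f"{rel}:{lineno}: {line.strip()}")
                            if len(hits) >= max_hits:
                                return hits
            except OSError:
                continue  # unreadable file: skip it rather than fail the search
    return hits
```

Since no subprocess is spawned, a malicious pattern cannot inject shell commands; the worst case is a slow regular expression.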
The system uses two coders that alternate and cross-review each other:
- Even-numbered items (0, 2, 4, …): Coder 1 implements → Coder 2 reviews
- Odd-numbered items (1, 3, 5, …): Coder 2 implements → Coder 1 reviews
Both coders are fully configurable via `.env`; they can run on OpenAI, Anthropic, or local Ollama models independently.
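The alternation rule reduces to a parity check on the zero-based item index. The function name `assign_roles` and the role labels below are hypothetical, used only to make the rule concrete.

```python
from typing import Tuple


def assign_roles(item_index: int) -> Tuple[str, str]:
    """Return (implementer, reviewer) for a zero-based TODO item index."""
    if item_index % 2 == 0:
        return ("coder_1", "coder_2")  # even items: Coder 1 implements
    return ("coder_2", "coder_1")      # odd items: Coder 2 implements


if __name__ == "__main__":
    for index in range(4):
        implementer, reviewer = assign_roles(index)
        print(f"item {index}: {implementer} implements, {reviewer} reviews")
```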
Router (classify intent)
  → Context Loader (analyze repo)
  → Planner (create plan with context)
  → Coder (1 or 2, alternating)
  → Peer Review (by the OTHER coder)
  → Planner Final Review
  → Tester
  → Human Gate (approval required)
  → Commit & Checkpoint
  → next item (alternate coder) or DONE
On REWORK (peer review or planner review): → back to the original coder
On TEST FAIL: → back to the original coder
On unexpected error: → STOP and re-plan
| Role | Model | Responsibility |
|---|---|---|
| Router | GPT-4o-mini | Classifies intent (code, status, research, resume) |
| Context Loader | Filesystem + LLM | Analyzes repository structure and conventions |
| Planner | GPT-4o-mini | Understands goals, creates context-aware plans, final review gate |
| Coder 1 | Configurable (see .env) | Implements even-numbered items, peer-reviews Coder 2's work |
| Coder 2 | Configurable (see .env) | Implements odd-numbered items, peer-reviews Coder 1's work |
| Tester | GPT-4o-mini + tools | Runs tests/linters/builds with detected commands |
| Human Gate | Interactive | Approval checkpoint before commits |
git clone https://github.com/simonabler/Daedalus.git daedalus
cd daedalus
pip install -e .
Or with uv:
uv pip install -e .
cp .env.example .env
# Edit .env with your API keys and settings
Required settings:
- `TARGET_REPO_PATH`: path to the Git repository the agents will work on
- API keys only for the providers you use: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`
- `CODER_1_MODEL` / `CODER_2_MODEL`: model strings determine the provider automatically
Model string format:
- OpenAI: `gpt-4o`, `gpt-4o-mini`, …
- Anthropic: `claude-opus-4-5`, `claude-sonnet-4-20250514`, …
- Ollama (local): `ollama:llama3.1:70b`, `ollama:deepseek-coder-v2`, …
python -m app.main
This starts:
- Web UI at http://127.0.0.1:8420 (configurable via `WEB_HOST` / `WEB_PORT`)
- Telegram bot (if `TELEGRAM_BOT_TOKEN` is set)
- Background task processor
Open http://127.0.0.1:8420 and type a task in the chat.
/task Add user authentication with JWT tokens
/status
/logs
/stop
Code Task (Full Workflow):
curl -X POST http://127.0.0.1:8420/api/task \
  -H "Content-Type: application/json" \
  -d '{"task": "Add health check endpoint to /api/health"}'
Status Query (Quick Answer):
curl -X POST http://127.0.0.1:8420/api/task \
  -H "Content-Type: application/json" \
  -d '{"task": "What is the current status?"}'
Research Request (Read-Only):
curl -X POST http://127.0.0.1:8420/api/task \
  -H "Content-Type: application/json" \
  -d '{"task": "Find all imports of GraphState in the codebase"}'
Resume After Interruption:
curl -X POST http://127.0.0.1:8420/api/task \
  -H "Content-Type: application/json" \
  -d '{"task": "continue"}'
When the workflow pauses for approval:
# Check pending approval
curl http://127.0.0.1:8420/api/status
# Approve and continue
curl -X POST http://127.0.0.1:8420/api/approve \
  -H "Content-Type: application/json" \
  -d '{"approved": true}'
# Reject and stop
curl -X POST http://127.0.0.1:8420/api/approve \
  -H "Content-Type: application/json" \
  -d '{"approved": false}'
daedalus/
├── AGENT.md                    # Instructions for AI agents working on this repo
├── app/
│   ├── core/                   # Domain logic
│   │   ├── config.py           # Settings (pydantic-settings, .env)
│   │   ├── logging.py          # Centralized logging
│   │   ├── state.py            # GraphState, TodoItem, enums
│   │   ├── nodes.py            # LangGraph node implementations
│   │   ├── orchestrator.py     # Graph builder and runner
│   │   ├── repo_context.py     # Repository context data models
│   │   ├── checkpoints.py      # Checkpoint save/load management
│   │   ├── memory.py           # Shared agent memory system
│   │   └── task_routing.py     # Intent classification helpers
│   ├── agents/                 # Agent definitions
│   │   ├── models.py           # LLM factory (role → provider)
│   │   ├── analyzer.py         # CodebaseAnalyzer for context
│   │   └── prompts/            # System prompts per role
│   ├── tools/                  # Safe LangChain tools
│   │   ├── filesystem.py       # Sandboxed file I/O
│   │   ├── shell.py            # Blocklist-protected shell
│   │   ├── git.py              # Allow/block git operations
│   │   ├── search.py           # Safe repository search
│   │   └── build.py            # Project-aware test/lint/build
│   ├── web/                    # FastAPI web server
│   │   ├── server.py           # REST + WebSocket endpoints
│   │   └── static/             # Web UI (HTML/CSS/JS)
│   ├── telegram/               # Telegram bot
│   │   └── bot.py
│   └── main.py                 # Entry point
├── tasks/
│   ├── todo.md                 # Active plan + progress tracking
│   └── lessons.md              # Learned rules from mistakes
├── memory/                     # Shared agent memory
│   ├── architecture-decisions.md
│   ├── coding-style.md
│   └── shared-insights.md
├── docs/
│   └── definition-of-done.md
├── tests/                      # Pytest test suite
│   ├── test_router.py          # Intent routing tests
│   ├── test_context_loader.py
│   ├── test_search.py
│   ├── test_human_gate.py
│   ├── test_checkpoints.py
│   └── ...
├── logs/                       # Rotating log files
├── .env.example
├── pyproject.toml
├── CHANGELOG.md
└── README.md
Filesystem:
- All file operations sandboxed to `TARGET_REPO_PATH`
- Path traversal (`../`) blocked and validated
- Absolute paths rejected
Shell:
- Commands execute only inside repo root
- Dangerous commands blocklisted (`rm -rf /`, `sudo`, `shutdown`, `mkfs`, pipe-to-sh, etc.)
- All executions logged with command, cwd, exit code, output
- Output truncated to prevent token overflow
- Configurable timeout (`SHELL_TIMEOUT_SECONDS`)
Git:
- Allowed: `status`, `diff`, `add`, `commit`, `checkout`, `push`, `pull`, `fetch`, `log`, `branch`, `show`, `stash`, `tag`
- Blocked: `merge`, `rebase`, `reset --hard`, `clean -fd`, `push --force`
- Human approval required before all commits
- Feature branches only; merge is forbidden (humans merge)
Iteration limit:
- Max iterations per TODO item: configurable (`MAX_ITERATIONS_PER_ITEM`, default 5)
- Exceeding the limit → workflow stops and asks user for input
Human gate:
- Always triggers: Before every commit
- Also triggers: Large diffs (>400 lines), file deletions, CI/CD config changes
- Provides diff preview, file list, and change summary
- User can approve or reject via API
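Two of the safeguards above, path sandboxing and the shell blocklist, can be sketched as follows. This is an illustrative approximation, not the code in `app/tools/`: the names `resolve_in_sandbox`, `is_command_allowed`, and `BLOCKED_FRAGMENTS` are hypothetical, and plain substring matching is a deliberately crude stand-in for whatever matching the project actually uses.

```python
from pathlib import Path


def resolve_in_sandbox(repo_root: str, user_path: str) -> Path:
    """Resolve `user_path` under `repo_root`, rejecting anything that escapes it."""
    candidate = Path(user_path)
    if candidate.is_absolute():
        raise ValueError("absolute paths are rejected")
    root = Path(repo_root).resolve()
    # Resolving collapses '..' segments and symlinks before the containment check.
    resolved = (root / candidate).resolve()
    if not resolved.is_relative_to(root):  # requires Python 3.9+
        raise ValueError("path escapes the repository root")
    return resolved


# Assumed fragments, modeled on the blocklist examples in the list above.
BLOCKED_FRAGMENTS = ("rm -rf /", "sudo", "shutdown", "mkfs", "| sh", "| bash")


def is_command_allowed(command: str) -> bool:
    """Return False if the command contains any blocklisted fragment."""
    normalized = " ".join(command.split())  # collapse runs of whitespace
    return not any(fragment in normalized for fragment in BLOCKED_FRAGMENTS)
```

Substring matching errs on the side of over-blocking (for example, `rm -rf /tmp/x` is also refused). A blocklist remains easier to bypass than an allowlist, which is one reason the human gate exists as a second line of defense.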
# Submit task
POST /api/task
{"task": "Add rate limiting to API endpoints"}
What happens:
- Router classifies as "code" intent
- Context Loader analyzes repository:
  - Detects FastAPI framework
  - Finds pytest as test framework
  - Extracts ruff + black for linting/formatting
  - Notes GitHub Actions CI/CD
- Planner creates plan knowing:
  - Use FastAPI middleware patterns
  - Test with pytest
  - Follow black formatting (88 char lines)
  - Don't break GitHub Actions
- Coder implements using detected conventions
- Tester runs `pytest` (correct command from context!)
- Human Gate pauses with diff preview
- You approve → Commits with conventional commit message
- Checkpoint saved for future resume
POST /api/task
{"task": "What's the current status?"}
What happens:
- Router classifies as "status" intent
- Status Node returns immediate answer:
  - Current phase
  - Todo items (total, done, in progress)
  - Last commit
- No files modified
POST /api/task
{"task": "Find all uses of 'search_in_repo' in the codebase"}
What happens:
- Router classifies as "research" intent
- Research Node uses the `search_in_repo` tool:
  - Searches Python files
  - Returns matches with line numbers
- No files modified
- Read-only operation
# Daedalus crashes while implementing item 3 of 5
# ... restart Daedalus ...
POST /api/task
{"task": "continue"}
What happens:
- Router classifies as "resume" intent
- Resume Node loads `.daedalus/checkpoints/latest.json`:
  - Restores plan (5 items)
  - Restores current position (item 3)
  - Restores repo context
- Workflow continues from item 3
- No progress lost
- Agent creates a feature branch: `feature/<date>-<slug>`
- Works through TODO items one at a time
- Each completed item:
  - Tests pass
  - Human approval obtained
  - Conventional Commit created
  - Checkpoint saved
  - Branch pushed
- When all items done → "Ready for PR/Merge" status
- Human creates PR and merges
Example `tasks/todo.md`:
## Plan: Add Authentication
- [x] Item 1: Set up JWT library
  - AC: JWT encode/decode works
  - Verify: `pytest tests/test_auth.py`
- [ ] Item 2: Add login endpoint
  - AC: POST /login returns token
  - Verify: `pytest tests/test_api.py`
Example `tasks/lessons.md`:
### Rule 1: Always check for existing tests
- Date: 2026-02-21
- Mistake: Overwrote existing test file
- Rule: Read existing tests before writing new ones
- Enforcement: Coder must list test files before editing
Checkpoint directory layout:
.daedalus/
└── checkpoints/
    ├── latest.json                # Most recent state
    ├── plan_complete_abc123.json  # After planning
    ├── code_complete_def456.json  # After coding
    └── test_pass_ghi789.json      # After tests pass
| Method | Path | Description |
|---|---|---|
| POST | /api/task | Submit a new task (code, status, research, resume) |
| POST | /api/approve | Approve or reject pending commit |
| GET | /api/status | Current workflow status + pending approvals |
| GET | /api/logs | Recent log entries |
| WS | /ws | Real-time status + log stream |
| GET | / | Web UI |
{
  "task": "Add health check endpoint"
}
Intent Classification:
- Contains "add", "fix", "implement" → code (full workflow)
- Contains "status", "what's", "show" → status (quick answer)
- Contains "find", "search", "analyze" → research (read-only)
- Contains "continue", "resume" → resume (load checkpoint)
{
  "approved": true
}
{
"phase": "waiting_for_approval",
"needs_human_approval": true,
"pending_approval": {
"type": "commit",
"summary": "3 files changed, 45 insertions(+), 12 deletions(-)",
"files": ["app/core/nodes.py", "tests/test_nodes.py", "README.md"],
"diff_preview": "diff --git a/app/core/nodes.py ...",
"triggers": [
{"type": "commit", "reason": "Commit requires approval"},
{"type": "large_diff", "reason": "Large diff: 57 lines changed"}
]
},
"todo_items": [...],
"current_item_index": 2
}
# All tests
pytest
# Verbose output
pytest -v
# Specific test file
pytest tests/test_router.py
# With coverage
pytest --cov=app tests/
# New feature tests
pytest tests/test_router.py tests/test_context_loader.py tests/test_search.py tests/test_human_gate.py tests/test_checkpoints.py
See `.env.example` for all available settings. Key options:
| Variable | Default | Description |
|---|---|---|
| `OPENAI_API_KEY` | (none) | OpenAI API key (required if any model uses OpenAI) |
| `ANTHROPIC_API_KEY` | (none) | Anthropic API key (required if any model uses Anthropic) |
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama server URL (required if any model uses Ollama) |
| `TARGET_REPO_PATH` | (optional) | Static path to target Git repo; overrides WorkspaceManager |
| `DAEDALUS_WORKSPACE_DIR` | `~/daedalus-workspace` | Directory where repos are cloned on-demand |
| `REPOS_YAML_PATH` | `repos.yaml` | Path to repo registry file (see Repo Registry) |
| `GITLAB_URL` | `https://gitlab.com` | Base URL of your GitLab instance |
| `GITLAB_TOKEN` | (none) | GitLab personal access token |
| `GITHUB_TOKEN` | (none) | GitHub personal access token |
| `CODER_1_MODEL` | `gpt-4o-mini` | Model for Coder 1; prefix `ollama:` for local models |
| `CODER_2_MODEL` | `gpt-4o-mini` | Model for Coder 2; prefix `ollama:` for local models |
| `PLANNER_MODEL` | `gpt-4o-mini` | Model for Planner |
| `TESTER_MODEL` | `gpt-4o-mini` | Model for Tester |
| `DOCUMENTER_MODEL` | `gpt-4o-mini` | Model for Documenter |
| `TELEGRAM_BOT_TOKEN` | (none) | Telegram bot token (optional) |
| `TELEGRAM_ALLOWED_USER_IDS` | (none) | Comma-separated user IDs (optional) |
| `WEB_HOST` | `127.0.0.1` | Web UI host |
| `WEB_PORT` | `8420` | Web UI port |
| `MAX_ITERATIONS_PER_ITEM` | `5` | Max rework attempts per item |
| `SHELL_TIMEOUT_SECONDS` | `120` | Shell command timeout |
| `LOG_LEVEL` | `INFO` | Logging level |
Daedalus uses repos.yaml (project root) as an access control list.
Only repositories listed there may be cloned or modified. Any task
targeting an unknown repo is rejected before the workflow starts.
# repos.yaml: repositories Daedalus is permitted to work on
repos:
  - name: my-api                                 # short alias for Telegram / Web UI
    url: https://github.com/org/my-api           # full HTTPS forge URL (platform auto-detected)
    default_branch: main                         # branch to check out and target for PRs
    description: "Main backend API"              # shown in /status
  - name: infra-scripts
    url: https://gitlab.internal/ops/infra-scripts
    default_branch: main
    description: "Internal infrastructure scripts"
You can reference a repo in any task by:
| Format | Example |
|---|---|
| Name alias | fix issue #42 in my-api |
| Full URL | add feature to https://github.com/org/my-api |
| `owner/name` | refactor org/my-api |
| No-scheme URL | analyse github.com/org/my-api |
If repos.yaml exists but is empty (repos: []), the registry guard
is skipped and any repo ref is accepted. Populate the file to enforce
access control.
Set REPOS_YAML_PATH=/path/to/custom-repos.yaml to use a file outside
the project root.
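The registry check can be pictured as a lookup that accepts any of the four reference formats. The sketch below is an assumption-laden illustration: `resolve_repo` and `_normalize` are hypothetical helpers, the registry is hard-coded to mirror the example `repos.yaml` instead of being parsed from disk, and extracting the reference from a free-text task is not shown.

```python
from urllib.parse import urlparse

# Mirrors the example repos.yaml above; a real implementation would load the file.
REGISTRY = [
    {"name": "my-api", "url": "https://github.com/org/my-api"},
    {"name": "infra-scripts", "url": "https://gitlab.internal/ops/infra-scripts"},
]


def _normalize(url: str) -> str:
    """Reduce a URL (with or without scheme) to 'host/owner/name'."""
    if "://" not in url:
        url = "https://" + url
    parsed = urlparse(url)
    path = parsed.path.rstrip("/")
    if path.endswith(".git"):
        path = path[:-4]
    return f"{parsed.netloc.lower()}{path}"


def resolve_repo(ref: str, registry=REGISTRY):
    """Return the registry entry matching `ref`, or None if it is not listed."""
    if not registry:
        # Empty registry: the guard is skipped and any reference is accepted.
        return {"name": ref, "url": ref}
    for entry in registry:
        normalized = _normalize(entry["url"])          # e.g. github.com/org/my-api
        _, _, owner_name = normalized.partition("/")   # e.g. org/my-api
        if ref in (entry["name"], owner_name) or _normalize(ref) == normalized:
            return entry
    return None
```

Returning `None` for unknown references lets the caller reject the task before any clone or file operation happens.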
- ✨ Intent Routing - Automatically classifies and routes different request types
- ✨ Repository Context - Analyzes tech stack, tests, conventions before coding
- ✨ Human Approval - Required before all commits and risky operations
- ✨ Checkpointing - Save/resume state at any point
- ✨ Safe Search - Search codebase without shell commands
- ✨ Context-Aware Agents - All agents know project structure and conventions
New files:
- `app/core/repo_context.py` - Repository context data models
- `app/core/checkpoints.py` - Checkpoint management
- `app/agents/analyzer.py` - Codebase analyzer
- `app/tools/search.py` - Safe search tool
- Router, Context Loader, Human Gate nodes
Modified files:
- `app/core/state.py` - 17 new fields for context and approval
- `app/core/nodes.py` - Context injection in all prompts
- `app/core/orchestrator.py` - Entry point changed to router
- `app/web/server.py` - New `/api/approve` endpoint
New dependencies:
- `tomli>=2.0.0` - TOML parsing for config files
- `pyyaml>=6.0` - YAML parsing for CI/CD configs
Breaking changes:
None. All changes are backward compatible. Existing workflows continue to work.
- AGENT.md - Instructions for AI agents working on this codebase
- CHANGELOG.md - Detailed version history
- CLAUDE.md - Claude-specific guidance
- docs/definition-of-done.md - Acceptance criteria for tasks
See AGENT.md for development guidelines and workflow.
MIT
Daedalus v2.0 - Production-ready AI coding with intelligence, safety, and reliability.
