Self-Hosted · Multi-Platform · Sovereign
A local-first, multi-agent intelligence engine built in Go – runs on any hardware, answers to no one but you.
Architecture · Capabilities · Deployment · Roadmap · Quick Setup
Orion is a self-hosted, sovereign, multi-agent AI orchestrator designed to provide a robust, private intelligence framework that runs entirely on your own hardware.
Drawing on modern distributed enterprise architectures, Orion applies Hexagonal (Ports & Adapters) principles to coordinate an ecosystem of autonomous agents. Local Small Language Models (SLMs), served by Ollama, give Orion fast, private, and deterministic reasoning at the edge.
When tasks demand deeper research, complex synthesis, or heavier computational lifting, Orion's built-in High-Fidelity Router autonomously escalates workloads to state-of-the-art cloud LLMs (such as Gemini) precisely when needed. Context is orchestrated across deep, FTS5 RAG-backed memory, yielding a personal, capable, and highly secure AI companion.
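The local-versus-cloud routing decision can be pictured with a minimal Go sketch. The heuristic, type names, and threshold below are illustrative only, not the actual logic of Orion's `cortex` tool:

```go
package main

import (
	"fmt"
	"strings"
)

// Route says where a request should be handled. The names are
// illustrative; Orion's real routing lives in providers.yaml and cortex.
type Route string

const (
	RouteLocal Route = "local" // Ollama SLM on-device
	RouteCloud Route = "cloud" // e.g. Gemini via REST
)

// classify is a toy heuristic: very long prompts or explicit
// research requests escalate to the cloud, everything else stays local.
func classify(prompt string) Route {
	if len(prompt) > 500 || strings.Contains(strings.ToLower(prompt), "deep research") {
		return RouteCloud
	}
	return RouteLocal
}

func main() {
	fmt.Println(classify("remind me at 9am"))                   // local
	fmt.Println(classify("run deep research on Go schedulers")) // cloud
}
```

In practice the escalation signal would come from the model itself or from intent classification rather than string matching, but the shape of the decision is the same.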
Built on Hexagonal Architecture (Ports & Adapters) – every component is swappable, every boundary is explicit.
```
┌───────────────────────────────────────────────────────────────┐
│                         INGRESS LAYER                         │
│        Telegram (Primary) · Matrix E2EE (Experimental)        │
└──────────────────────────┬────────────────────────────────────┘
                           │ Whitelisted · Rate-Limited
┌──────────────────────────▼────────────────────────────────────┐
│                         ORCHESTRATOR                          │
│   Intent Classification · Agent Routing · Approval Engine     │
│   Session Context Buffer · Policy Engine (policies.yaml)      │
└────┬─────────────┬───────────────┬───────────────────────────┘
     │             │               │
┌────▼────┐  ┌─────▼──────┐  ┌─────▼─────────────────────────┐
│  LOCAL  │  │   CLOUD    │  │         ACTION TOOLS          │
│  BRAIN  │  │ ESCALATION │  │   Memory · Research · Files   │
│ Ollama  │  │   Gemini   │  │   Git Sync · Health Monitor   │
└────┬────┘  └─────┬──────┘  └───────────────────────────────┘
     │             │
┌────▼─────────────▼───────────────────────────────────────────┐
│                         MEMORY LAYER                         │
│   SQLCipher AES-256 · FTS5 RAG · Memories, Tasks, Alarms     │
└──────────────────────────────────────────────────────────────┘
```
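The ports-and-adapters idea behind this layout can be sketched in a few lines of Go. The `Brain` interface and adapter names here are hypothetical stand-ins, not Orion's actual API:

```go
package main

import (
	"context"
	"fmt"
)

// Brain is a "port": the orchestrator depends only on this interface,
// never on a concrete model backend.
type Brain interface {
	Complete(ctx context.Context, prompt string) (string, error)
}

// localBrain stands in for an Ollama-backed adapter.
type localBrain struct{}

func (localBrain) Complete(_ context.Context, prompt string) (string, error) {
	return "local answer to: " + prompt, nil
}

// cloudBrain stands in for a Gemini-backed adapter.
type cloudBrain struct{}

func (cloudBrain) Complete(_ context.Context, prompt string) (string, error) {
	return "cloud answer to: " + prompt, nil
}

// answer shows why this matters: the orchestrator code is identical
// regardless of which adapter is plugged in.
func answer(b Brain, prompt string) string {
	out, err := b.Complete(context.Background(), prompt)
	if err != nil {
		return "error: " + err.Error()
	}
	return out
}

func main() {
	fmt.Println(answer(localBrain{}, "hello"))
	fmt.Println(answer(cloudBrain{}, "hello"))
}
```

Swapping Ollama for another local runtime, or Gemini for another cloud provider, then only touches the adapter, never the orchestrator.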
- **Multi-Agent Hierarchy** – Orchestrator classifies intent and delegates to specialized Sub-Agents (`BaseAssistant`, `WebDev`, etc.)
- **Autonomous Escalation** – Edge models know their limits. The `cortex` tool routes complex workloads to cloud LLMs natively.
- **RAG Memory Tissue** – SQLite FTS5 indexes every memory, reminder, and task for rapid semantic retrieval across sessions.
- **Approval Engine** – Sensitive tool calls require explicit human approval via `policies.yaml` rules, with session-aware re-execution.
- **Proactive Heartbeat** – Time-series pulse wakes the agent autonomously to monitor sites, run research, and deliver alerts.
- **Zero Trust Ingress** – Cloudflare Tunnel webhooks. Token-bucket rate limiting. Geometric whitelist fingerprinting.
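A fail-closed approval lookup of the kind `policies.yaml` drives might look like the following sketch. The `Policy` struct and the tool names are hypothetical, not Orion's actual schema:

```go
package main

import "fmt"

// Policy mirrors the idea of a policies.yaml rule; the fields here
// are illustrative, not Orion's real configuration format.
type Policy struct {
	Tool             string
	RequiresApproval bool
}

// needsApproval looks up a tool's policy. Unknown tools fail closed:
// anything not explicitly allowed requires human approval.
func needsApproval(policies []Policy, tool string) bool {
	for _, p := range policies {
		if p.Tool == tool {
			return p.RequiresApproval
		}
	}
	return true
}

func main() {
	policies := []Policy{
		{Tool: "memory.search", RequiresApproval: false},
		{Tool: "file.write", RequiresApproval: true},
	}
	fmt.Println(needsApproval(policies, "memory.search")) // false
	fmt.Println(needsApproval(policies, "file.write"))    // true
	fmt.Println(needsApproval(policies, "shell.exec"))    // true (fail closed)
}
```

Failing closed on unlisted tools is the safety-relevant design choice: a missing rule can never silently grant an agent a sensitive capability.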
| Module | Stack | What It Does |
|---|---|---|
| Brain | Ollama + Gemini REST | Local-first reasoning with automatic cloud escalation via `cortex` |
| Memory | SQLCipher (AES-256) + FTS5 | Persistent memories, reminders, tasks, and semantic RAG retrieval |
| Prompting | `SOUL.md` · `AGENTS.md` · `TOOLS.md` | Hot-reloadable persona and agent prompts – no binary recompile needed |
| Relay | go-telegram · mautrix-go | Webhook ingress with human-in-the-loop approval flows |
| Actuators | Go interfaces | Web research, file I/O sandbox, git sync, health monitoring |
| Networking | Cloudflare Zero Trust | Exposes webhooks securely – no port-forwarding, no static IP required |
> **Important**
> **Primary Tested Platform:** Orion is actively developed and rigorously tested against the Raspberry Pi 5. While the orchestration engine is built natively in Go to run anywhere, the deployment guides below for alternative platforms (NUC, VPS, macOS, Windows, NAS) are provided as architectural possibilities and community-driven references that showcase Orion's cross-platform portability.
Orion runs on your hardware – any hardware. The same binary, the same config, across every platform.
| Platform | Guide | Local AI Model |
|---|---|---|
| 🥧 Raspberry Pi 5 | DEPLOY_PI.md | `qwen3.5:2b` – silent, fanless, always-on |
| 🖥️ Intel NUC / Mini-PC | DEPLOY_NUC.md | Up to `deepseek-r1:32b` / `qwen3.5:32b` – desktop-class reasoning |
| ☁️ Cloud VPS | DEPLOY_VPS.md | Up to `qwen3.5:32b` – always-on, zero hardware |
| 🍎 macOS Apple Silicon | DEPLOY_MACOS.md | Metal-accelerated – fastest local inference |
| 🪟 Windows (Native) | DEPLOY_WINDOWS.md | NVIDIA GPU acceleration optional |
| 💾 NAS (Synology / Unraid) | DEPLOY_NAS.md | Zero extra hardware – NAS already always-on |
📖 **Start here:** Deployment Master Guide – shared prerequisites, build commands, platform selection.
- ✅ **Phase 1–2: Foundation, RAG & Multi-Agent Logic (DONE)** – Hexagonal Architecture, agentic routing, SQLite RAG memory.
- ✅ **Phase 3–4: Logging, Heartbeat & Dockerization (DONE)** – Structured request logging, context propagation, graceful shutdown, secret management, prompt tuning.
- ✅ **Phase 5: Production Deployment & Zero Trust Networking (DONE)** – Multi-platform deployment, local-first Qwen3.5/DeepSeek deterministic reasoning, Cloudflare Zero Trust webhook ingress, Telegram primary integration, experimental E2EE Matrix integration via Conduit (Rust).
- ✅ **Phase 6: Modular Prompting, Context & Approval Framework (DONE)**
  - 6.1: Dynamic prompt loading (`SOUL.md`, `AGENTS.md`, `TOOLS.md`), hot-reload assembler with TTL caching.
  - 6.2: Tool/skill decoupling, enhanced memory with FTS5 RAG, research cortex with deep mode, file I/O sandbox, git sync, health monitoring.
  - 6.3: Approval framework with a `policies.yaml` policy engine, session context buffer, approval history persistence, and re-execution flows.
  - 6.4: Deep Audit & Hardening – pre-SIT pass covering security hardening (path traversal, rate limiting, message chunking), core architecture fixes (unified DB connection, FTS tokenizer extraction, Assembler consolidation, goroutine lifecycle), Gemini native tool calling, Dockerfile correctness, and version injection.
- ⏳ **Phase 7: The "Face" – Dashboard & Web UI (NEXT)** – Interactive visual dashboard, execution trace graphing, multi-tool chains, Prometheus observability metrics, NUC distributed cluster computing, and full test coverage.
- ⏳ **Phase 8: The "Evolution" – LoRA Fine-tuning & Data Export (UPCOMING)** – Continuous learning pipelines, data export layers, hot-reloadable policies, structured error types, and local model fine-tuning.
- Go 1.26+
- CGO compiler – `gcc` (Linux/macOS: `build-essential` / Xcode CLT; Windows: TDM-GCC)
- Ollama – running locally
```bash
cp .env.example .env
cp configs/config.yaml.example configs/config.yaml
cp configs/providers.yaml.example configs/providers.yaml
```

Populate `.env` with your API keys. Edit `config.yaml` for app settings (pulse interval, whitelisted IDs). Edit `providers.yaml` to configure AI models, roles, and agent-to-provider mapping.
You can run `go run ./cmd/listmodels/main.go` to fetch the latest available Gemini models after setting your API key.
Edit the `USER PRIME DIRECTIVE` section at the bottom of `prompts/SOUL.md` to change how Orion speaks and behaves. Changes take effect on the next message – no restart required.
```bash
make tidy
make build
make run
```

Cross-platform builds:

```bash
make build-linux     # → bin/orion-linux
make build-macos     # → bin/orion-darwin
make build-windows   # → bin/orion.exe
```

Interested in building new skills for Orion or improving existing tools? See the Contributing Guide for testing standards, architecture conventions, and how to run the test suite.
MIT License. Open logic for open autonomy.