AI-powered multi-agent platform for transportation operations.
- Project Overview
- Architecture
- Get Started
- Project Structure
- Usage Guide
- Environment Variables
- Technology Stack
- Troubleshooting
- Additional Documentation
OmniRoute is a multi-agent AI platform for transportation operations, designed around natural-language workflows. It uses three specialized agents - Operations, Reservations, and Insights - to answer queries about routes, trips, bookings, and delays, returning grounded responses directly from real operational data.
- Start with available data: If the database is empty, the system first prompts users to upload transportation data through CSV files or run the Data Engine to generate data before asking questions.
- Route it to the right agent: A planner sends the request to Operations, Reservations, or Insights.
- Let agents work together when needed: An agent can call another agent for follow-up work on supported multi-step questions.
- Return a grounded answer: The final response is built from database results and incident similarity only when needed.
The core focus of OmniRoute is the multi-agent AI workflow itself: routing transportation questions, letting specialist agents perform their part of the task, and returning grounded answers for routes, trips, reservations, utilization, delays, and incidents.
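As a rough illustration of the routing step, the planner's job can be thought of as classifying a question and dispatching it to a specialist. OmniRoute's actual planner runs inside its AI orchestration layer; the keyword matcher below is only a hypothetical sketch of the control flow, and all names in it are illustrative, not real OmniRoute APIs.

```python
# Hypothetical sketch of planner-style routing: classify a question,
# then dispatch it to a specialist agent. OmniRoute's real planner is
# agent/LLM-driven; this keyword matcher only illustrates the idea.

AGENT_KEYWORDS = {
    "reservations": {"reservation", "booking", "cancel", "utilization"},
    "insights": {"incident", "trend", "similar", "why"},
    "operations": {"route", "trip", "delay", "status", "capacity"},
}

def route_question(question: str) -> str:
    """Return the name of the specialist agent that should answer first."""
    q = question.lower()
    for agent, keywords in AGENT_KEYWORDS.items():
        if any(keyword in q for keyword in keywords):
            return agent
    return "operations"  # a sensible default for transportation queries
```

The dispatch order matters: reservation wording is checked before the generic operations vocabulary so that a question touching both areas reaches the more specific agent first.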
OmniRoute uses a layered architecture centered on multi-agent AI orchestration. The frontend collects the user request, the backend runs the planner and specialist agents, agents retrieve grounded data from PostgreSQL, and the final answer is returned to the UI. When needed, agents can call other agents for supported follow-up work, and pgvector is used only for incident narrative similarity.
- React frontend for upload, simulation, and chat
- FastAPI backend for APIs, orchestration, and data workflows
- One planner agent and three specialist agents
- PostgreSQL as the source of truth
- pgvector only for incident narrative similarity
- Optional data engine and ingest flows for loading and updating data
This is the main runtime path for the application:
```mermaid
flowchart TB
subgraph UI["Frontend Layer"]
A[React Frontend]
end
subgraph API["Backend Layer"]
B[FastAPI API]
C[Planner Agent]
D[Operations Agent]
E[Reservations Agent]
F[Insights Agent]
end
subgraph DATA["Data Layer"]
G[(PostgreSQL)]
H[(pgvector for incident narratives)]
end
subgraph ENGINE["Supporting Data Workflows"]
I[Data Engine]
J[CSV Ingest]
end
A -->|UI requests| B
B -->|route question| C
C -->|operations queries| D
C -->|reservation queries| E
C -->|incident and trend queries| F
D -->|agent handoff when needed| E
D -->|agent handoff when needed| F
E -->|agent handoff when needed| D
F -->|agent handoff when needed| D
D -->|SQL| G
E -->|SQL| G
F -->|SQL| G
F -->|incident similarity| H
I -->|seed and tick updates| G
J -->|validated upserts| G
J -->|incident embeddings| H
```
- Planner Agent classifies the question and selects the primary specialist.
- Operations Agent answers route, trip, status, delay, and capacity questions.
- Reservations Agent answers reservation, booking, cancellation, and utilization questions.
- Insights Agent answers incident explanation, trend, and similarity questions.
- Agent collaboration allows one specialist to call another specialist when a question spans more than one area.
- React Operations Console serves the home, upload, simulation, and chat interfaces.
- FastAPI API exposes health, operational data, upload, incident, simulation, and query endpoints.
- AI Orchestration Layer coordinates the planner, specialist tasks, and agent collaboration.
- Query Understanding Layer parses user text into validated intent and allowlisted internal plans.
- PostgreSQL stores structured operational data and remains the system of record.
- pgvector stores only incident narrative embeddings for similarity retrieval.
- Answer Synthesis converts SQL and vector evidence into one grounded response.
- Data Engine and Ingest Flows support data loading and background operational updates.
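For incident narrative similarity, pgvector performs the ranking in-database (typically something like `ORDER BY embedding <=> :query LIMIT 5`, where `<=>` is pgvector's cosine-distance operator). The pure-Python sketch below only illustrates the ranking idea; it is not OmniRoute's actual retrieval code.

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as pgvector's <=> operator defines it: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def top_k_incidents(query_vec: list[float], incidents: list[dict], k: int = 3) -> list[dict]:
    """Rank stored incident embeddings by distance to the query embedding."""
    ranked = sorted(incidents, key=lambda inc: cosine_distance(query_vec, inc["embedding"]))
    return ranked[:k]
```

In production the database index does this work, so only the top matches ever leave PostgreSQL.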
Before running OmniRoute, make sure you have:
- Docker and Docker Compose
- Make for the provided helper commands
- An OpenAI API key or another OpenAI-compatible provider for chat and embeddings
You need both LLM and embedding configuration for the AI workflow to work correctly.
```bash
docker --version
docker compose version
make --version
docker ps
```

First fork this repository to your own GitHub account, then clone your fork:

```bash
git clone <your-fork-url>
cd OmniRoute
```

All remaining commands in this section should be run from the repository root.
Create your local environment file from the example:
```bash
cp .env.example .env
```

If you are using the standard Docker Compose setup, you usually only need to provide the LLM and embedding values from your side. The default database URL in `.env.example` already matches the Docker setup.

Required values to fill in:

```
LLM_API_KEY=your_openai_api_key_here
EMBEDDING_API_KEY=your_openai_api_key_here
```

What you must provide from your side:

- `LLM_API_KEY`
- `EMBEDDING_API_KEY`

If you are using OpenAI for both, you can use the same key value for both fields.

If you are not using OpenAI, update these as well:

- `LLM_PROVIDER`
- `LLM_BASE_URL`
- `LLM_MODEL`
- `EMBEDDING_PROVIDER`
- `EMBEDDING_BASE_URL`
- `EMBEDDING_MODEL`
- `EMBEDDING_DIM`

For the normal Docker setup, you can keep:

```
DATABASE_URL=postgresql+asyncpg://omnroute:omnroute@postgres:5432/omnroute
```

Start the stack:

```bash
docker compose up -d --build
```

This starts:
- PostgreSQL with pgvector
- FastAPI API on http://localhost:8000
- React web app on http://localhost:5173
The schema is not applied automatically. From the repository root, run:
```bash
make db-init
```

This command runs the Alembic migrations for the API service and prepares the database schema required by OmniRoute.
OmniRoute needs transportation data before the AI workflow can answer questions. You have two options:
- Open the Upload page in the UI and import CSV files
- Run the Data Engine to generate and update sample operational data
The Data Engine is a supporting service that simulates real-time transportation activity. Its purpose is to generate routes, trips, reservations, and incidents so users do not need to prepare all the data manually before trying the application.
If you want sample data right away, seed the system from the repository root:
```bash
make engine-seed
```

If you want continuous updates that simulate live operational changes, run:

```bash
make engine-run
```

- Frontend: http://localhost:5173
- API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
To stop the stack:

```bash
docker compose down
```

Database migrations are version-controlled schema changes for PostgreSQL. They are how OmniRoute creates and updates tables, columns, constraints, and extensions in a consistent way across environments.
For normal setup, you only need:
```bash
make db-init
```

That command applies the current migration set and prepares the database for the application.
Advanced migration commands:
- `make db-upgrade`: applies any newer migrations to bring the database up to the latest schema.
- `make db-downgrade`: rolls back the most recent migration step. Use this during development if you need to reverse a recent schema change.
- `make db-revision MSG="describe your schema change"`: creates a new Alembic migration file when you intentionally change the database schema in development.
- `docker compose exec postgres psql -U omnroute -d omnroute`: opens a PostgreSQL shell if you need to inspect tables or run manual queries.
The Data Engine is a separate service. It does not start during a normal docker compose up because it is behind the simulation profile.
A tick is one simulation cycle. In one tick, the Data Engine advances the
transportation state by running a single update step, such as creating or
updating trips, reservations, cancellations, delays, or incidents.
Use these helper commands when needed:
```bash
make engine-seed
make engine-tick
make engine-run
```

What each command does:

- `make engine-seed`: runs a one-off seed job and exits
- `make engine-tick`: runs one simulation cycle and exits
- `make engine-run`: starts the long-running `data-engine` container, which seeds automatically when empty and then keeps ticking on the configured interval
Equivalent Docker commands:
```bash
docker compose run --rm data-engine seed
docker compose run --rm data-engine tick
docker compose run --rm data-engine status
docker compose --profile simulation up data-engine
```

Recommended flow for a fresh environment:

```bash
docker compose up -d --build
make db-init
make engine-seed
make engine-run
```

If you only want sample data without continuous ticking, stop after `make engine-seed`.
For hot reload during development, use the compose override:
```bash
make dev-up
```

Then apply the schema:

```bash
make db-init
```

Development behavior:

- API runs with `uvicorn --reload`
- Web runs with the Vite dev server
- Source code is bind-mounted into containers

Stop local dev mode with:

```bash
make dev-down
```

If you prefer running services directly outside Docker, the rough flow is:
```bash
# API
cd server/api
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cd ../..
uvicorn api.main:app --app-dir server --reload --host 0.0.0.0 --port 8000

# Web
cd client
npm install
npm run dev
```

For that mode, you still need a PostgreSQL instance with pgvector available and a valid `DATABASE_URL`.
If you run the API outside Docker, the Docker service hostname postgres will
usually not work from your host machine. In that case, point DATABASE_URL to
your local or reachable PostgreSQL instance instead, for example:
```
DATABASE_URL=postgresql+asyncpg://omnroute:omnroute@localhost:5432/omnroute
```

```
OmniRoute/
├── client/
│   ├── src/
│   │   ├── components/      # Shared UI and layout pieces
│   │   ├── lib/             # API client utilities
│   │   └── pages/           # Home, upload, chat, and operations pages
│   ├── Dockerfile
│   └── package.json
├── server/
│   ├── api/
│   │   ├── routes/          # FastAPI routes
│   │   ├── services/        # Query layer, simulation, ingest, embeddings
│   │   ├── models/          # SQLAlchemy + Pydantic models
│   │   ├── middleware/      # Request, logging, and error middleware
│   │   ├── main.py          # FastAPI entrypoint
│   │   ├── config.py        # Environment-backed settings
│   │   ├── requirements.txt
│   │   └── Dockerfile
│   ├── db/
│   │   └── migrations/      # Alembic migrations
│   ├── ingest/              # Ingest and embedding helpers
│   └── agents/              # Agent-side support files
├── docs/                    # Architecture, API, DB, UI, and design notes
├── docker-compose.yml
├── docker-compose.dev.yml
├── Makefile
├── .env.example
└── README.md
```
- Open http://localhost:5173
- Go to Upload
- Upload CSV files for operations, reservations, or incidents
- Or go to the Data page and seed baseline simulated data
- If you want live simulation, start the engine container with `make engine-run`
- Use the Simulation page to start, stop, tick, and tune the Data Engine
- Open Chat
- Ask grounded questions about routes, trips, reservations, delays, utilization, or incidents
- Chat requires route data to exist in the database
- If no route data exists, the UI redirects operators to the Upload page
- Answers should stay grounded in SQL results and incident vector matches only when incident narrative context is needed
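One way the grounding constraint above could be enforced is by building the model prompt exclusively from retrieved evidence, attaching incident narratives only when they were fetched. The helper below is a hypothetical sketch, not OmniRoute's actual answer-synthesis code.

```python
def build_grounded_prompt(question: str, sql_rows: list[dict],
                          incident_snippets: list[str]) -> str:
    """Hypothetical sketch: instruct the model to answer only from the
    evidence embedded in the prompt, which is how grounding is enforced."""
    lines = [
        "Answer using ONLY the evidence below.",
        f"Question: {question}",
        "SQL evidence:",
    ]
    lines += [str(row) for row in sql_rows]
    if incident_snippets:  # vector matches are attached only when narrative context is needed
        lines.append("Incident narratives:")
        lines += incident_snippets
    return "\n".join(lines)
```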
Example questions:

- Show delayed trips for today
- Which routes have the most incidents
- Compare reservation activity across active routes
- Why is route 21 delayed
OmniRoute reads configuration from the root .env file.
| Variable | Description | Default |
|---|---|---|
| `DATABASE_URL` | Async SQLAlchemy database URL | `postgresql+asyncpg://omnroute:omnroute@postgres:5432/omnroute` |
| `CORS_ALLOW_ORIGINS` | Allowed frontend origins | `http://localhost:5173,http://127.0.0.1:5173` |
| Variable | Description | Default |
|---|---|---|
| `LLM_PROVIDER` | LLM provider name | `openai` |
| `LLM_MODEL` | Chat model used by the query layer | `gpt-4o-mini` |
| `LLM_API_KEY` | API key for LLM calls | empty |
| `LLM_BASE_URL` | Base URL for an OpenAI-compatible endpoint | `https://api.openai.com/v1` |
| `LLM_TEMPERATURE` | Generation temperature | `0` |
| `LLM_TIMEOUT_SECONDS` | Timeout for LLM requests | `30` |
OmniRoute supports OpenAI and OpenAI-compatible LLM APIs. Choose the provider that best fits your setup.
Best for: production deployments and the default hosted setup
- Get API Key: https://platform.openai.com/account/api-keys
- Provider value: `openai`
- Change these fields:

```
LLM_PROVIDER=openai
LLM_API_KEY=sk-...
LLM_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o-mini
```

Best for: local deployment, privacy, and no per-request API cost
- Install: https://ollama.com/download
- Provider value in OmniRoute: `openai_compatible`
- Change these fields:

If OmniRoute runs directly on your host:

```
LLM_PROVIDER=openai_compatible
LLM_API_KEY=ollama
LLM_BASE_URL=http://localhost:11434/v1
LLM_MODEL=qwen3:4b
```

If OmniRoute runs with Docker Compose and Ollama runs on your host machine:

```
LLM_PROVIDER=openai_compatible
LLM_API_KEY=ollama
LLM_BASE_URL=http://host.docker.internal:11434/v1
LLM_MODEL=qwen3:4b
```

Basic setup:
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull qwen3:4b

# Verify Ollama is running
curl http://localhost:11434/api/tags
```

Best for: accessing multiple models through one OpenAI-compatible endpoint
- Get API Key: https://openrouter.ai/keys
- Provider value in OmniRoute: `openai_compatible`
- Change these fields:

```
LLM_PROVIDER=openai_compatible
LLM_API_KEY=sk-or-...
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_MODEL=your-openrouter-model
```

Best for: internal gateways, proxy layers, or other compatible providers
Any API that implements the OpenAI-compatible format used by OmniRoute can work.
```
LLM_PROVIDER=openai_compatible
LLM_API_KEY=your_api_key
LLM_BASE_URL=https://your-custom-endpoint.com/v1
LLM_MODEL=your-model-name
```

To switch providers, update the root `.env` file and restart the affected services:

```bash
# Restart API only
docker compose restart api

# Or restart the full stack
docker compose down
docker compose up -d
```

| Variable | Description | Default |
|---|---|---|
| `EMBEDDING_PROVIDER` | Embedding provider name | `openai` |
| `EMBEDDING_MODEL` | Embedding model for incidents | `text-embedding-3-small` |
| `EMBEDDING_DIM` | Embedding dimension | `1536` |
| `EMBEDDING_API_KEY` | API key for embeddings | empty |
| `EMBEDDING_BASE_URL` | Base URL for embeddings endpoint | `https://api.openai.com/v1` |
| `EMBEDDING_BATCH_SIZE` | Embedding batch size | `32` |
| `EMBEDDING_TIMEOUT_SECONDS` | Embedding timeout | `30` |
| Variable | Description | Default |
|---|---|---|
| `DATA_ENGINE_MODE` | Engine mode | `run` |
| `DATA_ENGINE_TICK_INTERVAL_SECONDS` | Tick interval | `30` |
| `DATA_ENGINE_SEED_IF_EMPTY` | Seed when database is empty | `true` |
| `DATA_ENGINE_SEED_ROUTES` | Number of routes to seed | `6` |
| `DATA_ENGINE_SEED_DAYS` | Number of future days to seed | `3` |
| `DATA_ENGINE_BOOKING_RATE_PER_TICK` | Reservation simulation intensity | `3.0` |
| `DATA_ENGINE_CANCELLATION_RATE_PER_TICK` | Cancellation intensity | `0.8` |
| `DATA_ENGINE_INCIDENT_RATE_PER_TICK` | Incident intensity | `0.4` |
| `DATA_ENGINE_DELAY_SENSITIVITY` | Delay propagation sensitivity | `1.0` |
| `DATA_ENGINE_ENABLE_CASCADING_DELAYS` | Enable cascading delays | `true` |
| Variable | Description | Default |
|---|---|---|
| `LANGFUSE_ENABLED` | Enable Langfuse tracing | `false` |
| `LANGFUSE_PUBLIC_KEY` | Langfuse public key | empty |
| `LANGFUSE_SECRET_KEY` | Langfuse secret key | empty |
| `LANGFUSE_HOST` | Langfuse host | `http://localhost:3001` |
| `LANGFUSE_PROJECT` | Langfuse project name | `omniroute` |
| `LLM_MAX_CONTEXT_TOKENS` | Context window used for LLM benchmark summaries | `128000` |
| `EMBEDDING_MAX_CONTEXT_TOKENS` | Context window used for embedding benchmark summaries | `8191` |
The table below compares OmniRoute inference performance across local and cloud deployments using the standardized admin query benchmark flow averaged over 3 runs.
| Provider | Model | Embedding Model | Deployment | Context Window | Avg Input Tokens | Avg Output Tokens | Avg Tokens / Request | P50 Latency (ms) | P95 Latency (ms) | Throughput (req/s) |
|---|---|---|---|---|---|---|---|---|---|---|
| vLLM | `Qwen/Qwen3-4B-Instruct-2507` | `BAAI/bge-base-en-v1.5` | Local | 32,768 | 740 | 78 | 818 | 13,232 | 56,221 | 0.057 |
| Cloud LLM | `gpt-4o-mini` | `text-embedding-3-small` | API (Cloud) | 128,000 | 1,098 | 201 | 1,300 | 2,242 | 3,714 | 0.148 |
Notes:
- These figures summarize the OmniRoute admin query benchmark flow across 3 benchmark queries.
- The local vLLM row reflects a self-hosted LLM plus local embedding model.
- The cloud row reflects `gpt-4o-mini` for query generation and `text-embedding-3-small` for incident embeddings.
- Token counts and latency can vary slightly across runs depending on prompt expansion, data state, and model behavior.
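For reference, the latency and throughput columns in the table are straightforward to derive from per-request measurements. The nearest-rank percentile below is one common convention; the benchmark's exact aggregation method is not specified in this README, so treat this as an illustration.

```python
import math

def percentile(latencies_ms: list[float], pct: float) -> float:
    """Nearest-rank percentile over per-request latencies (one common
    convention; the benchmark's exact method is not documented here)."""
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def throughput_req_per_s(num_requests: int, total_seconds: float) -> float:
    """Requests completed per second over the whole benchmark window."""
    return num_requests / total_seconds
```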
The current OmniRoute inference options balance local control against managed cloud latency depending on deployment and observability goals.
This local stack pairs an instruction-tuned Qwen language model with an English BGE embedding model for local inference and incident retrieval.
| Attribute | Details |
|---|---|
| LLM | Qwen/Qwen3-4B-Instruct-2507 |
| LLM Type | Causal language model |
| LLM Parameters | 4.0B total, 3.6B non-embedding |
| LLM Architecture | 36 layers, GQA with 32 Q heads and 8 KV heads |
| LLM Native Context Length | 262,144 tokens |
| LLM Inference Mode | Non-thinking mode only |
| LLM License | Apache 2.0 |
| Embedding Model | BAAI/bge-base-en-v1.5 |
| Embedding Type | English text embedding model |
| Embedding Dimension | 768 |
| Embedding Sequence Length | 512 |
| Embedding Variant | v1.5 with more reasonable similarity distribution |
| Embedding License | MIT |
| Deployment | Local via vLLM |
| Benchmark Context Window | 32,768 |
This cloud configuration is suitable when you want lower latency and managed API infrastructure while keeping the existing OmniRoute benchmark flow unchanged.
| Attribute | Details |
|---|---|
| Model | gpt-4o-mini |
| Embedding Model | text-embedding-3-small |
| Deployment | Cloud API |
| Benchmark Context Window | 128,000 |
| Grounded Retrieval Fit | LLM for final grounded responses, OpenAI embeddings for incident narratives |
| Data Control | Cloud-hosted inference |
| Weights | Proprietary |
| Best Fit | Managed deployments, fast iteration, and lower operational overhead |
| OmniRoute Capability | Local vLLM (`Qwen/Qwen3-4B-Instruct-2507`) | Cloud API (`gpt-4o-mini`) |
|---|---|---|
| Grounded admin query responses | Yes | Yes |
| Incident embedding support | Yes | Yes |
| Structured output | Yes | Yes |
| Deployment mode | Local / on-prem | Cloud API |
| Data sovereignty | Full local control | No |
| Open weights | Yes | No |
| Infra ownership | Full self-managed runtime | Managed API |
| Benchmark throughput in this packet | 0.057 req/s | 0.148 req/s |
| Benchmark P50 latency in this packet | 13,232 ms | 2,242 ms |
- FastAPI
- SQLAlchemy
- Alembic
- PostgreSQL
- pgvector
- CrewAI
- OpenAI-compatible LLM and embedding APIs
- React 18
- Vite
- React Router
- Tailwind CSS
- Docker
- Docker Compose
- Python 3.12 container runtime
- Node 22 Alpine container runtime
```bash
docker compose ps
docker compose logs api --tail 100
docker compose logs web --tail 100
docker compose logs postgres --tail 100
```

If tables are missing, migrations probably have not been run yet:

```bash
make db-init
```

Check API health:

```bash
curl http://localhost:8000/health
```

Check the current engine status:

```bash
docker compose run --rm data-engine status
```

Start the continuous engine process:

```bash
make engine-run
```

If the Simulation page says the runtime is stopped, resume it there or call the start action from the UI after the container is running.
Check:
- API container is running
- `VITE_API_URL` or `VITE_API_BASE_URL` points to `http://localhost:8000`
- CORS origins include your frontend host
Check:
- Routes and trips exist in the database
- Incident embeddings are configured if you need incident similarity
- `LLM_API_KEY` and `EMBEDDING_API_KEY` are valid
This project is licensed under the terms in the LICENSE file. See the LICENSE file for details.
OmniRoute is provided as-is for simulation, analysis, and informational purposes. While we strive for accuracy:
- Always verify AI-generated responses against the underlying operational data
- Do not rely solely on AI-generated outputs for live transportation decisions
- Review incident explanations and summaries with human judgment
- Test thoroughly before using in shared or production-like environments
For full disclaimer details, see DISCLAIMER.md
