Your AI forgets. Cortex doesn't.
Memory that lives, learns, and recalls.
How It Works • Quick Start • Integrations • Features • API • 中文
Ever told your AI something important, only to have it completely forget by the next conversation?
"Hey, I switched to decaf last week."
...two days later...
"Want me to recommend some espresso drinks?"
Your AI has no memory. Every conversation starts from zero. No matter how many times you explain your preferences, your projects, your constraints, it's all gone the moment the chat window closes.
Cortex changes that. It runs alongside your AI, quietly learning from every conversation. It knows your name, your preferences, your ongoing projects, the decisions you've made, and it surfaces exactly the right context when it matters.
Monday: "I'm allergic to shellfish and I just moved to Tokyo."
Wednesday: "Can you find me a good restaurant nearby?"
Agent: Searches for Tokyo restaurants, automatically
excludes seafood-heavy options.
(Cortex recalled: allergy + location)
No manual tagging. No "save this." It just works.
| | Cortex | Mem0 | Zep | LangMem |
|---|---|---|---|---|
| Memory lifecycle | ✅ 3-tier auto-promotion/decay/archive | ❌ Flat store | Partial | ❌ |
| Knowledge graph | ✅ Neo4j + multi-hop reasoning | ⚠️ Basic | ❌ | ❌ |
| Self-hosted | ✅ Single Docker container | Cloud-first | Cloud-first | Framework-bound |
| Data ownership | ✅ Your SQLite + Neo4j | Their cloud | Their cloud | Varies |
| Dashboard | ✅ Full management UI | ❌ | Partial | ❌ |
| MCP support | ✅ Native | ❌ | ❌ | ❌ |
| Multi-agent | ✅ Isolated namespaces | ❌ | ❌ | ❌ |
| Cost | ~$0.55/mo | $99+/mo | $49+/mo | Varies |
Memories aren't just stored; they live.
```
Working Memory (48h) ──promote──► Core Memory ──decay──► Archive
         │                  ▲                               │
         │           read refreshes                     compress
         │           decay counter                    back to Core
         └──────────── nothing is ever truly lost ◄─────────┘
```
- Working → Core: frequently accessed or high-value memories get promoted
- Core → Archive: unused memories decay over time and get compressed
- Archive → Core: compressed memories return when relevant again
- Time decay + read refresh + access frequency = organic memory behavior (sketched below)
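A minimal sketch of what such a policy can look like in TypeScript. The type names, thresholds, and scoring formula here are illustrative assumptions, not Cortex's actual internals:

```ts
// Hypothetical sketch of a 3-tier lifecycle policy: promotion and decay
// driven by access frequency, recency, and exponential time decay.
// All names and thresholds are illustrative, not Cortex internals.

type Tier = "working" | "core" | "archive";

interface Memory {
  tier: Tier;
  accessCount: number; // how often the memory was recalled
  lastReadAt: number;  // epoch ms; reading refreshes this
  createdAt: number;   // epoch ms
  importance: number;  // 0..1, set at extraction time
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Exponential time decay: halves roughly every 30 days without a read.
function decayFactor(m: Memory, now: number): number {
  const idleDays = (now - m.lastReadAt) / DAY_MS;
  return Math.pow(0.5, idleDays / 30);
}

function liveliness(m: Memory, now: number): number {
  // Frequency and importance keep a memory alive; idleness erodes it.
  return (m.importance + Math.log1p(m.accessCount)) * decayFactor(m, now);
}

// One lifecycle pass: returns the tier the memory should move to.
function nextTier(m: Memory, now: number): Tier {
  const score = liveliness(m, now);
  switch (m.tier) {
    case "working":
      // Working memories expire after 48h unless they earned promotion.
      if (score > 1.0) return "core";
      return now - m.createdAt > 48 * 60 * 60 * 1000 ? "archive" : "working";
    case "core":
      // Unused core memories decay into the (compressed) archive.
      return score < 0.2 ? "archive" : "core";
    case "archive":
      // A recalled archive memory returns to core: nothing is truly lost.
      return score > 0.5 ? "core" : "archive";
  }
}
```

The key property is that reads feed back into the score, so memories you actually use stay in Core while idle ones drift toward the archive.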
```
Query ──► Query Expansion (LLM variants)
      ──► BM25 (keywords) + Vector (semantics) ──► RRF Fusion
      ──► LLM Reranker (optional)
      ──► Priority injection (constraints & persona first)
```
- Dual-channel: keyword precision + semantic understanding, fused via RRF (sketched below)
- Query expansion: LLM generates search variants; multi-hit boost favors memories found by several variants
- Reranker: LLM, Cohere, Voyage AI, Jina AI, or SiliconFlow re-scores for relevance
- Smart injection: constraints and persona always injected first, never truncated
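Reciprocal Rank Fusion itself is simple enough to sketch in a few lines. Assuming each query variant yields its own BM25 and vector rankings (the helper below and the conventional `k = 60` constant are illustrative, not confirmed Cortex settings):

```ts
// Sketch of Reciprocal Rank Fusion over several ranked lists, e.g. one
// BM25 list and one vector list per query variant. Each list contributes
// 1 / (k + rank) to a document's fused score; k = 60 is the conventional
// RRF constant (an assumption here, not a confirmed Cortex setting).

function rrfFuse(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// Multi-hit boost falls out of the sum: a memory found by several query
// variants accumulates a contribution from every list it appears in.
const fused = rrfFuse([
  ["mem-allergy", "mem-tokyo", "mem-coffee"], // BM25, variant 1
  ["mem-tokyo", "mem-allergy"],               // vector, variant 1
  ["mem-allergy", "mem-project"],             // BM25, variant 2
]);

const ranked = [...fused.entries()].sort((a, b) => b[1] - a[1]);
console.log(ranked[0]); // ["mem-allergy", ...]: hit by all three lists
```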
Memories form connections. Cortex builds a knowledge graph automatically.
```
Alex ──uses──► Rust ──related_to──► Backend
  │                                    ▲
  └──works_at──► Acme ──deploys_on──► AWS
```
- Auto-extracted entity relations from every conversation
- Multi-hop reasoning: 2-hop graph traversal during recall (see the sketch below)
- Relations injected alongside memories for richer context
- Entity normalization + confidence scoring
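For a feel of what a 2-hop traversal looks like, here is a sketch using the official `neo4j-driver` package. The `Entity` label, `name` property, and relationship shape are assumptions about the schema, not confirmed Cortex internals:

```ts
// Sketch of a 2-hop traversal with the neo4j-driver package.
// Schema details (Entity label, name property) are assumptions.
import neo4j from "neo4j-driver";

async function twoHopNeighbors(entityName: string): Promise<string[]> {
  const driver = neo4j.driver(
    "bolt://localhost:7687",
    neo4j.auth.basic("neo4j", "your-password")
  );
  const session = driver.session();
  try {
    // Undirected variable-length match, 1 to 2 hops out from the entity.
    const result = await session.run(
      `MATCH (e:Entity {name: $name})-[*1..2]-(related:Entity)
       RETURN DISTINCT related.name AS name`,
      { name: entityName }
    );
    return result.records.map((r) => r.get("name") as string);
  } finally {
    await session.close();
    await driver.close();
  }
}

// e.g. twoHopNeighbors("Alex") could surface AWS via Alex -> Acme -> AWS
twoHopNeighbors("Alex").then(console.log);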
```
Conversation ──┬─► Fast Channel (regex, 0ms) ──┬─► Merge ─► 4-tier Dedup ─► Store
               └─► Deep Channel (LLM, 2-5s) ───┘            ├ exact      → skip
                                                            ├ near-exact → replace
                                                            ├ semantic   → LLM judge
                                                            └ new        → insert
```
- 20 memory categories: identity, preferences, constraints, goals, skills, relationships...
- Batch dedup: prevents "I like coffee" from becoming 50 memories (decision logic sketched below)
- Smart update: preference changes are updates, not new entries
- Entity relations: auto-extracted knowledge graph edges
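A sketch of how such a tiered decision can be structured, assuming embeddings are available for both the candidate and stored memories. Here "near-exact" is approximated by very high cosine similarity, and all thresholds are illustrative:

```ts
// Sketch of a 4-tier dedup decision. Thresholds and the use of cosine
// similarity as the "near-exact" test are illustrative assumptions.

type DedupAction =
  | { kind: "skip" }                          // tier 1: exact duplicate
  | { kind: "replace"; existingId: string }   // tier 2: near-exact rewrite
  | { kind: "llm-judge"; existingId: string } // tier 3: semantic overlap
  | { kind: "insert" };                       // tier 4: genuinely new

interface StoredMemory {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function dedupDecision(
  text: string,
  embedding: number[],
  existing: StoredMemory[]
): DedupAction {
  for (const mem of existing) {
    if (mem.text.trim() === text.trim()) return { kind: "skip" };
    const sim = cosine(embedding, mem.embedding);
    if (sim > 0.97) return { kind: "replace", existingId: mem.id };
    // Ambiguous zone: same topic, possibly new info; defer to an LLM judge.
    if (sim > 0.85) return { kind: "llm-judge", existingId: mem.id };
  }
  return { kind: "insert" };
}
```

This is why "I like coffee" stated fifty times collapses to one memory: the first statement inserts, and every repeat lands in tier 1 or 2.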
Every memory, searchable. Every extraction, auditable.
- Memory browser with search, filter by category/status/agent
- Search debugger: see BM25/vector/fusion scores for every query
- Extraction logs: what was extracted, why, and confidence scores
- Lifecycle preview: dry-run promotion/decay before it happens
- Relation graph: interactive knowledge graph visualization (sigma.js)
- Multi-agent management with per-agent config
- One-click updates with version detection
| Integration | Setup |
|---|---|
| OpenClaw | `openclaw plugins install @cortexmem/openclaw` |
| Claude Desktop | Add MCP config → restart |
| Cursor / Windsurf | Add MCP server in settings |
| Claude Code | `claude mcp add cortex -- npx @cortexmem/mcp` |
| Any app | REST API: `/api/v1/recall` + `/api/v1/ingest` |
```
Conversation ──► Fast Channel (regex) + Deep Channel (LLM)
      ↓
Extracted memories (categorized into 20 types)
      ↓
4-tier dedup (exact → skip / near-exact → replace / semantic → LLM judge / new → insert)
      ↓
Store as Working (48h) or Core (permanent)
      ↓
Extract entity relations ──► Neo4j knowledge graph
```
```
User message ──► Query Expansion (LLM generates 2-3 search variants)
      ↓
BM25 (keywords) + Vector (semantics) ──► RRF Fusion
      ↓
Multi-hit boost (memories found by multiple variants rank higher)
      ↓
LLM Reranker (optional, re-scores for relevance)
      ↓
Neo4j multi-hop traversal (discovers indirect associations)
      ↓
Priority inject ──► AI context
(constraints & persona first, then by relevance)
```
```
Working Memory (48h) ──promote──► Core Memory ──decay──► Archive ──compress──► back to Core
                                       ▲
                         read refreshes decay counter
                         (nothing is ever truly lost)
```
```
┌─ Clients ───────────────────────────────────────────────────────────┐
│  OpenClaw (Bridge)  │  Claude Desktop (MCP)  │  Cursor  │  REST     │
└─────────────────────┴────────────────────────┴──────────┴───────────┘
                                   │
                                   ▼
┌─ Cortex Server (:21100) ────────────────────────────────────────────┐
│                                                                     │
│  ┌─ Memory Gate ──────────┐   ┌─ Memory Sieve ─────────────────┐    │
│  │ Query Expansion        │   │ Fast Channel (regex)           │    │
│  │ BM25 + Vector Search   │   │ Deep Channel (LLM)             │    │
│  │ RRF Fusion             │   │ 4-tier Dedup                   │    │
│  │ LLM Reranker           │   │ Entity Relation Extraction     │    │
│  │ Neo4j Graph Traversal  │   │ Category Classification (×20)  │    │
│  │ Priority Injection     │   │ Smart Update Detection         │    │
│  └────────────────────────┘   └────────────────────────────────┘    │
│                                                                     │
│  ┌─ Lifecycle Engine ─────┐   ┌─ Storage ──────────────────────┐    │
│  │ Promote / Decay        │   │ SQLite + FTS5 (memories)       │    │
│  │ Archive / Compress     │   │ sqlite-vec (embeddings)        │    │
│  │ Read Refresh           │   │ Neo4j 5 (knowledge graph)      │    │
│  │ Cron Scheduler         │   │                                │    │
│  └────────────────────────┘   └────────────────────────────────┘    │
│                                                                     │
│  ┌─ Dashboard (React SPA) ─────────────────────────────────────┐    │
│  │ Memory Browser │ Search Debug │ Extraction Logs │ Graph View│    │
│  │ Lifecycle Preview │ Agent Config │ One-click Update         │    │
│  └─────────────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────────────┘
```
```bash
git clone https://github.com/rikouu/cortex.git
cd cortex
docker compose up -d
```

Open http://localhost:21100 → Dashboard → Settings → pick your LLM provider and paste your API key. Done.

No `.env` files are required for local use. Everything is configurable from the Dashboard.

By default, the Dashboard and API have no auth token: anyone who can reach port 21100 has full access. That's fine for localhost, but read the security section below before exposing it to a network.
Without Docker
Production mode (recommended):

```bash
git clone https://github.com/rikouu/cortex.git
cd cortex
pnpm install
pnpm build   # Build server + dashboard
pnpm start   # → http://localhost:21100
```

Development mode (for contributors):

```bash
pnpm dev   # API only → http://localhost:21100
# Dashboard runs separately:
cd packages/dashboard && pnpm dev   # → http://localhost:5173
```

⚠️ In dev mode, visiting http://localhost:21100 in a browser shows a 404; that's normal. The Dashboard dev server runs on a separate port.
Requirements: Node.js ≥ 18, pnpm ≥ 8
Create a `.env` file in the project root (or set them in `docker-compose.yml` → `environment`):
| Variable | Default | Description |
|---|---|---|
| `CORTEX_PORT` | `21100` | Server port |
| `CORTEX_HOST` | `127.0.0.1` | Bind address (`0.0.0.0` for LAN) |
| `CORTEX_AUTH_TOKEN` | (empty) | Auth token; protects Dashboard + API |
| `CORTEX_DB_PATH` | `cortex/brain.db` | SQLite database path |
| `OPENAI_API_KEY` | (none) | OpenAI API key (LLM + embedding) |
| `ANTHROPIC_API_KEY` | (none) | Anthropic API key |
| `OLLAMA_BASE_URL` | (none) | Ollama URL for local models |
| `TZ` | `UTC` | Timezone (e.g. `Asia/Tokyo`) |
| `LOG_LEVEL` | `info` | Log level (`debug`, `info`, `warn`, `error`) |
| `NEO4J_URI` | (none) | Neo4j connection (optional) |
| `NEO4J_USER` | (none) | Neo4j user |
| `NEO4J_PASSWORD` | (none) | Neo4j password |
💡 LLM and embedding settings can also be configured in Dashboard → Settings, which is often easier. Env vars are mainly needed for `CORTEX_AUTH_TOKEN`, `CORTEX_HOST`, and `TZ`.
When `CORTEX_AUTH_TOKEN` is set:

- Dashboard prompts for the token on first visit (saved in the browser)
- All API calls require an `Authorization: Bearer <your-token>` header
- MCP clients and Bridge plugins must include the token in their config

When `CORTEX_AUTH_TOKEN` is not set (default):

- No auth required; open access
- Fine for localhost / personal use
- ⚠️ Dangerous if the port is exposed to the internet

Where to find your token: it's whatever you set in `CORTEX_AUTH_TOKEN`. You choose it; there's no auto-generated token. Write it down and use the same value in all client configs.
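For reference, a minimal TypeScript call that attaches the token, using Node 18+'s built-in `fetch`. The endpoint and body shape follow the REST examples later in this README; the conditional header is the part that matters here:

```ts
// Minimal sketch: attaching CORTEX_AUTH_TOKEN to a Cortex API call.
const CORTEX_URL = "http://localhost:21100";
const TOKEN = process.env.CORTEX_AUTH_TOKEN ?? "";

async function recall(query: string): Promise<unknown> {
  const res = await fetch(`${CORTEX_URL}/api/v1/recall`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Omit the Authorization header entirely when no token is set.
      ...(TOKEN ? { Authorization: `Bearer ${TOKEN}` } : {}),
    },
    body: JSON.stringify({ query, agent_id: "default" }),
  });
  if (!res.ok) throw new Error(`Cortex responded ${res.status}`);
  return res.json();
}

recall("What food do I like?").then(console.log);
```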
If you're exposing Cortex beyond localhost (LAN, VPN, or internet):
- Set `CORTEX_AUTH_TOKEN`: use a strong random string (32+ chars)
- Use HTTPS/SSL: put a reverse proxy (Caddy, Nginx, Traefik) in front with TLS
- Restrict `CORTEX_HOST`: bind to `127.0.0.1` or your Tailscale/VPN IP, not `0.0.0.0`
- Firewall rules: only allow trusted IPs to reach the port
- Keep updated: check the Dashboard for version updates
```bash
# Example: strong random token
openssl rand -hex 24
# → e.g. 3a7f2b... (use this as CORTEX_AUTH_TOKEN)
```

⚠️ Without HTTPS, your token is sent in plaintext. Always use TLS for non-localhost deployments.
With Neo4j (knowledge graph)
Add to your `docker-compose.yml`:

```yaml
neo4j:
  image: neo4j:5-community
  ports:
    - "7474:7474"
    - "7687:7687"
  environment:
    NEO4J_AUTH: neo4j/your-password
```

Set env vars for Cortex:

```
NEO4J_URI=bolt://neo4j:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password
```
💡 If you set `CORTEX_AUTH_TOKEN`, include it in every client config below. Examples show both with and without auth.
OpenClaw

```bash
openclaw plugins install @cortexmem/openclaw
```

Configure in OpenClaw's plugin settings (Dashboard or `openclaw.json`):

```json
{
  "cortexUrl": "http://localhost:21100",
  "authToken": "your-token-here",
  "agentId": "my-agent"
}
```

Without auth: omit `authToken`. Without a custom agent: omit `agentId` (defaults to `"openclaw"`).
The plugin auto-hooks into OpenClaw's lifecycle:
| Hook | When | What |
|---|---|---|
| `before_agent_start` | Before the AI responds | Recalls & injects relevant memories |
| `agent_end` | After the AI responds | Extracts & stores key information |
| `before_compaction` | Before context compression | Emergency save before info is lost |
Plus `cortex_recall` and `cortex_remember` tools for on-demand use. Conceptually, the hook pair behaves like the sketch below.
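This is a purely illustrative sketch, not OpenClaw's actual plugin interface: the two main hooks reduce to a recall call before the agent runs and an ingest call after it responds.

```ts
// Hypothetical hook handlers; only the two Cortex endpoints are real.

interface Turn {
  userMessage: string;
  assistantMessage?: string;
}

async function beforeAgentStart(turn: Turn): Promise<string> {
  // Recall relevant memories to inject into the agent's context.
  const res = await fetch("http://localhost:21100/api/v1/recall", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: turn.userMessage, agent_id: "openclaw" }),
  });
  const { memories } = await res.json(); // assumed response shape
  return memories; // prepended to the agent's context by the plugin
}

async function agentEnd(turn: Turn): Promise<void> {
  // Hand the full exchange to Cortex for extraction and storage.
  await fetch("http://localhost:21100/api/v1/ingest", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      user_message: turn.userMessage,
      assistant_message: turn.assistantMessage ?? "",
      agent_id: "openclaw",
    }),
  });
}
```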
Claude Desktop

Settings → Developer → Edit Config:
```json
{
  "mcpServers": {
    "cortex": {
      "command": "npx",
      "args": ["@cortexmem/mcp", "--server-url", "http://localhost:21100"],
      "env": {
        "CORTEX_AUTH_TOKEN": "your-token-here",
        "CORTEX_AGENT_ID": "my-agent"
      }
    }
  }
}
```

Without auth: remove the `CORTEX_AUTH_TOKEN` line from `env`.
Cursor
Settings → MCP → Add new global MCP server:
```json
{
  "mcpServers": {
    "cortex": {
      "command": "npx",
      "args": ["@cortexmem/mcp"],
      "env": {
        "CORTEX_URL": "http://localhost:21100",
        "CORTEX_AUTH_TOKEN": "your-token-here",
        "CORTEX_AGENT_ID": "my-agent"
      }
    }
  }
}
```

Claude Code
```bash
# Without auth
claude mcp add cortex -- npx @cortexmem/mcp --server-url http://localhost:21100

# With auth + agent ID
CORTEX_AUTH_TOKEN=your-token-here CORTEX_AGENT_ID=my-agent \
claude mcp add cortex -- npx @cortexmem/mcp --server-url http://localhost:21100
```

Windsurf / Cline / Others
```json
{
  "mcpServers": {
    "cortex": {
      "command": "npx",
      "args": ["@cortexmem/mcp", "--server-url", "http://localhost:21100"],
      "env": {
        "CORTEX_AGENT_ID": "my-agent",
        "CORTEX_AUTH_TOKEN": "your-token-here"
      }
    }
  }
}
```

REST API (any app)

```bash
# Without auth
curl -X POST http://localhost:21100/api/v1/recall \
  -H "Content-Type: application/json" \
  -d '{"query":"What food do I like?","agent_id":"default"}'

# With auth
curl -X POST http://localhost:21100/api/v1/ingest \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-token-here" \
  -d '{"user_message":"I love sushi","assistant_message":"Noted!","agent_id":"default"}'
```

MCP Tools

| Tool | Description |
|---|---|
| `cortex_recall` | Search memories with priority injection |
| `cortex_remember` | Store a specific memory |
| `cortex_forget` | Remove or correct a memory |
| `cortex_search_debug` | Debug search scoring |
| `cortex_stats` | Memory statistics |
| Provider | Recommended Models | Notes |
|---|---|---|
| OpenAI | gpt-4o-mini, gpt-5.2 | Default. Best cost/quality |
| Anthropic | claude-haiku-4-5, claude-sonnet-4-6 | Highest extraction quality |
| Google Gemini | gemini-2.5-flash | Free tier on AI Studio |
| DeepSeek | deepseek-chat, deepseek-v4 | Cheapest option |
| DashScope | qwen-plus, qwen-turbo | 通义千问 (Tongyi Qianwen), OpenAI-compatible |
| Ollama | qwen2.5, llama3.2 | Fully local, zero cost |
| OpenRouter | Any of 100+ models | Unified gateway |
| Provider | Recommended Models | Notes |
|---|---|---|
| OpenAI | text-embedding-3-small/large | Default. Most reliable |
| Google Gemini | gemini-embedding-2, gemini-embedding-001 | Free on AI Studio |
| Voyage AI | voyage-4-large, voyage-4-lite | High quality (shared embedding space) |
| DashScope | text-embedding-v3 | 通义千问 (Tongyi Qianwen), good for Chinese |
| Ollama | bge-m3, nomic-embed-text | Local, zero cost |
⚠️ Changing embedding models requires reindexing all vectors. Use Dashboard → Settings → Reindex Vectors.
| Provider | Recommended Models | Free Tier | Notes |
|---|---|---|---|
| LLM | (your extraction model) | N/A | Highest quality, ~2-3s latency |
| Cohere | rerank-v3.5 | 1000 req/mo | Established, reliable |
| Voyage AI | rerank-2.5, rerank-2.5-lite | 200M tokens | Best free tier |
| Jina AI | jina-reranker-v2-base-multilingual | 1M tokens | Best for Chinese/multilingual |
| SiliconFlow | BAAI/bge-reranker-v2-m3 | Free tier | Open-source, low latency |
💡 Dedicated rerankers are 10-50x faster than LLM reranking (~100ms vs ~2s). Configure in Dashboard → Settings → Search.
| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/v1/recall` | Search & inject memories |
| POST | `/api/v1/ingest` | Ingest a conversation |
| POST | `/api/v1/flush` | Emergency flush |
| POST | `/api/v1/search` | Hybrid search with debug scores |
| CRUD | `/api/v1/memories` | Memory management |
| CRUD | `/api/v1/relations` | Entity relations |
| GET | `/api/v1/relations/traverse` | Multi-hop graph traversal |
| GET | `/api/v1/relations/stats` | Graph statistics |
| CRUD | `/api/v1/agents` | Agent management |
| GET | `/api/v1/agents/:id/config` | Merged per-agent config |
| GET | `/api/v1/extraction-logs` | Extraction audit logs |
| POST | `/api/v1/lifecycle/run` | Trigger a lifecycle pass |
| GET | `/api/v1/lifecycle/preview` | Dry-run preview |
| GET | `/api/v1/health` | Health check |
| GET | `/api/v1/stats` | Statistics |
| GET/PATCH | `/api/v1/config` | Global config |
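A tiny client sketch for a couple of the GET endpoints above, using Node 18+'s built-in `fetch`. Response shapes are left as `unknown` since the exact JSON isn't documented here:

```ts
// Sketch: hitting two read-only endpoints with optional auth.
const BASE = "http://localhost:21100/api/v1";

async function getJson(path: string): Promise<unknown> {
  const res = await fetch(`${BASE}${path}`, {
    // Drop this header if CORTEX_AUTH_TOKEN is not set on the server.
    headers: { Authorization: `Bearer ${process.env.CORTEX_AUTH_TOKEN ?? ""}` },
  });
  if (!res.ok) throw new Error(`GET ${path} -> ${res.status}`);
  return res.json();
}

async function main() {
  // Dry-run the lifecycle engine, then fetch overall statistics.
  console.log(await getJson("/lifecycle/preview"));
  console.log(await getJson("/stats"));
}

main().catch(console.error);
```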
| Setup | Monthly Cost |
|---|---|
| gpt-4o-mini + text-embedding-3-small | ~$0.55 |
| DeepSeek + Google Embedding | ~$0.10 |
| Ollama (fully local) | $0.00 |
Based on 50 conversations/day; cost scales linearly. A rough back-of-envelope check follows.
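As a sanity check on the headline number: the per-million-token prices below are gpt-4o-mini and text-embedding-3-small list prices, while the per-conversation token counts are assumptions for illustration (recall-side LLM calls are ignored), not measured Cortex usage.

```ts
// Back-of-envelope reproduction of the ~$0.55/mo figure.
// Token counts per conversation are assumed, not measured.
const PRICE = {
  gpt4oMiniInput: 0.15 / 1e6,  // $/token
  gpt4oMiniOutput: 0.60 / 1e6, // $/token
  embeddingSmall: 0.02 / 1e6,  // $/token
};

// Assumed per-conversation usage: one extraction call plus embeddings.
const perConversation =
  1500 * PRICE.gpt4oMiniInput +  // extraction prompt (assumed)
  200 * PRICE.gpt4oMiniOutput +  // extracted memories (assumed)
  500 * PRICE.embeddingSmall;    // embedding new memories (assumed)

const monthly = perConversation * 50 * 30; // 50 conversations/day
console.log(monthly.toFixed(2)); // ≈ 0.53, in the ballpark of $0.55
```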
MIT
Built with obsessive attention to how memory should work.

