Open-source AI knowledge base with semantic search, vector embeddings, and Claude MCP integration. Built with Python and PostgreSQL pgvector for LLM-powered document retrieval.
Features:
- Semantic Search - Vector embeddings with OpenAI (text-embedding-3-small)
- PostgreSQL + pgvector - Vector similarity operations and full-text search
- Claude MCP Integration - Model Context Protocol server for Claude Code/Desktop
- RAG Agent CLI - Interactive terminal agent with query improvement (Google Gemini)
- Python Toolkit - Clean, modular API with type hints
- Async Operations - Database-first writes with background file sync
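To make the semantic search feature concrete, here is a minimal sketch of ranking summaries by vector similarity. It assumes cosine similarity as the measure (pgvector offers several distance measures and this README does not say which one the project uses), and `cosine_similarity`, `rank_by_similarity`, and the toy vectors are illustrative names and values, not part of the toolkit.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: list[float],
    summaries: dict[str, list[float]],
    limit: int = 5,
) -> list[tuple[str, float]]:
    """Return (title, score) pairs sorted by descending similarity."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in summaries.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy 3-dimensional vectors; real embeddings have far more dimensions.
toy_summaries = {
    "Python Style Guide": [0.9, 0.1, 0.0],
    "Clean Code Principles": [0.6, 0.4, 0.1],
    "Gardening Notes": [0.0, 0.1, 0.9],
}
ranked = rank_by_similarity([1.0, 0.0, 0.0], toy_summaries, limit=2)
```

In the real system the database performs this ranking, so the Python loop above only illustrates the idea.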
Claude Code / Claude Desktop          Terminal (kbagent)
            │                                 │
            ▼                                 ▼
MCP Server (mcp_server.py)            RAG Agent (rag_agent.py)
├─ search_summaries()                 ├─ Query improvement (Gemini)
├─ fetch_document()                   ├─ Document-only responses
├─ save_knowledge()                   └─ Interactive CLI
├─ update_document()                          │
├─ delete_document()                          │
└─ list_categories()                          │
            │                                 │
            └──────────────┬──────────────────┘
                           ▼
              Knowledge Toolkit (toolkit.py)
              ├─ search_summaries_tool()
              ├─ fetch_document_tool()
              ├─ knowledge_store_tool()
              ├─ update_document_tool()
              ├─ delete_document_tool()
              └─ knowledge_list_categories()
                           │
                           ▼
              PostgreSQL + pgvector
              ├─ documents (full content)
              ├─ summaries (embeddings)
              └─ vector indexes
git clone <repo>
cd knowledge-base
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Supabase (Cloud):
# 1. Create PostgreSQL instance on Supabase
# 2. Run schema.sql in SQL Editor
# 3. Copy connection string from settings
Local Docker:
docker-compose up -d
psql -h localhost -U db_user -d knowledge -f schema.sql
Add to your MCP config:
{
  "mcpServers": {
    "knowledge-base": {
      "type": "stdio",
      "command": "python",
      "args": ["src/knowledge_base/mcp_server.py"],
      "env": {
        "DATABASE_URL": "${DATABASE_URL}",
        "OPENAI_API_KEY": "${OPENAI_API_KEY}",
        "ENABLE_FILE_OPERATIONS": "${ENABLE_FILE_OPERATIONS:-true}",
        "KNOWLEDGE_DIR": "${KNOWLEDGE_DIR:-./knowledge}"
      }
    }
  }
}
Or use CLI:
claude mcp add --transport stdio knowledge-base \
  --env DATABASE_URL="postgresql://user:password@host:5432/database" \
  --env OPENAI_API_KEY="sk-..." \
  -- python src/knowledge_base/mcp_server.py
Use absolute paths in config. See .env.example for details.
Now ask Claude: "Search my knowledge base for X"
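Before starting the server, it can help to confirm that the environment variables referenced in the config are actually set. The sketch below is a hypothetical pre-flight check, not part of the project. It assumes `DATABASE_URL` and `OPENAI_API_KEY` are the required settings because they are the two passed in the CLI example, while the other two have defaults in the JSON config.

```python
import os
from typing import Mapping

# Assumed to be required: these are the two variables set in the CLI example.
REQUIRED_VARS = ("DATABASE_URL", "OPENAI_API_KEY")


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All required settings are present.")
```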
Interactive terminal agent with query improvement and document-based responses.
Install:
# In project directory with venv activated
pip install -e .
# For global access, add to ~/.zshrc or ~/.bashrc:
alias kbagent="/path/to/knowledge-base/.venv/bin/kbagent"
Usage:
# Interactive mode
kbagent
# Single query
kbagent "What is semantic search?"
Features:
- Query improvement: Clarifies unclear questions, enhances queries for better search
- Document-only responses: Answers strictly from knowledge base content
- Source attribution with relevance scores
- Commands: /help, /categories, /quit
Example session:
$ kbagent
Knowledge Base Agent
Type your question or /help for commands
You: python best practices
Analyzing query...
Based on the knowledge base documents, here are the key Python best practices...
Sources:
- Python Style Guide (knowledgebase) [85%]
- Clean Code Principles (knowledgebase) [72%]
Confidence: 78%
Query improvement uses Gemini and requires GOOGLE_API_KEY to be set in the environment.
from knowledge_base import search_summaries_tool, knowledge_store_tool
# Search
results = search_summaries_tool("python best practices", limit=5)
# Save
response = knowledge_store_tool(
    title="New Knowledge",
    content="# Markdown content",
    category="knowledgebase"
)
Two tables with no data duplication:
- documents: Full content, metadata, category (BIGSERIAL primary key)
- summaries: Auto-generated summaries with vector embeddings (BIGSERIAL primary key, references documents with CASCADE delete)
See schema.sql for complete schema.
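The two-table layout and the CASCADE delete can be tried out with a small stand-in. This sketch uses SQLite from the Python standard library in place of PostgreSQL, so `BIGSERIAL` becomes `INTEGER PRIMARY KEY` and the embedding column is omitted. The column names are assumptions for illustration, and schema.sql remains the authoritative definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and CASCADE) when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE documents (
        id       INTEGER PRIMARY KEY,
        title    TEXT NOT NULL,
        content  TEXT NOT NULL,
        category TEXT NOT NULL
    );
    CREATE TABLE summaries (
        id          INTEGER PRIMARY KEY,
        document_id INTEGER NOT NULL
                    REFERENCES documents(id) ON DELETE CASCADE,
        summary     TEXT NOT NULL
    );
    """
)

# Full content lives only in documents; summaries point back to it.
doc_id = conn.execute(
    "INSERT INTO documents (title, content, category) VALUES (?, ?, ?)",
    ("New Knowledge", "# Markdown content", "knowledgebase"),
).lastrowid
conn.execute(
    "INSERT INTO summaries (document_id, summary) VALUES (?, ?)",
    (doc_id, "Short summary of the document"),
)

# Deleting the document removes its summary through the cascade.
conn.execute("DELETE FROM documents WHERE id = ?", (doc_id,))
remaining = conn.execute("SELECT COUNT(*) FROM summaries").fetchone()[0]
print(remaining)
```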
search_summaries_tool(query, category=None, limit=5, min_relevance=None)
- Returns: SummarySearchResponse with results list
- Each result: document_id, title, summary, relevance_score
- Optional min_relevance filter (0.0-1.0)
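The effect of `limit` and `min_relevance` can be sketched with a local stand-in. The `SummaryResult` class below mirrors the result fields listed above, but it and `filter_results` are hypothetical, and treating the threshold as inclusive is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SummaryResult:
    """Stand-in for one search result, using the fields listed above."""
    document_id: int
    title: str
    summary: str
    relevance_score: float


def filter_results(
    results: list[SummaryResult],
    limit: int = 5,
    min_relevance: Optional[float] = None,
) -> list[SummaryResult]:
    """Keep results at or above min_relevance, best first, capped at limit."""
    if min_relevance is not None and not 0.0 <= min_relevance <= 1.0:
        raise ValueError("min_relevance must be between 0.0 and 1.0")
    kept = [
        r for r in results
        if min_relevance is None or r.relevance_score >= min_relevance
    ]
    kept.sort(key=lambda r: r.relevance_score, reverse=True)
    return kept[:limit]


sample = [
    SummaryResult(2, "Clean Code Principles", "Naming and structure.", 0.72),
    SummaryResult(1, "Python Style Guide", "Formatting conventions.", 0.85),
    SummaryResult(3, "Gardening Notes", "Soil and watering.", 0.40),
]
strong_matches = filter_results(sample, limit=5, min_relevance=0.7)
```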
fetch_document_tool(document_id)
- Returns: DocumentResponse with full content, metadata
knowledge_store_tool(title, content, category, tags=None, description=None)
- Returns: OperationResponse with document_id
- Async: File write and summary generation happen in background
update_document_tool(document_id, content)
- Returns: OperationResponse with updated document metadata
- Updates content only (title/category unchanged)
- Async: File update and summary regeneration in background
delete_document_tool(document_id)
- Returns: OperationResponse with deleted document info
- Permanent operation (cannot be undone)
- Async: File cleanup in background
knowledge_list_categories()
- Returns: CategoriesResponse with all categories and counts