# RepoChat AI

Chat with your GitHub repositories using AI-powered semantic search and analysis.
RepoChat AI is a Next.js application that allows you to have intelligent conversations about any GitHub repository. It uses vector embeddings and retrieval-augmented generation (RAG) to provide accurate, context-aware answers about code structure, functionality, and implementation details.
## Features

- **Semantic Code Search** - Ask questions in natural language and get relevant code snippets
- **Repository Analysis** - Automatic analysis of repository structure, file tree, and languages
- **Multi-Chat Support** - Create multiple conversations per repository
- **Modern UI** - Clean, responsive interface with dark mode support
- **Secure & Private** - Row-level security ensures users only access their own data
- **Real-time Processing** - Live updates as repositories are analyzed
- **Markdown Formatting** - Rich formatting for code blocks, syntax highlighting, and more
## Tech Stack

- **Frontend**: Next.js 15, React, TypeScript, TailwindCSS
- **Backend**: Next.js API Routes
- **Database**: Supabase (PostgreSQL + pgvector)
- **AI Models**:
  - Google Gemini 2.5 Flash (text generation)
  - Hugging Face Sentence Transformers (embeddings)
- **Authentication**: Supabase Auth
- **Vector Search**: pgvector with HNSW indexing
## How It Works

1. **Repository Ingestion**: User adds a GitHub repository URL
2. **Content Fetching**: System fetches up to 50 relevant files from the repository
3. **Chunking**: Content is split into manageable chunks (2000 characters with 400-character overlap)
4. **Embedding**: Each chunk is converted to a 384-dimensional vector using `sentence-transformers/all-MiniLM-L6-v2`
5. **Storage**: Vectors are stored in Supabase with pgvector for efficient similarity search
6. **Query Processing**: User questions are embedded and matched against stored vectors
7. **Context Building**: The most relevant chunks are retrieved and formatted with repository metadata
8. **AI Response**: Gemini generates comprehensive answers using the retrieved context
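The chunking step above can be sketched in a few lines of TypeScript. This is an illustrative sketch using the documented 2000/400 parameters, not the project's actual implementation in `src/lib/rag/query.ts`:

```typescript
// Fixed-size chunking with overlap, as described in the pipeline above:
// 2000-character chunks, each sharing 400 characters with its neighbor.
const CHUNK_SIZE = 2000;
const CHUNK_OVERLAP = 400;

function chunkText(
  text: string,
  size: number = CHUNK_SIZE,
  overlap: number = CHUNK_OVERLAP
): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // advance 1600 characters per chunk
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

The overlap ensures that a code construct straddling a chunk boundary still appears intact in at least one chunk, which improves retrieval quality at the cost of some storage duplication.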
## Prerequisites

- Node.js 18+ and npm/yarn/pnpm
- A Supabase account
- Google AI API key (for Gemini)
- Hugging Face API token (optional, for embeddings)
- GitHub personal access token (optional, for higher rate limits)
## Installation

1. **Clone the repository**

   ```bash
   git clone https://github.com/yourusername/repochat-ai.git
   cd repochat-ai
   ```

2. **Install dependencies**

   ```bash
   npm install
   # or
   yarn install
   # or
   pnpm install
   ```

3. **Set up Supabase**

   Create a new Supabase project and run the migration:

   ```
   # In the Supabase SQL Editor, run:
   # src/supabase/migrations/001_create_tables.sql
   ```

4. **Configure environment variables**

   Create a `.env.local` file in the root directory:

   ```bash
   # Supabase Configuration
   NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
   NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
   SUPABASE_SERVICE_ROLE_KEY=your-service-role-key

   # AI API Keys
   GOOGLE_AI_API_KEY=your-gemini-api-key
   HF_TOKEN=your-huggingface-token

   # GitHub (Optional - for higher rate limits)
   GITHUB_TOKEN=your-github-personal-access-token
   ```

5. **Run the development server**

   ```bash
   npm run dev
   # or
   yarn dev
   # or
   pnpm dev
   ```

6. **Open your browser**

   Navigate to http://localhost:3000
## Usage

### Adding a Repository

1. Sign up or sign in to your account
2. Click "Add Repository" on the dashboard
3. Enter a GitHub repository URL (e.g., `https://github.com/owner/repo`)
4. Wait for the analysis to complete (status will change from "Analyzing..." to "Available")

### Chatting with a Repository

1. Click on an available repository
2. Create a new chat or select an existing one
3. Ask questions about the codebase:
   - "How does authentication work?"
   - "Explain the database schema"
   - "Show me the API route structure"
   - "What libraries are used for styling?"
### Example Questions

- **Architecture**: "What's the overall structure of this application?"
- **File Navigation**: "Where is the user authentication logic?"
- **Code Explanation**: "How does the RAG pipeline work?"
- **Debugging**: "Are there any potential security issues?"
- **Learning**: "Explain the database migration strategy"
## Project Structure

```
src/
├── app/
│   ├── api/                  # Next.js API routes
│   │   ├── auth/             # Authentication endpoints
│   │   ├── repositories/     # Repository management
│   │   ├── chats/            # Chat management
│   │   ├── messages/         # Message handling
│   │   └── health/           # Health check
│   ├── auth/                 # Auth pages (sign in/up)
│   ├── dashboard/            # Main dashboard
│   ├── repository/           # Repository chat interface
│   └── layout.tsx            # Root layout
├── components/
│   ├── ui/                   # Reusable UI components
│   ├── chat-input.tsx        # Message input component
│   ├── chat-message.tsx      # Message display component
│   ├── repository-card.tsx   # Repository card component
│   └── sidebar.tsx           # Navigation sidebar
├── lib/
│   ├── ai/                   # AI client (Gemini)
│   ├── auth/                 # Authentication utilities
│   ├── db/                   # Database operations
│   ├── rag/                  # RAG pipeline modules
│   │   ├── embeddings.ts     # HuggingFace embeddings
│   │   ├── github.ts         # GitHub API integration
│   │   ├── vector-search.ts  # pgvector search
│   │   └── query.ts          # RAG orchestration
│   ├── supabase/             # Supabase clients
│   └── utils/                # Utility functions
├── styles/
│   └── globals.css           # Global styles
├── supabase/
│   └── migrations/           # Database migrations
└── utils/
    ├── api.tsx               # API client functions
    ├── markdown.ts           # Markdown processing
    └── theme-provider.tsx    # Theme management
```
## Security

- Supabase Auth for secure user authentication
- Row-Level Security (RLS) policies enforce data isolation
- JWT-based API authentication for all protected routes
- Service role key used only in server-side operations
- Users can only access their own repositories, chats, and messages
- Repository data is removed via cascading deletes when a repository is deleted
- No third-party analytics or tracking
## Customization

### Theming

The application uses TailwindCSS with a custom design system. Modify theme colors in:

- `src/app/globals.css` - CSS variables for light/dark mode
- `tailwind.config.ts` - Tailwind configuration
This project includes Crow widget integration and MCP (Model Context Protocol) access to Crow documentation. For details on using the Crow MCP server to search Crow documentation, see docs/CROW_MCP_USAGE.md.
### AI Configuration

Adjust AI behavior in `src/lib/rag/query.ts`:

```typescript
const MAX_CONTEXT_TOKENS = 16000;  // Maximum context size
const MAX_RESPONSE_TOKENS = 2048;  // Maximum response length
const CHUNK_SIZE = 2000;           // Text chunk size
const CHUNK_OVERLAP = 400;         // Overlap between chunks
const TOP_K_CHUNKS = 10;           // Number of chunks to retrieve
```

Modify file fetching limits in `src/lib/rag/github.ts`:

```typescript
const MAX_FILES = 50;  // Maximum files to analyze per repository
```

## Database Schema

### Tables

- **repositories** - GitHub repository metadata and status
- **embeddings** - Vector embeddings with pgvector support
- **chats** - Chat sessions for repositories
- **messages** - Individual messages within chats

### Features

- pgvector extension for efficient similarity search
- HNSW indexing for fast approximate nearest-neighbor search
- Cascade deletion for data consistency
- Automatic timestamps with triggers
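Conceptually, the similarity search ranks stored chunks by how close their embedding vectors are to the query embedding. The metric can be illustrated with the following plain-TypeScript sketch (for intuition only; the real comparison happens inside Postgres via pgvector):

```typescript
// Cosine similarity between two embedding vectors: the kind of
// distance metric pgvector's HNSW index approximates at scale.
// Returns 1 for identical directions, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

HNSW trades a small amount of recall for a large speedup: instead of comparing the query against every stored vector, it walks a layered graph of near neighbors.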
## API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/api/auth/signup` | POST | Create new user account |
| `/api/repositories` | GET | List user's repositories |
| `/api/repositories` | POST | Add new repository |
| `/api/repositories/[id]` | GET | Get repository details |
| `/api/repositories/[id]` | DELETE | Delete repository |
| `/api/chats` | POST | Create new chat |
| `/api/chats/[repoId]` | GET | List repository chats |
| `/api/messages` | POST | Send message & get AI response |
| `/api/messages/[chatId]` | GET | Get chat messages |
| `/api/health` | GET | Health check |
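A client might call these endpoints like the sketch below. The request body shape (`{ url }`) and the bearer-token header are assumptions for illustration; check the route handlers under `src/app/api/` for the exact contract:

```typescript
// Hypothetical request builder for POST /api/repositories.
// Field names and auth scheme are illustrative assumptions.
interface RequestOptions {
  method: string;
  headers: Record<string, string>;
  body: string;
}

function buildAddRepoRequest(token: string, repoUrl: string): RequestOptions {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`, // JWT from Supabase Auth
    },
    body: JSON.stringify({ url: repoUrl }),
  };
}

// Usage: fetch("/api/repositories", buildAddRepoRequest(jwt, repoUrl))
```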
## Troubleshooting

**Repository stuck in "Analyzing..." state**

- Check Supabase logs for embedding errors
- Verify `GOOGLE_AI_API_KEY` and `HF_TOKEN` are set correctly
- Ensure the repository is public or `GITHUB_TOKEN` has access

**Vector search returns no results**

- Verify the `match_embeddings` function exists in Supabase
- Check if embeddings were successfully stored
- Ensure embedding dimensions match (384)

**Authentication errors**

- Verify Supabase environment variables are correct
- Check if Supabase Auth is enabled in your project
- Ensure email confirmation is disabled or handled
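For the dimension-mismatch case above, a quick sanity check before inserting an embedding can save a silent failure. This is a hypothetical helper, not part of the codebase:

```typescript
// all-MiniLM-L6-v2 produces 384-dimensional vectors; anything else
// will not match the pgvector column and similarity search will
// return no results.
function assertEmbeddingDimension(vec: number[], expected: number = 384): void {
  if (vec.length !== expected) {
    throw new Error(
      `Embedding has ${vec.length} dimensions, expected ${expected}`
    );
  }
}
```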
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- **Supabase** for the excellent Postgres + Auth platform
- **Google Gemini** for powerful AI text generation
- **Hugging Face** for embedding models
- **Vercel** for Next.js and hosting
- **shadcn/ui** for beautiful UI components
Built with ❤️ using Next.js, Supabase, and AI