mardimanisha/RepoChat-AI

RepoChat AI

Chat with your GitHub repositories using AI-powered semantic search and analysis

RepoChat AI is a Next.js application that allows you to have intelligent conversations about any GitHub repository. It uses vector embeddings and retrieval-augmented generation (RAG) to provide accurate, context-aware answers about code structure, functionality, and implementation details.

✨ Features

  • 🔍 Semantic Code Search - Ask questions in natural language and get relevant code snippets
  • 📁 Repository Analysis - Automatic analysis of repository structure, file tree, and languages
  • 💬 Multi-Chat Support - Create multiple conversations per repository
  • 🎨 Modern UI - Clean, responsive interface with dark mode support
  • 🔒 Secure & Private - Row-level security ensures users only access their own data
  • ⚡ Real-time Processing - Live updates as repositories are analyzed
  • 📝 Markdown Formatting - Rich formatting for code blocks, syntax highlighting, and more

🏗️ Architecture

Tech Stack

  • Frontend: Next.js 15, React, TypeScript, TailwindCSS
  • Backend: Next.js API Routes
  • Database: Supabase (PostgreSQL + pgvector)
  • AI Models:
    • Google Gemini 2.5 Flash (text generation)
    • Hugging Face Sentence Transformers (embeddings)
  • Authentication: Supabase Auth
  • Vector Search: pgvector with HNSW indexing

How It Works

  1. Repository Ingestion: User adds a GitHub repository URL
  2. Content Fetching: System fetches up to 50 relevant files from the repository
  3. Chunking: Content is split into manageable chunks (2000 characters with 400 character overlap)
  4. Embedding: Each chunk is converted to a 384-dimensional vector using sentence-transformers/all-MiniLM-L6-v2
  5. Storage: Vectors are stored in Supabase with pgvector for efficient similarity search
  6. Query Processing: User questions are embedded and matched against stored vectors
  7. Context Building: Most relevant chunks are retrieved and formatted with repository metadata
  8. AI Response: Gemini generates comprehensive answers using the retrieved context
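The chunking in step 3 can be sketched as a sliding window over the file content. This is a minimal illustration, not the project's actual code: `chunkText` is a hypothetical helper, and only the 2000/400 character values come from this README.

```typescript
// Illustrative sketch only: chunkText is a hypothetical helper name.
const CHUNK_SIZE = 2000;   // characters per chunk (value from this README)
const CHUNK_OVERLAP = 400; // characters shared between neighbouring chunks

function chunkText(
  text: string,
  size: number = CHUNK_SIZE,
  overlap: number = CHUNK_OVERLAP
): string[] {
  if (overlap >= size) {
    throw new Error("overlap must be smaller than the chunk size");
  }
  const chunks: string[] = [];
  const step = size - overlap; // with the defaults, each chunk starts 1600 characters after the previous one
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this chunk already reached the end of the text
  }
  return chunks;
}
```

The overlap means the last 400 characters of one chunk reappear at the start of the next, so a function or paragraph that straddles a boundary is still seen whole in at least one chunk.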

🚀 Getting Started

Prerequisites

  • Node.js 18+ and npm/yarn/pnpm
  • Supabase account
  • Google AI API key (for Gemini)
  • Hugging Face API token (optional, for embeddings)
  • GitHub personal access token (optional, for higher rate limits)

Installation

  1. Clone the repository

    git clone https://github.com/yourusername/repochat-ai.git
    cd repochat-ai
  2. Install dependencies

    npm install
    # or
    yarn install
    # or
    pnpm install
  3. Set up Supabase

    Create a new Supabase project and run the migration:

    # In the Supabase SQL Editor, paste and run the contents of:
    src/supabase/migrations/001_create_tables.sql
  4. Configure environment variables

    Create a .env.local file in the root directory:

    # Supabase Configuration
    NEXT_PUBLIC_SUPABASE_URL=https://your-project.supabase.co
    NEXT_PUBLIC_SUPABASE_ANON_KEY=your-anon-key
    SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
    
    # AI API Keys
    GOOGLE_AI_API_KEY=your-gemini-api-key
    HF_TOKEN=your-huggingface-token
    
    # GitHub (Optional - for higher rate limits)
    GITHUB_TOKEN=your-github-personal-access-token
  5. Run the development server

    npm run dev
    # or
    yarn dev
    # or
    pnpm dev
  6. Open your browser

    Navigate to http://localhost:3000
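A missing key from step 4 tends to surface later as a confusing runtime error, so it can help to check the environment once at startup. The sketch below is an assumption about how that might look: `checkEnv` is a hypothetical helper, and the split into required and optional variables follows the Prerequisites list rather than the project's code.

```typescript
// Variable names come from the .env.local example above; checkEnv itself is hypothetical.
const REQUIRED_VARS: string[] = [
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_ROLE_KEY",
  "GOOGLE_AI_API_KEY",
];
// Listed as optional in Prerequisites.
const OPTIONAL_VARS: string[] = ["HF_TOKEN", "GITHUB_TOKEN"];

interface EnvReport {
  missing: string[];       // required variables that are unset or blank
  unsetOptional: string[]; // optional variables that are unset or blank
}

function checkEnv(env: Record<string, string | undefined>): EnvReport {
  const isUnset = (name: string): boolean => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  };
  return {
    missing: REQUIRED_VARS.filter(isUnset),
    unsetOptional: OPTIONAL_VARS.filter(isUnset),
  };
}
```

In a Node.js context this could be called as `checkEnv(process.env)`, failing fast if `missing` is non-empty and logging a warning for anything in `unsetOptional`.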

📖 Usage

Adding a Repository

  1. Sign up or sign in to your account
  2. Click "Add Repository" on the dashboard
  3. Enter a GitHub repository URL (e.g., https://github.com/owner/repo)
  4. Wait for the analysis to complete (status will change from "Analyzing..." to "Available")
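Before a repository can be fetched, the URL from step 3 has to be split into an owner and a repository name. A minimal sketch of that parsing is shown below; `parseGitHubUrl` and `RepoRef` are hypothetical names, and the project's own validation may accept or reject different inputs.

```typescript
// Hypothetical helper: the project's real URL handling may differ.
interface RepoRef {
  owner: string;
  repo: string;
}

function parseGitHubUrl(input: string): RepoRef | null {
  // Accepts http(s)://github.com/<owner>/<repo>, optionally followed by more path segments.
  const match = /^https?:\/\/(?:www\.)?github\.com\/([^\/\s?#]+)\/([^\/\s?#]+)/.exec(input.trim());
  if (!match) return null;
  return {
    owner: match[1],
    repo: match[2].replace(/\.git$/, ""), // tolerate clone-style URLs ending in .git
  };
}
```

Returning `null` instead of throwing lets the caller show a validation message next to the input field.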

Chatting with a Repository

  1. Click on an available repository
  2. Create a new chat or select an existing one
  3. Ask questions about the codebase:
    • "How does authentication work?"
    • "Explain the database schema"
    • "Show me the API route structure"
    • "What libraries are used for styling?"

Example Questions

  • Architecture: "What's the overall structure of this application?"
  • File Navigation: "Where is the user authentication logic?"
  • Code Explanation: "How does the RAG pipeline work?"
  • Debugging: "Are there any potential security issues?"
  • Learning: "Explain the database migration strategy"

🗂️ Project Structure

src/
├── app/
│   ├── api/                 # Next.js API routes
│   │   ├── auth/           # Authentication endpoints
│   │   ├── repositories/   # Repository management
│   │   ├── chats/          # Chat management
│   │   ├── messages/       # Message handling
│   │   └── health/         # Health check
│   ├── auth/               # Auth pages (sign in/up)
│   ├── dashboard/          # Main dashboard
│   ├── repository/         # Repository chat interface
│   └── layout.tsx          # Root layout
├── components/
│   ├── ui/                 # Reusable UI components
│   ├── chat-input.tsx      # Message input component
│   ├── chat-message.tsx    # Message display component
│   ├── repository-card.tsx # Repository card component
│   └── sidebar.tsx         # Navigation sidebar
├── lib/
│   ├── ai/                 # AI client (Gemini)
│   ├── auth/               # Authentication utilities
│   ├── db/                 # Database operations
│   ├── rag/                # RAG pipeline modules
│   │   ├── embeddings.ts   # HuggingFace embeddings
│   │   ├── github.ts       # GitHub API integration
│   │   ├── vector-search.ts # pgvector search
│   │   └── query.ts        # RAG orchestration
│   ├── supabase/           # Supabase clients
│   └── utils/              # Utility functions
├── styles/
│   └── globals.css         # Global styles
├── supabase/
│   └── migrations/         # Database migrations
└── utils/
    ├── api.tsx             # API client functions
    ├── markdown.ts         # Markdown processing
    └── theme-provider.tsx  # Theme management

🔐 Security

Authentication & Authorization

  • Supabase Auth for secure user authentication
  • Row-Level Security (RLS) policies enforce data isolation
  • JWT-based API authentication for all protected routes
  • Service role key used only in server-side operations

Data Privacy

  • Users can only access their own repositories, chats, and messages
  • Deleting a repository also removes its associated data through cascading deletes
  • No third-party analytics or tracking

🎨 Customization

Styling

The application uses TailwindCSS with a custom design system. Modify theme colors in:

  • src/app/globals.css - CSS variables for light/dark mode
  • tailwind.config.ts - Tailwind configuration

Crow Integration

This project includes Crow widget integration and MCP (Model Context Protocol) access to Crow documentation. For details on using the Crow MCP server to search Crow documentation, see docs/CROW_MCP_USAGE.md.

AI Model Configuration

Adjust AI behavior in src/lib/rag/query.ts:

const MAX_CONTEXT_TOKENS = 16000; // Maximum context size
const MAX_RESPONSE_TOKENS = 2048; // Maximum response length
const CHUNK_SIZE = 2000; // Text chunk size
const CHUNK_OVERLAP = 400; // Overlap between chunks
const TOP_K_CHUNKS = 10; // Number of chunks to retrieve
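To see how these constants interact, the sketch below keeps the highest-scoring chunks until either `TOP_K_CHUNKS` or the token budget is reached. It is illustrative only: `selectChunks`, `ScoredChunk`, and the rough 4-characters-per-token estimate are assumptions, not the logic in `src/lib/rag/query.ts`.

```typescript
// Assumptions: selectChunks, ScoredChunk, and the 4-chars-per-token estimate are illustrative.
const MAX_CONTEXT_TOKENS = 16000; // value from this README
const TOP_K_CHUNKS = 10;          // value from this README

interface ScoredChunk {
  content: string;
  similarity: number; // higher means more relevant to the question
}

// Rough heuristic; a real tokenizer would give different counts.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function selectChunks(
  chunks: ScoredChunk[],
  maxTokens: number = MAX_CONTEXT_TOKENS,
  topK: number = TOP_K_CHUNKS
): ScoredChunk[] {
  // Copy before sorting so the caller's array is left untouched.
  const ranked = [...chunks].sort((a, b) => b.similarity - a.similarity).slice(0, topK);
  const selected: ScoredChunk[] = [];
  let usedTokens = 0;
  for (const chunk of ranked) {
    const cost = estimateTokens(chunk.content);
    if (usedTokens + cost > maxTokens) break; // stop once the budget would be exceeded
    selected.push(chunk);
    usedTokens += cost;
  }
  return selected;
}
```

Under this estimate, ten 2000-character chunks come to about 5000 tokens, so the default budget leaves headroom for repository metadata and the question itself. Raising `TOP_K_CHUNKS` or `CHUNK_SIZE` eats into that headroom.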

Repository Limits

Modify file fetching limits in src/lib/rag/github.ts:

const MAX_FILES = 50; // Maximum files to analyze per repository
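The README does not spell out how "relevant" files are chosen, so the following is only one plausible approach: skip dependency folders, keep files with common source or documentation extensions, prefer shallower paths, and cap the result at `MAX_FILES`. `pickFiles` and the extension list are hypothetical.

```typescript
// Hypothetical sketch: the project's actual relevance rules may differ.
const MAX_FILES = 50; // value from this README
const SOURCE_EXTENSIONS: string[] = [".ts", ".tsx", ".js", ".jsx", ".py", ".md", ".json"];

function pickFiles(paths: string[], maxFiles: number = MAX_FILES): string[] {
  return paths
    .filter((p) => !p.split("/").includes("node_modules")) // skip installed dependencies
    .filter((p) => SOURCE_EXTENSIONS.some((ext) => p.toLowerCase().endsWith(ext)))
    .sort((a, b) => a.split("/").length - b.split("/").length) // shallower paths first
    .slice(0, maxFiles);
}
```

Since each file is chunked and embedded, raising `MAX_FILES` can be expected to lengthen analysis time and increase the number of embedding requests.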

📊 Database Schema

Tables

  • repositories - GitHub repository metadata and status
  • embeddings - Vector embeddings with pgvector support
  • chats - Chat sessions for repositories
  • messages - Individual messages within chats

Key Features

  • pgvector extension for efficient similarity search
  • HNSW indexing for fast approximate nearest neighbor search
  • Cascade deletion for data consistency
  • Automatic timestamps with triggers
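In the application, similarity ranking runs inside Postgres through pgvector, and the HNSW index makes it approximate but fast. For intuition, the sketch below does the same ranking exactly and in memory, assuming cosine similarity as the metric (a common choice for sentence-transformer embeddings, though the README does not state which operator the project uses). `cosineSimilarity` and `topK` are illustrative names.

```typescript
// In-memory illustration only; the app delegates this work to pgvector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact (brute-force) top-k: score every row, sort descending, keep the first k.
function topK<T extends { embedding: number[] }>(query: number[], rows: T[], k: number): T[] {
  return rows
    .map((row) => ({ row, score: cosineSimilarity(query, row.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.row);
}
```

The dimension check mirrors a common failure mode: a query embedded with a different model than the stored 384-dimensional vectors cannot be compared against them.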

🛠️ API Routes

| Endpoint | Method | Description |
|----------|--------|-------------|
| /api/auth/signup | POST | Create new user account |
| /api/repositories | GET | List user's repositories |
| /api/repositories | POST | Add new repository |
| /api/repositories/[id] | GET | Get repository details |
| /api/repositories/[id] | DELETE | Delete repository |
| /api/chats | POST | Create new chat |
| /api/chats/[repoId] | GET | List repository chats |
| /api/messages | POST | Send message & get AI response |
| /api/messages/[chatId] | GET | Get chat messages |
| /api/health | GET | Health check |
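As a rough illustration of calling a protected route, the sketch below builds a request for `POST /api/messages`. The README does not document the request body or how the JWT is transmitted, so the `chatId` and `content` field names and the `Authorization: Bearer` header are assumptions. `buildSendMessageRequest` and `ApiRequest` are hypothetical names.

```typescript
// Hypothetical request shape: field names and the bearer header are assumptions, not confirmed by this README.
interface ApiRequest {
  url: string;
  method: "GET" | "POST" | "DELETE";
  headers: Record<string, string>;
  body?: string;
}

function buildSendMessageRequest(
  baseUrl: string,
  accessToken: string,
  chatId: string,
  content: string
): ApiRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/messages`, // avoid a double slash if baseUrl ends with "/"
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${accessToken}`, // assumes the Supabase JWT is sent as a bearer token
    },
    body: JSON.stringify({ chatId, content }),
  };
}
```

The result can be passed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Check the route handlers under `src/app/api/messages/` for the actual payload before relying on these field names.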

🐛 Troubleshooting

Common Issues

Repository stuck in "Analyzing..." state

  • Check Supabase logs for embedding errors
  • Verify GOOGLE_AI_API_KEY and HF_TOKEN are set correctly
  • Ensure the repository is public, or that GITHUB_TOKEN has access to it

Vector search returns no results

  • Verify match_embeddings function exists in Supabase
  • Check if embeddings were successfully stored
  • Ensure embedding dimensions match (384)

Authentication errors

  • Verify Supabase environment variables are correct
  • Check if Supabase Auth is enabled in your project
  • Ensure email confirmation is either disabled in Supabase Auth or handled by your sign-up flow

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Supabase for the excellent Postgres + Auth platform
  • Google Gemini for powerful AI text generation
  • Hugging Face for embedding models
  • Vercel for Next.js and hosting
  • shadcn/ui for beautiful UI components

Built with ❤️ using Next.js, Supabase, and AI
