PrepBuddy — Interview Evaluation System

PrepBuddy is a full-stack interview preparation platform that combines local NLP models with a cloud LLM (Groq Llama 3.3-70B) to generate role-specific questions and evaluate answers with multi-signal hybrid scoring pipelines. It produces rubric-backed feedback, per-question grades, and session summaries.

🚀 Features

| Feature | Description |
| --- | --- |
| Role-aware Q&A | Generate questions by role, experience level, category, and difficulty |
| Real-time scoring | SSE streaming evaluation with a multi-signal hybrid pipeline |
| Dual pipelines | Technical (claim-based) + Behavioral (STAR rubric) |
| Rich feedback | Per-question scores, missing keywords, claim coverage, LLM feedback |
| Session analytics | Overall grade, strongest/weakest areas, rubric averages |

🏗️ Architecture

graph TD
    A[React Frontend<br/>Chat UI + SSE] -->|HTTP/SSE| B[FastAPI Backend<br/>Python 3.11]
    B --> C[Groq LLM<br/>Llama 3.3-70B]
    B --> D[Local Models<br/>SBERT + DeBERTa + KeyBERT]
    B --> E[Scoring Pipeline<br/>Technical vs Behavioral]
    E --> F[Session Store<br/>In-memory]

Scoring Pipeline Decision Tree:

is_behavioral(category)? 
├── YES → STAR Behavioral Pipeline (60% LLM)
└── NO  → Claim-Based Technical Pipeline (50% Claims)
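
The decision tree above can be sketched in a few lines of Python. This is a minimal illustration, not the backend's code: the category names in `BEHAVIORAL_CATEGORIES` and the pipeline labels returned by `select_pipeline` are hypothetical, and only the `is_behavioral(category)` check itself comes from this README.

```python
# Hypothetical category names; the real list lives in the backend.
BEHAVIORAL_CATEGORIES = {"behavioral", "leadership", "teamwork", "conflict resolution"}


def is_behavioral(category: str) -> bool:
    """Return True when the category should be routed to the STAR pipeline."""
    return category.strip().lower() in BEHAVIORAL_CATEGORIES


def select_pipeline(category: str) -> str:
    """Name the pipeline the decision tree would pick (labels are illustrative)."""
    return "star_behavioral" if is_behavioral(category) else "claim_technical"
```

Under these assumptions, `select_pipeline("Data Structures")` takes the NO branch and `select_pipeline("Behavioral")` takes the YES branch.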

📁 Project Structure

PrepBuddy/
├── backend/                    # FastAPI + NLP scoring
│   ├── app/
│   │   ├── main.py            # FastAPI app entrypoint
│   │   ├── routers/           # API endpoints
│   │   ├── services/          # Business logic
│   │   └── scoring/           # Multi-signal pipelines
│   ├── evaluation/            # Research + grid search
│   └── tests/                 # pytest suite
├── frontend/                  # React 18 chat UI
│   ├── src/
│   │   ├── hooks/useChat.js   # Session state machine
│   │   ├── components/        # ChatWindow, ScoreCard, etc.
│   │   └── utils/api.js       # SSE + API wrapper
└── README.md

🛠️ Tech Stack

| Category | Technologies |
| --- | --- |
| Backend | FastAPI, Pydantic v2, PyTorch, Sentence Transformers, spaCy, Groq SDK |
| Frontend | React 18, Custom Hooks, SSE (EventSource), Lucide React Icons |
| Scoring | SBERT (`all-MiniLM-L6-v2`), DeBERTa NLI, KeyBERT, Llama 3.3-70B/3.1-8B |
| Infra | uvicorn ASGI, Thread-safe ModelRegistry, Auto device detection (MPS/CUDA/CPU) |

Lucide React Icons: Used in `CubeIcon.js` and planned for ScoreCard status indicators. Install via `npm i lucide-react`.
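
The Infra entry mentions a thread-safe ModelRegistry. The README does not show how it is built, so the following is only a sketch of one common approach: a lock-guarded lazy loader that loads each model once even when several requests arrive at the same time. The `get` method and its `loader` argument are hypothetical names.

```python
import threading
from typing import Any, Callable, Dict


class ModelRegistry:
    """Lazily load each model once, even under concurrent requests."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._models: Dict[str, Any] = {}

    def get(self, name: str, loader: Callable[[], Any]) -> Any:
        # Fast path: the model is already loaded, so skip the lock.
        model = self._models.get(name)
        if model is not None:
            return model
        with self._lock:
            # Re-check inside the lock so only one thread runs the loader.
            if name not in self._models:
                self._models[name] = loader()
            return self._models[name]
```

A caller would pass the expensive constructor as `loader`, for example a function that builds the SBERT encoder, so the model is created on first use rather than at import time.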

⚙️ Quick Setup

Clone & Install Backend

cd backend
python -m venv venv
source venv/bin/activate      # Linux/macOS
pip install -r requirements.txt
python -m spacy download en_core_web_sm
cp .env.example .env          # Add GROQ_API_KEY

Install Frontend

cd frontend
npm install

Run Both

# Backend (with hot reload), run from backend/ with the venv active
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# Frontend, run from frontend/ in a second terminal
npm start                    # http://localhost:3000

API Docs: http://localhost:8000/docs

📊 Scoring Pipelines

Technical Pipeline (composite_v2) - 0-100 scale

| Signal | Weight | Measures |
| --- | --- | --- |
| SBERT | 25% | Semantic similarity |
| NLI | 10% | Entailment/contradiction |
| Keywords | 5% | KeyBERT term coverage |
| Claims | 50% | Ideal answer claim coverage |
| LLM Judge | 10% | Correctness, completeness, clarity, depth |
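
The Claims signal carries half of the technical score, so a rough picture of claim coverage may help. The sketch below is an assumption about how such a signal could work, not the project's implementation: it splits the ideal answer into sentence-level claims with a regex and counts a claim as covered when some answer sentence is similar enough. Word-set Jaccard overlap stands in for SBERT cosine similarity so the example runs without any model, which means the 0.62 threshold here is not comparable to a threshold on embeddings.

```python
import re
from typing import List

CLAIM_MATCH_THRESHOLD = 0.62  # value shown in the Configuration section


def split_claims(text: str) -> List[str]:
    """Regex-based splitter: one claim per sentence (assumed behavior)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a: str, b: str) -> float:
    """Stand-in for SBERT cosine similarity: Jaccard overlap of word sets."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def claim_coverage(ideal: str, answer: str,
                   threshold: float = CLAIM_MATCH_THRESHOLD) -> float:
    """Fraction of ideal-answer claims matched by at least one answer sentence."""
    claims = split_claims(ideal)
    sentences = split_claims(answer)
    if not claims:
        return 0.0
    covered = sum(
        1 for claim in claims
        if any(similarity(claim, s) >= threshold for s in sentences)
    )
    return covered / len(claims)
```

With the ideal answer "A stack is LIFO. A queue is FIFO." and the candidate answer "A stack is LIFO.", one of the two claims is matched, so the coverage is 0.5.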

Behavioral Pipeline (STAR)

| Signal | Weight | Measures |
| --- | --- | --- |
| SBERT | 25% | Semantic alignment |
| NLI | 5% | Logical consistency |
| Keywords | 10% | Domain terms |
| STAR LLM | 60% | Situation/Task/Action/Result/Reflection |
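
Both weight tables describe a weighted sum of signals. The sketch below shows that arithmetic under two assumptions that this README does not state: each signal is already normalized to the range 0 to 1, and the behavioral score uses the same 0-100 scale as the technical one. The dictionary keys and the `composite_score` function are illustrative names.

```python
from typing import Dict

# Weights copied from the two tables above; each set sums to 1.0.
TECHNICAL_WEIGHTS = {"sbert": 0.25, "nli": 0.10, "keywords": 0.05,
                     "claims": 0.50, "llm": 0.10}
BEHAVIORAL_WEIGHTS = {"sbert": 0.25, "nli": 0.05, "keywords": 0.10,
                      "star_llm": 0.60}


def composite_score(signals: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of signals (clamped to [0, 1]), scaled to 0-100."""
    missing = set(weights) - set(signals)
    if missing:
        raise KeyError(f"missing signals: {sorted(missing)}")
    total = sum(
        weights[name] * min(max(signals[name], 0.0), 1.0) for name in weights
    )
    return round(100.0 * total, 2)
```

For a technical answer with signals SBERT 0.8, NLI 0.6, Keywords 0.5, Claims 0.7, and LLM 0.9, the sum is 0.20 + 0.06 + 0.025 + 0.35 + 0.09 = 0.725, which gives a score of 72.5.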

🔬 Evaluation Results

Grid Search Optimization (vs human-labeled dataset):

Pearson r = 0.8864 (Excellent correlation)
Optimal weights: SBERT=0.40, NLI=0.10, Keyword=0.30, LLM=0.20
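
For readers unfamiliar with this kind of tuning, here is a minimal sketch of a weight grid search scored by Pearson correlation against human labels. It is not the project's evaluation script: the 0.1 step size, the function names, and the input layout (one dict of signal values per answer) are assumptions made for illustration.

```python
from itertools import product
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Plain Pearson correlation; returns 0.0 when either side is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def grid_search(rows: List[Dict[str, float]], human: List[float],
                names: Sequence[str],
                step: float = 0.1) -> Tuple[Dict[str, float], float]:
    """Try every weight combination on a `step` grid that sums to 1.0."""
    ticks = [round(i * step, 10) for i in range(int(round(1 / step)) + 1)]
    best_weights: Dict[str, float] = {}
    best_r = -2.0  # lower than any possible correlation
    for combo in product(ticks, repeat=len(names)):
        if abs(sum(combo) - 1.0) > 1e-9:
            continue  # keep only weight vectors that sum to 1
        preds = [sum(w * row[n] for w, n in zip(combo, names)) for row in rows]
        r = pearson_r(preds, human)
        if r > best_r:
            best_weights, best_r = dict(zip(names, combo)), r
    return best_weights, best_r
```

With four signals and a 0.1 step the search evaluates at most 11^4 candidate vectors before the sum filter, so exhaustive enumeration stays cheap.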

Metrics Used:

BERTScore (DeBERTa-XL-MNLI)
ROUGE-1/L
Pearson/Spearman correlation

📖 API Reference

| Endpoint | Method | Description |
| --- | --- | --- |
| `/api/generate_questions` | POST | Create session with N diverse questions |
| `/api/evaluate_answer_sse` | POST | Streaming answer evaluation |
| `/api/evaluate_answer` | POST | Blocking answer evaluation |
| `/api/session/{id}` | GET | Session summary + analytics |
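
Because `/api/evaluate_answer_sse` is a POST endpoint, a client generally has to read the stream itself rather than rely on the browser's `EventSource`, which only issues GET requests. The sketch below parses standard Server-Sent Events framing (`data:` lines, blank line between events). It assumes each event carries a JSON payload, which this README does not confirm, and it also yields a final event that lacks a trailing blank line, which is more lenient than the SSE specification.

```python
import json
from typing import Dict, Iterable, Iterator, List


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one decoded JSON payload per SSE event (a blank line ends an event)."""
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # The SSE format allows one optional space after the colon.
            value = line[5:]
            data_parts.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_parts:
            yield json.loads("\n".join(data_parts))
            data_parts = []
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.
    if data_parts:
        # Tolerate a stream that closes without a trailing blank line.
        yield json.loads("\n".join(data_parts))
```

The function accepts any iterable of text lines, so it can be fed from a decoded HTTP response body or from a list of strings in a unit test.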

Example Request:

{
  "role": "Software Engineer",
  "level": "Junior", 
  "category": "Data Structures",
  "difficulty": "Medium",
  "num_questions": 5
}
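
The fields in this payload match the question-generation feature, so the sketch below assumes it is the body for `POST /api/generate_questions`. It only builds the request with the standard library. The sending step is left as a comment because the response shape is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address from Quick Setup


def build_generate_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/generate_questions."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/generate_questions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "role": "Software Engineer",
    "level": "Junior",
    "category": "Data Structures",
    "difficulty": "Medium",
    "num_questions": 5,
}
request = build_generate_request(payload)

# To send it against a running backend:
# with urllib.request.urlopen(request) as response:
#     session = json.load(response)
```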

🧪 Testing

cd backend
pytest tests/ -v              # Backend API + scoring
cd ../frontend
npm test                      # Frontend components

Tests cover schema validation, the claim pipeline, SSE streaming, and session management.

⚖️ Grading Scale

| Score | Grade |
| --- | --- |
| 80-100 | Excellent |
| 60-79 | Good |
| 40-59 | Needs Improvement |
| 0-39 | Significant Gaps |
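
The scale above maps directly to a small lookup. The sketch below uses the lower bound of each band as the cutoff, which is an assumption about how fractional scores such as 79.5 are handled, since the table lists only whole-number ranges. The function name `grade_for` is illustrative.

```python
def grade_for(score: float) -> str:
    """Map a 0-100 score to the grade label from the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 80:
        return "Excellent"
    if score >= 60:
        return "Good"
    if score >= 40:
        return "Needs Improvement"
    return "Significant Gaps"
```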

🔧 Configuration

Required: GROQ_API_KEY in backend/.env

Optional Tuning:

CLAIM_EXTRACTION_MODE=regex        # or 'llm'
SBERT_WEIGHT=0.40
CLAIM_MATCH_THRESHOLD=0.62
DEVICE=mps                         # cpu/cuda/mps (auto-detect)
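
One way to read these settings in Python is sketched below. It assumes that the values in `backend/.env` have already been loaded into the process environment, and that the values shown above double as defaults. Neither point is confirmed by this README, and `ScoringConfig` and `load_config` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ScoringConfig:
    claim_extraction_mode: str = "regex"
    sbert_weight: float = 0.40
    claim_match_threshold: float = 0.62
    device: Optional[str] = None  # None means "let the backend auto-detect"


def load_config(env: Mapping[str, str] = os.environ) -> ScoringConfig:
    """Read the optional tuning variables, validating the enumerated ones."""
    mode = env.get("CLAIM_EXTRACTION_MODE", "regex").lower()
    if mode not in {"regex", "llm"}:
        raise ValueError(f"unsupported CLAIM_EXTRACTION_MODE: {mode}")
    device = env.get("DEVICE")
    if device is not None and device.lower() not in {"cpu", "cuda", "mps"}:
        raise ValueError(f"unsupported DEVICE: {device}")
    return ScoringConfig(
        claim_extraction_mode=mode,
        sbert_weight=float(env.get("SBERT_WEIGHT", "0.40")),
        claim_match_threshold=float(env.get("CLAIM_MATCH_THRESHOLD", "0.62")),
        device=device.lower() if device else None,
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests.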

📈 Future Work

Persistent session storage (PostgreSQL/Redis)
Multi-language support
Custom rubrics per company/role
Voice input + transcription
Leaderboards + sharing
