DiffInsight

DiffInsight is a developer tool that transforms Git diffs into clear, structured, and risk-assessed code review reports. Designed for developers, team leads, and code reviewers, it highlights what changed, why it matters, and what risks it carries — powered by a local LLM and presented in a clean, dark-themed web interface.

Upload a .diff, .patch, or .txt file (or paste your diff directly), and DiffInsight generates actionable insights, a 5-axis risk radar, and a team-aware change breakdown — all running fully locally, no cloud required.

💡 Key Features

LLM-powered code review reports via Ollama (deepseek-coder:6.7b) — runs entirely on your machine.
Two review modes: Senior Reviewer (concise, critical) and Junior Mentor (explanatory, educational).
5-axis Risk Radar — pattern-based scoring across Security, Performance, Complexity, Stability, and Testing.
Change Intelligence panel — team-aware diff breakdown that works on any language:
- Which architectural layers were touched (Backend, LLM/AI, Security, Frontend, Tests, Config, Database…)
- Per-file change classification: NEW, MODIFIED, REFACTORED, EXPANDED, DELETED
- Merge conflict candidate detection with High/Medium/Low risk per file
- Churn bar visualisation showing relative size of each change
- File type (extension) breakdown
Tech Assistant — ask any technical question, topic auto-detected, answered by the local LLM.
GitHub Explorer — search repositories by topic, filter by language, sort by stars/forks/issues/watchers/updated.
Secrets via HashiCorp Vault — GitHub token stored and retrieved securely; falls back to GITHUB_TOKEN env var.
Rate limiting — 10 requests per 60 seconds per IP.
File upload + paste — upload .diff/.patch/.txt (max 5MB) or paste a diff directly into the UI.
Health indicators — live Ollama and Vault status dots in the sidebar.
Supports standard git diff, diff -ruN, and most unified diff variants.

🌟 Why DiffInsight Matters

Accelerates code reviews — identify critical issues without manually scanning every line.
Reduces merge risk — conflict candidates are flagged before you merge.
Team-aware — when multiple people share a repo, Change Intelligence shows exactly which layers and files each diff touches, making coordination easier.
Educates junior developers — Junior Mentor mode explains changes with context and best-practice guidance.
Fully local — your code never leaves your machine. LLM inference runs via Ollama, secrets via Vault.

🎯 Target Audience

Software engineers wanting faster, more consistent code reviews.
Team leads seeking risk-aware insights before approving merges.
Junior developers learning best practices through guided diff explanations.
Teams sharing a dev machine or repo who need to coordinate changes without stepping on each other.
Open-source contributors reviewing PRs or comparing branches.

🗂️ Project Structure

diffinsight/
├── backend/
│   ├── main.py                        # FastAPI app, endpoints, radar scoring
│   ├── llm/
│   │   ├── analyzer.py                # LLM diff analysis (reviewer/junior modes)
│   │   └── tech_assistant.py          # Tech Q&A with topic detection
│   ├── security/
│   │   └── secret_manager.py          # HashiCorp Vault + env var fallback
│   ├── services/
│   │   └── github_service.py          # GitHub search (sort, filter, paginate)
│   └── utils/
│       ├── change_intelligence.py     # Team-aware diff breakdown (NEW)
│       └── risk.py                    # Risk level computation
├── frontend/
│   ├── templates/
│   │   └── index.html                 # Main UI
│   └── static/
│       ├── script.js                  # All frontend logic + Change Intelligence renderer
│       └── style.css                  # Dark theme styles
├── dev.ps1                            # PowerShell dev runner
└── requirements.txt

🛠️ Installation & Setup

Prerequisites

Python 3.10+
Ollama installed and running
HashiCorp Vault (optional — for GitHub Explorer)
A GitHub personal access token (for GitHub Explorer)

1. Clone the repository

git clone https://github.com/ShreyaVijaykumar/Diff-Insight.git
cd Diff-Insight

2. Install dependencies

pip install -r requirements.txt

3. Pull the LLM model

ollama pull deepseek-coder:6.7b

4. Start Ollama

ollama serve

5. Set up HashiCorp Vault (for GitHub Explorer)

Open a PowerShell terminal and start Vault in dev mode:

vault server -dev

Copy the Root Token printed in the terminal (starts with hvs.). Then open a second terminal and run:

# Set Vault address
$env:VAULT_ADDR="http://127.0.0.1:8200"

# Set your root token
$env:VAULT_TOKEN="hvs.<YOUR_ROOT_TOKEN>"

# Store your GitHub personal access token
vault kv put secret/github token=<YOUR_GITHUB_TOKEN>

To verify everything is set correctly:

echo $env:VAULT_ADDR
echo $env:VAULT_TOKEN
vault kv get secret/github

No Vault? You can skip this and set GITHUB_TOKEN=<your_token> as a regular environment variable instead. GitHub Explorer will fall back to it automatically.
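The Vault-then-environment fallback described above can be sketched in a few lines. This is a minimal illustration, not the project's `secret_manager.py`: the function name `get_github_token` and the `vault_reader` callable are hypothetical, and the reader stands in for whatever Vault client the backend uses.

```python
import os
from typing import Callable, Optional


def get_github_token(
    vault_reader: Optional[Callable[[], Optional[str]]] = None,
) -> Optional[str]:
    """Return the token from Vault when possible, else from GITHUB_TOKEN.

    `vault_reader` is a hypothetical callable that returns the stored token
    (or raises / returns None when Vault is unavailable).
    """
    if vault_reader is not None:
        try:
            token = vault_reader()
            if token and token.strip():
                # Strip stray whitespace so the value is safe in an HTTP header.
                return token.strip()
        except Exception:
            # Vault unreachable or secret missing: fall through to the env var.
            pass
    env_token = os.environ.get("GITHUB_TOKEN", "").strip()
    return env_token or None
```

Passing the reader in as a callable keeps the fallback logic testable without a running Vault server.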

6. Start DiffInsight

From the project root (the directory created by the clone):

uvicorn backend.main:app --reload

Or use the PowerShell dev runner:

powershell -ExecutionPolicy Bypass -File dev.ps1

7. Open in browser

http://127.0.0.1:8000/

📄 How to Generate a Git Diff File

Common commands

# Unstaged changes
git diff

# Staged changes
git diff --staged

# All changes (staged and unstaged) since the last commit
git diff HEAD

# Compare two commits
git diff <commit-id-1> <commit-id-2> > my_diff.txt

# Compare two branches
git diff main feature-branch > branch_diff.txt

# Compare a specific file
git diff <file-path> > file_diff.txt

# Compare tags
git diff v1.0 v1.1 > tag_diff.txt

Save the output with > to create a .txt or .diff file, then upload it to DiffInsight — or paste the output directly using the Paste Diff toggle.

Understanding diff output

Symbol	Meaning
`--- a/file.txt`	Original file
`+++ b/file.txt`	Updated file
`@@ -m,n +o,p @@`	Hunk header (line numbers)
`-`	Line removed
`+`	Line added
(leading space)	Unchanged context
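To make the hunk header concrete, here is a small parser for the `@@ -m,n +o,p @@` line. It is an illustrative sketch; the names `Hunk` and `parse_hunk_header` are hypothetical and not taken from the DiffInsight code.

```python
import re
from typing import NamedTuple, Optional

# In a unified diff the counts are optional; an omitted count means 1 line.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


class Hunk(NamedTuple):
    old_start: int   # m: first line of the hunk in the original file
    old_count: int   # n: number of original lines covered
    new_start: int   # o: first line of the hunk in the updated file
    new_count: int   # p: number of updated lines covered


def parse_hunk_header(line: str) -> Optional[Hunk]:
    """Return the four numbers of a hunk header, or None for any other line."""
    match = HUNK_RE.match(line)
    if match is None:
        return None
    old_start, old_count, new_start, new_count = match.groups()
    return Hunk(
        int(old_start),
        int(old_count or 1),
        int(new_start),
        int(new_count or 1),
    )
```

For example, `parse_hunk_header("@@ -10,7 +10,8 @@ def login():")` yields `Hunk(old_start=10, old_count=7, new_start=10, new_count=8)`.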

For a graphical comparison:

git difftool

🧑‍💻 How It Works

Upload or paste a .diff, .patch, or .txt file.
DiffInsight normalises the diff (handles git diff, diff -ruN, and similar formats).
Risk Radar scores the diff across 5 axes using regex-based pattern detection — no LLM needed for this step, so it's instant.
Change Intelligence parses every file in the diff and classifies it by layer, change type, and merge conflict risk — also instant, works on any language.
LLM report is generated by Ollama using the selected mode (Senior Reviewer or Junior Mentor).
Everything is displayed in the single-page dashboard — no page reload needed.
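The LLM step in the flow above can be sketched against Ollama's local HTTP API. This is a rough illustration under stated assumptions: the prompt wording and the names `MODE_INSTRUCTIONS`, `build_payload`, and `generate_report` are hypothetical, and the project's real prompts in `analyzer.py` are not shown in this README.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address

# Hypothetical per-mode instructions, for illustration only.
MODE_INSTRUCTIONS = {
    "reviewer": "You are a senior reviewer. Be concise and critical.",
    "junior": "You are a mentor. Explain each change step by step.",
}


def build_payload(diff_text: str, mode: str = "reviewer") -> dict:
    """Build the JSON body for a non-streaming generate request."""
    instructions = MODE_INSTRUCTIONS[mode]
    return {
        "model": "deepseek-coder:6.7b",
        "prompt": f"{instructions}\n\nReview this diff:\n{diff_text}",
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def generate_report(diff_text: str, mode: str = "reviewer", timeout: float = 120.0) -> str:
    """Send the diff to the local Ollama server and return the generated text."""
    body = json.dumps(build_payload(diff_text, mode)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

Calling `generate_report(diff_text, mode="junior")` requires `ollama serve` to be running with the model already pulled.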

🖥️ Features In Detail

Diff Analyzer

Toggle between file upload and paste input.
Select Senior Reviewer (concise, critical) or Junior Mentor (educational, step-by-step).
Stats bar shows files changed, lines added/removed, functions modified, and overall risk level.
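As a rough idea of how the file and line counts in the stats bar could be derived, the sketch below walks a unified diff and uses each hunk header's line counts to tell real additions and removals apart from file headers. The function name `diff_stats` is hypothetical, and it covers only three of the stats (files changed, lines added, lines removed).

```python
import re
from typing import Dict

# Capture only the two line counts; an omitted count means 1 line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def diff_stats(diff_text: str) -> Dict[str, int]:
    """Count changed files and added/removed lines in a unified diff."""
    stats = {"files": 0, "added": 0, "removed": 0}
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk: the first character decides the line's role.
            if line.startswith("+"):
                stats["added"] += 1
                new_left -= 1
            elif line.startswith("-"):
                stats["removed"] += 1
                old_left -= 1
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                old_left -= 1  # context line counts on both sides
                new_left -= 1
            continue
        match = HUNK_RE.match(line)
        if match:
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
        elif line.startswith("+++ "):
            stats["files"] += 1  # one "+++" header per changed file
    return stats
```

Tracking the remaining hunk length means a removed line whose content starts with `--` is not mistaken for a `---` file header.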

Risk Radar

Pattern-based scores (0–10) across five dimensions:

Axis	What it detects
Security	Hardcoded secrets, auth/crypto keywords, sensitive patterns
Performance	N+1 query patterns, loops with DB calls, missing cache/async
Complexity	Branch depth, nesting, lambda/comprehension density, net line growth
Stability	Config/migration file changes, API surface churn, deletion ratio
Testing	Assert/mock/test function presence, untested addition penalty
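Pattern-based scoring of this kind can be illustrated with a handful of regular expressions applied to added lines. The patterns, the three axes shown, and the three-points-per-hit weighting below are all assumptions made for the example; the project's actual rules in `main.py` are not documented here.

```python
import re
from typing import Dict, List

# Hypothetical patterns per axis, for illustration only.
AXIS_PATTERNS: Dict[str, List[str]] = {
    "security": [
        r"(?i)\b(password|secret|api_key|token)\b\s*=\s*['\"]",  # hardcoded secret
        r"\b(eval|exec)\s*\(",                                   # dynamic code execution
    ],
    "performance": [
        r"\.execute\(",         # direct DB call
        r"(?i)select\s+\*",     # unbounded column selection
    ],
    "complexity": [
        r"^\s*(if|elif|for|while|try|except)\b",  # new branch or loop
    ],
}


def score_axes(diff_text: str, points_per_hit: int = 3) -> Dict[str, int]:
    """Score each axis from 0 to 10 based on pattern hits in added lines."""
    added = [
        line[1:]
        for line in diff_text.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]
    scores = {}
    for axis, patterns in AXIS_PATTERNS.items():
        hits = sum(
            1 for line in added for pattern in patterns if re.search(pattern, line)
        )
        scores[axis] = min(10, hits * points_per_hit)  # clamp to the 0-10 scale
    return scores
```

Because nothing here calls the LLM, scoring is a single pass over the diff text.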

Change Intelligence

Replaces the previous dependency graph with a team-friendly breakdown that works on any language:

Summary bar — one-line description: how many files, which layers, additive/refactor/mixed/destructive.
Layers Touched — which parts of the codebase were affected (LLM/AI, Security, Backend, Frontend JS/CSS, Frontend HTML, Tests, Config/Infra, Database, Docs…).
Merge Conflict Candidates — files flagged High or Medium risk based on deletion ratio and churn volume.
File Breakdown — every changed file with change type badge, +/- counts, conflict risk, and a proportional churn bar.
File Types — extension summary for a quick "was this a backend-only or full-stack change?" read.
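The layer and change-type labels can be approximated from nothing more than a file path and its +/- counts, which is why the approach is language-agnostic. The sketch below is an assumption-laden illustration: the path hints and the 2:1 thresholds are hypothetical, picked so that they reproduce the labels in the Preview example, and conflict-risk scoring is not attempted.

```python
from typing import List, Tuple

# Hypothetical path hints; the first match wins, so specific hints come first.
LAYER_HINTS: List[Tuple[str, str]] = [
    ("tests/", "Tests"),
    ("test_", "Tests"),
    ("llm", "LLM / AI"),
    ("security", "Security"),
    ("frontend", "Frontend"),
    ("backend", "Backend"),
]


def classify_layer(path: str) -> str:
    """Map a file path to an architectural layer using substring hints."""
    lowered = path.lower()
    for hint, layer in LAYER_HINTS:
        if hint in lowered:
            return layer
    return "Other"


def classify_change(
    added: int, removed: int, is_new: bool = False, is_deleted: bool = False
) -> str:
    """Label a file change from its added/removed line counts."""
    if is_new:
        return "NEW"
    if is_deleted:
        return "DELETED"
    if added >= 2 * removed:
        return "EXPANDED"    # mostly additions
    if removed >= 2 * added:
        return "REFACTORED"  # mostly removals
    return "MODIFIED"        # balanced churn
```

For instance, `classify_layer("backend/llm/analyzer.py")` returns `"LLM / AI"` because the `llm` hint is checked before `backend`.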

Tech Assistant

Ask any technical question in plain English. The assistant auto-detects the topic (40+ keywords including Python, FastAPI, Docker, PostgreSQL, Redis, Terraform, AWS, PyTorch, RAG, and more) and answers with a structured explanation, real-world example, industry use, and common misconception.
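Keyword-based topic detection of this sort can be sketched as a whole-word lookup. The keyword table below contains only the topics named above, and the function name `detect_topic` and the `"General"` default are hypothetical rather than taken from `tech_assistant.py`.

```python
import re

# A small subset of keywords; the real list is described as 40+ entries.
TOPIC_KEYWORDS = {
    "python": "Python",
    "fastapi": "FastAPI",
    "docker": "Docker",
    "postgresql": "PostgreSQL",
    "redis": "Redis",
    "terraform": "Terraform",
    "aws": "AWS",
    "pytorch": "PyTorch",
    "rag": "RAG",
}


def detect_topic(question: str, default: str = "General") -> str:
    """Return the first known topic mentioned as a whole word in the question."""
    words = re.findall(r"[a-z0-9+#]+", question.lower())
    for word in words:
        if word in TOPIC_KEYWORDS:
            return TOPIC_KEYWORDS[word]
    return default
```

Matching whole words rather than substrings avoids false positives such as finding "rag" inside "storage".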

GitHub Explorer

Search GitHub repositories by topic, filter by language, and sort by:

⭐ Most Stars
🍴 Most Forks
🕒 Recently Updated
🐛 Most Issues
👁️ Most Watchers

Results show name, description, all 5 metrics, last updated date, and a direct link.
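A search like this maps onto GitHub's REST repository search endpoint. The sketch below builds the request URL and is an illustration only: the function names are hypothetical, and since the endpoint documents `stars`, `forks`, `help-wanted-issues`, and `updated` as sort values, it assumes that ordering by issues or watchers happens client-side on the returned items.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode

API_SORTS = {"stars", "forks", "updated"}  # sort values passed to the API
# Assumption: these two orderings are applied locally to the result items.
CLIENT_SORT_FIELDS = {"issues": "open_issues_count", "watchers": "watchers_count"}


def build_search_url(
    topic: str,
    language: Optional[str] = None,
    sort: str = "stars",
    page: int = 1,
    per_page: int = 10,
) -> str:
    """Build a repository search URL using topic and language qualifiers."""
    query = f"topic:{topic}"
    if language:
        query += f" language:{language}"
    params = {"q": query, "order": "desc", "page": page, "per_page": per_page}
    if sort in API_SORTS:
        params["sort"] = sort
    return "https://api.github.com/search/repositories?" + urlencode(params)


def sort_items(items: List[Dict], sort: str) -> List[Dict]:
    """Re-order results locally for sorts the API does not provide."""
    field = CLIENT_SORT_FIELDS.get(sort)
    if field is None:
        return list(items)
    return sorted(items, key=lambda repo: repo.get(field, 0), reverse=True)
```

The token retrieved from Vault or `GITHUB_TOKEN` would be sent in the `Authorization` header of the request.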

👀 Preview

DIFFINSIGHT REPORT
------------------
Risk Level : HIGH

TITLE: Refactor login flow
CHANGE_SUMMARY: Simplified authentication logic and fixed edge cases
MODIFIED_FILES: auth.py, login.py
WHAT_CHANGED: Updated login flow, added error handling
WHY_CHANGED: Improve security and readability
RISK_LEVEL: HIGH
IMPACT: High risk on authentication
REVIEWER_NOTES: Ensure unit tests are added for all new auth paths

Change Intelligence panel example output:

🔀 Mixed  |  4 files changed across 3 layers (Backend, LLM / AI, Tests) — +87 / -32 lines

Layers Touched:  ⚙️ Backend (2)   🤖 LLM / AI (1)   🧪 Tests (1)

⚠️ Merge Conflict Candidates
  backend/utils/risk.py       High risk     +2 / -9
  backend/main.py             Medium risk   +15 / -6

File Breakdown:
  REFACTORED  backend/utils/risk.py        ⚙️ Backend     +2   -9   High conflict risk
  EXPANDED    backend/main.py              ⚙️ Backend    +15   -6   Medium conflict risk
  NEW         backend/llm/analyzer.py      🤖 LLM / AI   +58   -0   Low conflict risk
  MODIFIED    tests/test_risk.py           🧪 Tests      +12  -17   Low conflict risk

📈 Impact

Reduces time spent manually reviewing diffs.
Flags merge conflict candidates before they cause problems.
Gives team members visibility into which layers a change touches.
Educates junior developers through structured, mode-aware explanations.
Keeps all diff analysis local: no diff content leaves your machine.

🔒 Security Notes

GitHub tokens are stored in HashiCorp Vault (KV v2), never in code or .env files.
Vault token is stripped of whitespace on read to prevent header injection bugs.
If Vault is unavailable, the app falls back to the GITHUB_TOKEN environment variable.
The LLM runs locally via Ollama — no diff content is sent to external APIs.
Rate limiting (10 req/60s per IP) is applied to all endpoints.
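A per-IP limit of 10 requests per 60 seconds can be sketched as a sliding window over recent request timestamps. This README does not say whether the project uses a fixed or sliding window, so treat the class below (including the name `SlidingWindowLimiter`) as one possible approach rather than the actual implementation.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client IP."""

    def __init__(
        self,
        max_requests: int = 10,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        """Record the request and return True if it is within the limit."""
        now = self._clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

In a FastAPI app, a check like `limiter.allow(request.client.host)` could run in middleware and return HTTP 429 when it fails.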
