GitHub - uncase-ai/UNCASE: Open-source framework for turning expert knowledge into PII-free synthetic conversational data and production-ready LoRA adapters.

Unbiased Neutral Convention for Agnostic Seed Engineering
Privacy-first synthetic data pipeline for fine-tuning LLMs in regulated industries.
Blockchain-anchored quality verification on Polygon PoS.

This project is proudly

Why UNCASE?

Organizations in healthcare, finance, legal, automotive, manufacturing, and education need specialized LLMs — but fine-tuning requires real conversations full of PII, PHI, and legally privileged data.

UNCASE solves this. It generates high-quality synthetic training data from structured seeds (conversation blueprints that contain zero real data), evaluates output against 9 hard-gated quality metrics, and anchors every evaluation on Polygon PoS so auditors can independently verify results without trusting anyone.

Real Conversations → PII Scan → Parse & Validate → 9-Metric Quality Gate
  → Synthetic Generation → Re-evaluate → Blockchain Anchor → LoRA/QLoRA Fine-tune

One pipeline. Zero PII. On-chain proof.

Quick Start

Docker (recommended)

git clone https://github.com/uncase-ai/uncase.git && cd uncase
cp .env.example .env
# Edit .env → set at least one LLM key (ANTHROPIC_API_KEY, GEMINI_API_KEY, etc.)

docker-compose up -d          # API + PostgreSQL + Redis + Dashboard
docker-compose ps             # Verify all containers are running

API: http://localhost:8000 (Swagger at /docs)
Dashboard: http://localhost:3000

Optional profiles:

docker-compose --profile ml up -d              # + MLflow (port 5000)
docker-compose --profile gpu up -d             # + NVIDIA GPU support (port 8001)
docker-compose --profile observability up -d   # + Prometheus (9090) + Grafana (3001)

Local development (uv)

Requires Python 3.11+ and PostgreSQL 16+.

git clone https://github.com/uncase-ai/uncase.git && cd uncase
uv sync --extra dev
cp .env.example .env          # Set DATABASE_URL + at least one LLM key
make migrate                  # Run database migrations
make api                      # Start API on port 8000

# Optional: start the dashboard
cd frontend && npm install && npm run dev

pip (library only)

pip install uncase                       # Core
pip install "uncase[ml]"                 # + transformers, peft, trl, torch
pip install "uncase[privacy]"            # + SpaCy, Presidio NER
pip install "uncase[blockchain]"         # + web3, Polygon PoS anchoring
pip install "uncase[all]"               # Everything

Key Features

	Feature	What it does
Privacy	Privacy Interceptor + PromptShield	Zero PII tolerance — Presidio NER, SpaCy, 9 regex patterns, adversarial scan
Quality	9 hard-gated metrics	ROUGE-L, Factual Fidelity, TTR, Coherence, Semantic Fidelity, Embedding Drift, Tool Validity, Privacy, Memorization
Verification	Blockchain anchoring	Every evaluation SHA-256 hashed → Merkle tree → Polygon PoS. Verifiable via Polygonscan
Generation	LLM Gateway	Route to Claude, GPT-4, Gemini, Groq, Ollama, vLLM — privacy-intercepted
Import	Connector Hub	WhatsApp exports, webhooks, HuggingFace, CRM, custom connectors
Export	10+ formats	ChatML, Llama, Mistral, Qwen, Nemotron, Alpaca, ShareGPT — all with tool-use
Tools	30 domain tools	5 per industry across 6 verticals, with simulation and training data
Sandbox	E2B cloud	Parallel generation in isolated MicroVMs, instant demo containers
SDK	Python SDK	Sync + async programmatic access via httpx
MCP	MCP server	Expose tools to Claude Code and any MCP-compatible agent
Plugins	Marketplace	6 official plugins (one per industry), extensible architecture
Enterprise	Auth + observability	JWT, audit logging, cost tracking, rate limiting, Prometheus + Grafana

Supported Industries

Domain	Namespace	Tools	Plugin
Automotive Sales	`automotive.sales`	5	`automotive`
Medical Consultation	`medical.consultation`	5	`medical`
Legal Advisory	`legal.advisory`	5	`legal`
Financial Advisory	`finance.advisory`	5	`finance`
Industrial Support	`industrial.support`	5	`industrial`
Education Tutoring	`education.tutoring`	5	`education`

Each includes specialized seed templates, quality thresholds, compliance rules, and built-in tools.

Architecture (SCSF)

Layer	Purpose
0 — Privacy	PII detection + PromptShield adversarial scan (audit/warn/block)
1 — Parser	Multi-format parsing (CSV, JSONL), auto-detect OpenAI/ShareGPT/UNCASE formats
2 — Evaluator	9 quality metrics, composite scoring, blockchain anchoring
3 — Generator	LLM-powered parallel generation with tool-augmented conversations
4 — LoRA Pipeline	LoRA/QLoRA fine-tuning with DP-SGD (epsilon ≤ 8.0)

106+ API endpoints across 25 routers. 1,590+ tests at 73% coverage.

See docs/architecture.md for the full system diagram and database schema.

Documentation

Document	Description
Architecture	System diagram, SCSF layers, database schema
API Reference	Complete endpoint map with examples
Features	Detailed feature docs with usage examples
Configuration	Environment variables, Docker services, pip extras
Development	Setup, testing, CLI, contributing guide

Contributing

git checkout -b feat/my-feature
make check                            # lint + typecheck + tests
uv run pytest tests/privacy/          # mandatory before PR

See docs/development.md for the full guide.

License

BSD 3-Clause — Free for commercial and non-commercial use.

UNCASE — Because the best training data is data that never existed.

Name		Name	Last commit message	Last commit date
Latest commit History 201 Commits
.githooks		.githooks
.github/assets		.github/assets
alembic		alembic
contracts		contracts
docs		docs
examples/automotive_sales		examples/automotive_sales
frontend		frontend
monitoring		monitoring
scripts		scripts
tests		tests
uncase		uncase
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
Dockerfile		Dockerfile
Makefile		Makefile
README.md		README.md
alembic.ini		alembic.ini
deploy.sh		deploy.sh
docker-compose.yml		docker-compose.yml
entrypoint.sh		entrypoint.sh
pyproject.toml		pyproject.toml
railway.toml		railway.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Why UNCASE?

Quick Start

Docker (recommended)

Local development (uv)

pip (library only)

Key Features

Supported Industries

Architecture (SCSF)

Documentation

Contributing

License

About

Uh oh!

Releases

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Why UNCASE?

Quick Start

Docker (recommended)

Local development (uv)

pip (library only)

Key Features

Supported Industries

Architecture (SCSF)

Documentation

Contributing

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Contributors

Uh oh!

Languages