Add ChromaDB as a persistent Docker service #173

Merged
mustyoshi merged 3 commits into master from claude/interest-based-code-linking-kM66I
Feb 27, 2026

@mustyoshi
Collaborator

  • Add congress_chromadb service (chromadb/chroma:0.6.3) to dev compose
    with a named volume (chromadb-volume) for data persistence
  • Add healthcheck against /api/v1/heartbeat
  • Add restart: unless-stopped in prod compose overlay
  • Wire CHROMA_HOST=congress_chromadb into congress_parser_fastapi so it
    finds ChromaDB via the Docker network instead of a bare IP
  • Add CHROMA_HOST env var support in uscode.py handler, falling back to
    LLM_HOST then 10.0.0.120 to preserve existing prod behaviour
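
The host fallback described in the last bullet can be sketched as follows. This is a minimal illustration, not the code in uscode.py: the function name `resolve_chroma_host` and the choice to treat empty strings as unset are assumptions.

```python
import os

# Legacy address named in the PR description as the final fallback.
DEFAULT_CHROMA_HOST = "10.0.0.120"


def resolve_chroma_host(env=None):
    """Return CHROMA_HOST, else LLM_HOST, else the legacy default.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    # Assumption of this sketch: an empty value counts as "not set".
    return env.get("CHROMA_HOST") or env.get("LLM_HOST") or DEFAULT_CHROMA_HOST
```

With `CHROMA_HOST=congress_chromadb` set by the compose file, the service name wins; an environment that only sets `LLM_HOST`, or neither variable, keeps resolving as before.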

https://claude.ai/code/session_011LABnV4F5UKzgwKhWj5ND6

Adds a system for users to describe their policy interests in natural
language and automatically maps them to relevant USC sections via
ChromaDB semantic search. Bills that amend those sections are then
surfaced throughout the UI.

Backend (Python/FastAPI):
- Add UserInterest and UserInterestUscContent SQLAlchemy models
  (sensitive schema) with Alembic migration
- New interest.py handler: save interest text, run ChromaDB search
  (search_chroma, n=50), upsert auto-matched sections, toggle/add
  sections manually, query legislation via USCContentDiff join chain
- Add interest routes to user.py router (GET/POST /user/interest,
  PATCH/POST /user/interest/section, GET /user/interest/legislation)
- search_chroma now includes usc_ident in each result dict
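
One plausible reading of "upsert auto-matched sections" alongside manual toggling is sketched below in plain Python, without SQLAlchemy. Everything here is hypothetical: the `SectionLink` fields, the `"auto"` label, and the rule that an existing row is left untouched on re-save are assumptions for illustration, not the behaviour of interest.py.

```python
from dataclasses import dataclass


@dataclass
class SectionLink:
    usc_ident: str
    match_source: str      # hypothetical values: "auto" or "manual"
    is_active: bool = True


def upsert_auto_matches(existing, matched_idents):
    """Merge fresh search hits into the links keyed by usc_ident.

    New identifiers are inserted as active auto-matches. Rows that
    already exist are kept as-is, so a section the user switched off
    stays off when the interest text is saved again (an assumption).
    """
    merged = dict(existing)
    for ident in matched_idents:
        if ident not in merged:
            merged[ident] = SectionLink(ident, "auto")
    return merged


def toggle_section(links, usc_ident):
    """Flip the active flag on one linked section and return it."""
    link = links[usc_ident]
    link.is_active = not link.is_active
    return link
```

Keying on `usc_ident` is what makes the new field in each `search_chroma` result useful: the search hit can be matched directly against stored rows.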

Frontend (hillstack Next.js / tRPC):
- Add user_interest and user_interest_usc_content Prisma models
- Five new tRPC procedures on userRouter: interestGet, interestSave
  (calls FastAPI /uscode/search for ChromaDB then stores via Prisma),
  interestToggleSection, interestAddSection, interestLegislation
  ($queryRawUnsafe multi-join), interestBillMatch
- Dashboard widget: InterestFeed replaces USC Tracking placeholder,
  shows up to 8 bills touching matched sections; prompts login/setup
- Bill layout: InterestBadge client component shows green chip when
  the bill touches any of the user's active interest sections
- New page /user/interests: textarea + save button, grouped section
  list with checkbox-toggle and manual-add support
- Add FASTAPI_URL env var to congress_hillstack Docker service

https://claude.ai/code/session_011LABnV4F5UKzgwKhWj5ND6

Creates backend/congress_parser/importers/chroma_uscode.py, a standalone
async script that reads top-level US Code sections from PostgreSQL and
upserts them into the ChromaDB 'uscode' collection, enabling the
interest-based semantic search feature to find relevant sections.

Features:
- Auto-detects the latest USC release version_id from usc_release table
  (or accepts --version-id for an explicit override)
- Filters to top-level section identifiers (/us/usc/t{n}/s{identifier})
  matching the resolution path used in search_chroma()
- Builds rich document text: title name + section heading + content_str
  (truncated to 8 000 chars) for high-quality embeddings
- Stores metadata: title number, section number, display label, heading
- Idempotent: uses collection.upsert() so safe to re-run
- --reset flag to wipe and rebuild the collection from scratch
- --dry-run flag to count eligible sections without writing
- --batch-size to tune throughput (default 200)
- Creates the 'congress-dev' tenant and 'usc-chat' database via the
  ChromaDB REST API if they don't already exist
- Graceful error messages when ChromaDB is unreachable or DB has no data
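
The filtering, document-building, and batching steps listed above can be sketched without a database or a ChromaDB connection. The regular expression, the helper names, and the newline joiner are assumptions; the sketch also assumes the 8,000-character cap applies to content_str rather than to the whole document.

```python
import re
from itertools import islice

# Hypothetical pattern for /us/usc/t{n}/s{identifier}: exactly one title
# segment and one section segment, with nothing nested below the section.
TOP_LEVEL_SECTION = re.compile(r"^/us/usc/t[^/]+/s[^/]+$")
MAX_CONTENT_CHARS = 8000


def is_top_level_section(usc_ident):
    """True for section-level identifiers, False for titles or subsections."""
    return TOP_LEVEL_SECTION.match(usc_ident) is not None


def build_document(title_name, heading, content_str):
    """Join title name, heading, and truncated content into one text."""
    parts = [title_name, heading, (content_str or "")[:MAX_CONTENT_CHARS]]
    return "\n".join(p for p in parts if p)


def chunked(items, size=200):
    """Yield lists of at most `size` items, matching the default batch size."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk
```

Each chunk would then be handed to a single `collection.upsert()` call, which is what keeps re-runs idempotent: the same section identifier overwrites its earlier entry instead of duplicating it.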

Usage:
    python3 -m congress_parser.importers.chroma_uscode
    python3 -m congress_parser.importers.chroma_uscode --reset
    python3 -m congress_parser.importers.chroma_uscode --dry-run
    python3 -m congress_parser.importers.chroma_uscode --version-id 74573
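
For reference, the flags used above could be declared with argparse roughly like this. The flag names and the default batch size come from the commit message; the help strings, the `build_parser` name, and the integer type for --version-id are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the importer's documented command-line flags."""
    parser = argparse.ArgumentParser(
        prog="chroma_uscode",
        description="Load top-level US Code sections into ChromaDB.",
    )
    parser.add_argument("--version-id", type=int, default=None,
                        help="explicit USC release version_id (default: latest)")
    parser.add_argument("--reset", action="store_true",
                        help="wipe and rebuild the collection")
    parser.add_argument("--dry-run", action="store_true",
                        help="count eligible sections without writing")
    parser.add_argument("--batch-size", type=int, default=200,
                        help="number of sections per upsert call")
    return parser
```

argparse maps `--dry-run` to `args.dry_run` and `--version-id` to `args.version_id`, so the calling code reads the options as ordinary attributes.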

https://claude.ai/code/session_011LABnV4F5UKzgwKhWj5ND6
@mustyoshi mustyoshi merged commit 45d099e into master Feb 27, 2026
1 check passed
@mustyoshi mustyoshi deleted the claude/interest-based-code-linking-kM66I branch February 27, 2026 01:48