An open-source AI control plane developed by UW SSEC under NSF NAIRR award #240292 and the Schmidt Sciences Virtual Institutes for Scientific Software (VISS) program.
LLMaven provides open, transparent, and useful AI-based software for scientific discovery: AI infrastructure that can be installed on cloud and HPC systems to give researchers access to large language models, observability features, and an agentic framework for AI-assisted coding in Research Software Engineering.
LLMaven combines a CLI and Pulumi-based Infrastructure as Code configuration into a single workflow. A local Docker stack mirrors the cloud architecture: the same databases, object storage, AI gateway, and experiment tracking services run locally. The AI harness that defines coding subagents and skills is currently developed as Claude Code RSE-Plugins on top of this infrastructure.
Key Components
The architecture has three layers:
Layer 1 — Inference Engine: The inference engine is provided by each cloud provider or HPC system: Azure via Microsoft Foundry Models, AWS via Amazon Bedrock, and GCP via Vertex AI. For local inference on HPC GPU nodes, users can run vLLM. This is the compute layer.
Layer 2 — API Gateway (LiteLLM + MLflow): A lightweight proxy that provides unified access to the models exposed by the inference engine. It offers a single OpenAI-compatible endpoint for all researchers, handling authentication, rate limiting (RPM/TPM), per-user and per-team budgets, spend tracking, PII masking via Microsoft Presidio, and request logging to both PostgreSQL and MLflow. MLflow adds observability and evaluation of AI agents. This is the main control-plane layer that LLMaven provides.
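Because the gateway is OpenAI-compatible, any OpenAI-style client can talk to it. A minimal stdlib-only sketch — the gateway URL matches the local stack's default port, but the model alias ("gpt-4o") and the API key are illustrative placeholders; use whatever your LiteLLM config and docker/.env actually define:

```python
import json
from urllib import request

# LiteLLM's default local port in this stack; adjust for your deployment.
GATEWAY_URL = "http://localhost:4000/v1/chat/completions"


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def send_chat_request(api_key: str, payload: dict) -> dict:
    """POST the payload to the gateway (requires the stack to be running)."""
    req = request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


payload = build_chat_request("gpt-4o", "Summarize this dataset's units.")
```

Since every request goes through the one endpoint, budgets, spend tracking, and PII masking apply uniformly regardless of which backend model ultimately serves it.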
Layer 3 — RSE-Plugins: Claude Code plugins for domain-specific research workflows. This is the application augmentation layer that emphasizes best research software engineering practices including reproducibility, testing rigor, and adherence to Scientific Python ecosystem conventions. RSE-Plugins provides specialized AI agents and reusable knowledge modules organized in a Plugin → Agent → Skill hierarchy — covering scientific Python development (packaging, pytest, pixi environments), scientific domain applications (astronomy, climate science, Earth science), structured AI research workflows (/research, /plan, /implement, /validate), project management and onboarding, and HoloViz visualization. Together these give Claude Code the context needed to guide complex feature development through documented decision-making phases while following community best practices.
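The Plugin → Agent → Skill hierarchy can be pictured as a directory layout; the names below are purely illustrative, not the actual plugin contents:

```
rse-plugins/
└── scientific-python/            # Plugin: a domain-specific bundle
    ├── agents/
    │   └── packaging-agent.md    # Agent: a specialized AI persona
    └── skills/
        ├── pytest-conventions/   # Skill: a reusable knowledge module
        └── pixi-environments/
```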
┌─────────────────────────────────────────────────────────────────┐
│ Layer 3 — RSE-Plugins (Application) │
│ Claude Code agents & skills for research workflows │
│ /research → /plan → /implement → /validate │
├─────────────────────────────────────────────────────────────────┤
│ Layer 2 — API Gateway (Control Plane) │
│ LiteLLM (unified endpoint, auth, budgets, spend tracking) │
│ MLflow (experiment tracking, agent evaluation, observability) │
│ PostgreSQL (request logs, metadata) · MinIO (artifacts) │
├─────────────────────────────────────────────────────────────────┤
│ Layer 1 — Inference Engine (Compute) │
│ Azure Foundry Models · AWS Bedrock · GCP Vertex AI · vLLM │
└─────────────────────────────────────────────────────────────────┘
CLI (Typer) Docker Compose
llmaven infra [init| (local dev stack)
validate|deploy] mirrors
←──────→ PostgreSQL:5432
deployment/ MinIO:9000/9001
infrastructure/ MLflow:8080
(Pulumi → Azure) LiteLLM:4000
Qdrant:6333
- Pixi package manager
- Docker and Docker Compose
- Azure CLI (for infrastructure deployment)
git clone https://github.com/uw-ssec/llmaven.git
cd llmaven
pixi install

The Docker Compose stack provides a full local development environment:
# Copy and configure environment variables
cp docker/.env.example docker/.env
# Edit docker/.env with your API keys
# Start all services
pixi run -e llmaven up
# Check service status
pixi run -e llmaven status
# View logs
pixi run -e llmaven logs
# Stop services
pixi run -e llmaven down

The local stack runs 6 services on a shared bridge network (llmaven-network):
| Service | Image | Port(s) | Role |
|---|---|---|---|
| Qdrant | qdrant/qdrant:latest | 6333 | Vector DB for semantic search |
| PostgreSQL | postgres:16 | 5432 | Relational store (databases: llmaven, mlflow_db, litellm_db) |
| MinIO | minio/minio:latest | 9000, 9001 | S3-compatible object storage |
| MLflow | Custom (v3.6.0) | 8080 | Experiment tracking & model registry |
| LiteLLM | Custom (v1.79.1) | 4000 | Unified AI gateway proxy |
| CreateBuckets | quay.io/minio/mc | -- | Init container (creates S3 buckets) |
Startup order: PostgreSQL, MinIO, Qdrant start in parallel. CreateBuckets waits for MinIO. MLflow waits for PostgreSQL, MinIO, and CreateBuckets. LiteLLM waits for PostgreSQL and MLflow.
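This startup ordering corresponds to Docker Compose depends_on constraints. A condensed sketch — the service names follow the table above, but the actual compose file in docker/ may use different names and healthcheck conditions:

```yaml
services:
  db: {}            # PostgreSQL -- starts immediately
  minio: {}         # starts in parallel with db and qdrant
  qdrant: {}
  createbuckets:
    depends_on: [minio]
  mlflow:
    depends_on: [db, minio, createbuckets]
  litellm:
    depends_on: [db, mlflow]
```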
Service UIs:
| Service | URL |
|---|---|
| LiteLLM | http://localhost:4000 |
| MLflow | http://localhost:8080 |
| MinIO Console | http://localhost:9001 |
| Qdrant Dashboard | http://localhost:6333/dashboard |
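The Qdrant service backs semantic search by ranking stored embedding vectors against a query embedding, typically by cosine similarity. A dependency-free sketch of just the ranking step — the vectors here are toy values, whereas in the real stack embeddings come from a model and are stored and queried through Qdrant on port 6333:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Toy "embeddings" keyed by document id.
corpus = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.1, 0.9, 0.0],
}
query = [1.0, 0.0, 0.0]

# Rank documents by similarity to the query vector, best match first.
ranked = sorted(corpus, key=lambda k: cosine_similarity(corpus[k], query), reverse=True)
# doc-a ranks first: its vector points closest to the query direction.
```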
LLMaven provides a CLI built with Typer:
llmaven version # Show version
# Infrastructure commands
llmaven infra init --environment dev # Generate llmaven-config.yaml
llmaven infra validate --config llmaven-config.yaml # Validate config + cost estimate
llmaven infra deploy --preview # Dry run (no resources created)
llmaven infra deploy --yes # Deploy to Azure
llmaven infra status # View deployment status
llmaven infra destroy --yes # Tear down resources

The local Docker services map directly to Azure managed equivalents, deployed via the Pulumi Automation API:
| Local Service | Azure Equivalent |
|---|---|
| PostgreSQL (db:5432) | Azure Database for PostgreSQL Flexible Server |
| MinIO (minio:9000) | Azure Blob Storage (ADLS Gen2) |
| MLflow (mlflow:8080) | Azure Container App (MLflow) |
| LiteLLM (litellm:4000) | Azure Container App (LiteLLM) |
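One practical consequence of this one-to-one mapping is that client code can switch between the local stack and Azure through configuration alone. A hypothetical helper — the local URLs match the table above, but the Azure hostnames are placeholders, since real values come from the deployment outputs:

```python
# Hypothetical endpoint registry. The "azure" URLs are illustrative
# placeholders; real hostnames come from the Pulumi deployment outputs.
ENDPOINTS = {
    "local": {
        "mlflow": "http://localhost:8080",
        "litellm": "http://localhost:4000",
    },
    "azure": {
        "mlflow": "https://mlflow.example.azurecontainerapps.io",
        "litellm": "https://litellm.example.azurecontainerapps.io",
    },
}


def endpoint(environment: str, service: str) -> str:
    """Look up a service URL for the given environment."""
    return ENDPOINTS[environment][service]
```

Because both environments expose the same protocols (PostgreSQL wire protocol, S3 API, MLflow REST, OpenAI-compatible REST), no client code changes beyond the URLs.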
- Initialize configuration:

  pixi shell -e llmaven
  llmaven infra init --environment dev

- Configure the generated llmaven-config.yaml:

  project:
    name: llmaven
    environment: dev
    location: westus2
  azure:
    subscription_id: "your-subscription-id"
  database:
    sku_name: Standard_B1ms
    databases: [llmaven, mlflow_db, litellm_db]

- Set secrets via environment variables:

  export LLMAVEN_SECRETS_LITELLM_MASTER_KEY="$(openssl rand -base64 32)"
  export LLMAVEN_SECRETS_AZURE_OPENAI_API_KEY="your-key"

- Validate the configuration (runs 6 checks: syntax, security, Azure prerequisites, secrets, cost estimate, production readiness):

  llmaven infra validate --strict

- Deploy (or preview first):

  llmaven infra deploy --preview   # Dry run
  llmaven infra deploy --yes       # Actual deployment
Resource Group
├── Virtual Network
│ ├── Container Apps Subnet
│ └── PostgreSQL Subnet
├── Key Vault (secrets + auto-generated credentials)
├── PostgreSQL Flexible Server (llmaven, mlflow_db, litellm_db)
├── Storage Account (ADLS Gen2: mlflow, llmaven containers)
├── Log Analytics Workspace
└── Container Apps Environment
├── MLflow Container App (managed identity → Key Vault)
└── LiteLLM Container App (managed identity → Key Vault)
# Run tests
pixi shell -e llmaven
pytest
# Run pre-commit hooks
pre-commit run --all-files
# Docker lifecycle
pixi run -e llmaven up # Start services
pixi run -e llmaven down # Stop services
pixi run -e llmaven clean # Stop + delete all data volumes

Contributions are welcome! Please fork the repository, create a feature branch, and submit a pull request.
See CODE_OF_CONDUCT.md for community guidelines.
BSD License - see LICENSE for details.
- University of Washington Scientific Software Engineering Center (SSEC)
- NSF National Artificial Intelligence Research Resource (NAIRR) — Award #240292
- Schmidt Sciences Virtual Institutes for Scientific Software (VISS)
- RSE-Plugins - Claude Code plugins for research software engineering workflows
- AGENTS.md - Technical reference for developers and AI assistants
- GitHub Issues
- SSEC Tutorials