A blazing-fast, zero-dependency LLM proxy gateway. Drop it in front of any LLM provider and get caching, metrics, retries, and observability — without changing your application code.
```bash
# 1. Clone
git clone https://github.com/llmsocket/llmsocket-gateway && cd llmsocket-gateway

# 2. Configure (copy template, add your keys)
cp .env.example .env
# Edit .env — add at least one provider key

# 3. Run
go run ./cmd/gateway

# 4. Test
curl http://localhost:4000/health
curl http://localhost:4000/ready
```

The banner shows exactly which providers were detected:
```text
██╗ ██╗ ███╗ ███╗███████╗ ██████╗ ██████╗██╗ ██╗███████╗████████╗
██║ ██║ ████╗ ████║██╔════╝██╔═══██╗██╔════╝██║ ██╔╝██╔════╝╚══██╔══╝
██║ ██║ ██╔████╔██║███████╗██║ ██║██║ █████╔╝ █████╗ ██║
██║ ██║ ██║╚██╔╝██║╚════██║██║ ██║██║ ██╔═██╗ ██╔══╝ ██║
███████╗███████╗██║ ╚═╝ ██║███████║╚██████╔╝╚██████╗██║ ██╗███████╗ ██║
╚══════╝╚══════╝╚═╝ ╚═╝╚══════╝ ╚═════╝ ╚═════╝╚═╝ ╚═╝╚══════╝ ╚═╝
AI Gateway · vdev · development
─────────────────────────────────────
Gateway   → http://localhost:4000
Metrics   → http://localhost:4000/metrics
Dashboard → http://localhost:4000/ui
Config    → http://localhost:4000/config
Providers → anthropic, groq, openai
Cache     → true (TTL: 1h0m0s, max: 10000)
Retry     → true (max 2 retries on [429 502 503 504])
```
All config lives in .env (or real environment variables — env wins over .env).
```bash
cp .env.example .env
# Edit .env
```

| Variable | Default | Description |
|---|---|---|
| GATEWAY_PORT | 4000 | HTTP listen port |
| GATEWAY_ENV | development | `development` or `production` (JSON logs) |
| GATEWAY_DEBUG | false | Verbose debug logging |
| GATEWAY_TIMEOUT_MS | 120000 | Upstream timeout in milliseconds |
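The precedence rule stated above (real environment wins over `.env`) can be sketched as follows. This is an illustration of the rule only, not the gateway's actual loader:

```python
def resolve(dotenv: dict, environ: dict) -> dict:
    """Merge .env values with the real environment; the environment wins."""
    merged = dict(dotenv)
    merged.update(environ)  # real environment variables override .env
    return merged

config = resolve(
    dotenv={"GATEWAY_PORT": "4000", "GATEWAY_DEBUG": "true"},
    environ={"GATEWAY_PORT": "5000"},  # exported in the shell
)
# GATEWAY_PORT resolves to "5000"; GATEWAY_DEBUG still comes from .env
```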
Add whichever providers you use. Unset providers are simply not registered.
```bash
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=AIza...
GROQ_API_KEY=gsk_...
MISTRAL_API_KEY=...
TOGETHER_API_KEY=...
DEEPSEEK_API_KEY=...
XAI_API_KEY=xai-...
FIREWORKS_API_KEY=...
PERPLEXITY_API_KEY=pplx-...
COHERE_API_KEY=...
OPENROUTER_API_KEY=sk-or-...

# Azure OpenAI
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://YOUR_RESOURCE.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=gpt-4o

# Ollama (local, no key needed)
OLLAMA_BASE_URL=http://localhost:11434
```

Override the base URL for any provider (useful for proxies, private deployments):

```bash
GATEWAY_OPENAI_BASE_URL=https://my-proxy.example.com
GATEWAY_ANTHROPIC_BASE_URL=https://anthropic.corp.internal
```

Cache, rate limiting, and retry settings:

```bash
GATEWAY_CACHE_ENABLED=true
GATEWAY_CACHE_TTL_SECONDS=3600
GATEWAY_CACHE_MAX_SIZE=10000
GATEWAY_RATELIMIT_ENABLED=false
GATEWAY_RATELIMIT_RPM=1000
GATEWAY_RETRY_ENABLED=true
GATEWAY_RETRY_MAX=2
```

Each provider gets a dedicated path prefix. Just set `base_url` in your client:
| Provider | Gateway path | Example |
|---|---|---|
| OpenAI | /openai/v1 | http://localhost:4000/openai/v1 |
| Anthropic | /anthropic/v1 | http://localhost:4000/anthropic/v1 |
| Groq | /groq/openai/v1 | http://localhost:4000/groq/openai/v1 |
| Mistral | /mistral/v1 | http://localhost:4000/mistral/v1 |
| DeepSeek | /deepseek/v1 | http://localhost:4000/deepseek/v1 |
| Gemini | /gemini/... | http://localhost:4000/gemini |
| Ollama | /ollama/v1 | http://localhost:4000/ollama/v1 |
| OpenRouter | /openrouter/api/v1 | http://localhost:4000/openrouter/api/v1 |
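A small helper (hypothetical, not shipped with the gateway) that builds these base URLs from the path prefixes in the table:

```python
# Map a provider name to its gateway base URL, following the table above.
GATEWAY = "http://localhost:4000"

PREFIXES = {
    "openai": "/openai/v1",
    "anthropic": "/anthropic/v1",
    "groq": "/groq/openai/v1",
    "mistral": "/mistral/v1",
    "deepseek": "/deepseek/v1",
    "ollama": "/ollama/v1",
    "openrouter": "/openrouter/api/v1",
}

def base_url(provider: str) -> str:
    return GATEWAY + PREFIXES[provider]

# e.g. point any OpenAI-compatible SDK at Groq:
#   client = OpenAI(base_url=base_url("groq"), api_key="gsk_...")
```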
Route to any provider using a header — your app uses a single base URL:
```bash
curl http://localhost:4000/v1/chat/completions \
  -H "x-gateway-provider: groq" \
  -d '{"model":"llama3-70b-8192","messages":[...]}'
```
Override the server-configured key for a single request:

```bash
curl http://localhost:4000/openai/v1/chat/completions \
  -H "x-gateway-api-key: sk-SPECIFIC-KEY" \
  -d '...'
```
```python
# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...")

# After — everything else stays identical
from openai import OpenAI
client = OpenAI(api_key="sk-...", base_url="http://localhost:4000/openai/v1")
```
```typescript
// TypeScript
const client = new OpenAI({ baseURL: "http://localhost:4000/openai/v1" });
```
```bash
# curl
curl http://localhost:4000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}]}'
```
| Endpoint | Description |
|---|---|
| GET /health | Always returns 200 if the server is running |
| GET /ready | 200 if ≥1 provider configured, 503 otherwise |
| GET /config | Active config (no secrets) |
| GET /metrics | Prometheus text format |
| GET /metrics/json | Full metrics as JSON |
| GET /ui | Live dashboard (HTML) |
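A sketch of using `/ready` as a startup gate, with the status semantics from the table above (the polling interval is an arbitrary choice):

```python
import time
import urllib.error
import urllib.request

def wait_ready(base="http://localhost:4000", timeout_s=30.0) -> bool:
    """Poll /ready until it returns 200 or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(base + "/ready", timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not listening yet, or 503 (no provider configured)
        time.sleep(0.5)
    return False
```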
```yaml
# prometheus.yml
scrape_configs:
  - job_name: gateway
    static_configs:
      - targets: ['localhost:4000']
    metrics_path: /metrics
```

Key metrics exposed:

- `gateway_requests_total` — total proxied requests
- `gateway_errors_total` — total 4xx/5xx responses
- `gateway_cache_hits_total` — cache hit count
- `gateway_tokens_total{provider, model, type}` — token usage
- `gateway_cost_usd_total{provider, model}` — estimated USD cost
- `gateway_upstream_latency_p99_ms{provider, model}` — P99 upstream latency
- `gateway_gateway_overhead_ms{provider, model}` — gateway-added latency
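A sketch of reading the unlabeled counters out of the Prometheus text format served at `/metrics`. The sample values below are made up for illustration:

```python
def read_counter(metrics_text: str, name: str) -> float:
    """Return the value of a plain (unlabeled) counter line, or 0.0."""
    for line in metrics_text.splitlines():
        if line.startswith(name + " "):
            return float(line.split()[1])
    return 0.0

sample = """\
gateway_requests_total 120
gateway_errors_total 3
gateway_cache_hits_total 40
"""
hit_ratio = (read_counter(sample, "gateway_cache_hits_total")
             / read_counter(sample, "gateway_requests_total"))
# 40/120, roughly 0.33
```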
Each request is appended to `requests.jsonl` (configurable):

```json
{"ts":"2025-01-15T10:23:45Z","request_id":"abc123","provider":"openai","model":"gpt-4o-mini","status":200,"latency_ms":342,"tokens_in":120,"tokens_out":80,"cache_hit":false}
```

All gateway errors follow a consistent structure:

```json
{
  "error": {
    "code": "missing_api_key",
    "message": "no API key configured for provider \"groq\"",
    "provider": "groq",
    "request_id": "req_abc123",
    "hint": "set GROQ_API_KEY in your .env file or environment, or pass the x-gateway-api-key header"
  }
}
```

| Error code | Cause |
|---|---|
| missing_api_key | No key for the requested provider |
| unknown_provider | Provider name not recognized |
| upstream_error | Network/timeout error calling the provider |
| invalid_provider_url | Malformed base URL in config |
| request_body_read_error | Could not read the request body |
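Client code can branch on `error.code` from the envelope above. A sketch; which codes map to which action is an illustrative policy, not something the gateway mandates:

```python
import json

RETRYABLE = {"upstream_error"}  # illustrative policy

def classify(body: str) -> str:
    """Map a gateway error response body to a coarse action."""
    err = json.loads(body).get("error", {})
    code = err.get("code", "unknown")
    if code in RETRYABLE:
        return "retry"
    if code == "missing_api_key":
        return "fix-config: " + err.get("hint", "set the provider key")
    return "fail: " + code

resp = '{"error":{"code":"upstream_error","message":"timeout","provider":"groq","request_id":"req_abc123"}}'
```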
Config errors (startup) include `field`, `got`, and `hint`:

```text
╔═══════════════════════════════════════╗
║          Configuration Error          ║
╚═══════════════════════════════════════╝

 ✗ must be a valid integer (field: GATEWAY_PORT, got: "abc")
   💡 example: GATEWAY_PORT=4000
```
See the `examples/` directory:

- `curl_examples.sh` — bash/curl quick tests
- `python_openai.py` — Python (OpenAI SDK, streaming, per-key override, Ollama)
- `node_example.js` — Node.js (OpenAI SDK)
```bash
# Python
pip install openai
python examples/python_openai.py openai
python examples/python_openai.py stream
python examples/python_openai.py health

# curl
bash examples/curl_examples.sh
```

Run with Docker:

```bash
docker-compose up

# Or with your .env file
docker run --env-file .env -p 4000:4000 gateway:latest
```

```text
Client → GATEWAY → Provider (OpenAI / Anthropic / Groq / ...)
             │
             ├── .env loader (with line-level error reporting)
             ├── LRU cache (in-memory, zero deps)
             ├── Retry with exponential backoff
             ├── Token extraction (all provider formats)
             ├── Cost estimation
             ├── Prometheus metrics
             └── JSONL request log
```
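The retry stage can be sketched as follows. The retryable status codes and retry count come from the `GATEWAY_RETRY_*` settings shown earlier; the base delay is an assumption, as the gateway's actual backoff schedule is not documented here:

```python
import time

RETRYABLE_STATUSES = {429, 502, 503, 504}  # matches the banner's retry list

def with_retries(send, max_retries=2, base_delay=0.5):
    """Call send() up to max_retries extra times, backing off exponentially.

    send is any zero-argument callable returning (status, body).
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_retries:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```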
| Provider | Path | Key env var |
|---|---|---|
| OpenAI | /openai | OPENAI_API_KEY |
| Anthropic | /anthropic | ANTHROPIC_API_KEY |
| Google Gemini | /gemini | GEMINI_API_KEY |
| Groq | /groq | GROQ_API_KEY |
| Mistral | /mistral | MISTRAL_API_KEY |
| Together AI | /together | TOGETHER_API_KEY |
| DeepSeek | /deepseek | DEEPSEEK_API_KEY |
| xAI / Grok | /xai | XAI_API_KEY |
| Fireworks | /fireworks | FIREWORKS_API_KEY |
| Perplexity | /perplexity | PERPLEXITY_API_KEY |
| Cohere | /cohere | COHERE_API_KEY |
| OpenRouter | /openrouter | OPENROUTER_API_KEY |
| Azure OpenAI | /azure | AZURE_OPENAI_API_KEY |
| AWS Bedrock | /bedrock | AWS_ACCESS_KEY_ID |
| Anyscale | /anyscale | ANYSCALE_API_KEY |
| Ollama (local) | /ollama | OLLAMA_BASE_URL |
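Provider detection follows directly from this table: a provider is registered only when its key variable is set. A minimal sketch over a subset of providers, mirroring the banner's `Providers →` line:

```python
import os

# Subset of the key variables from the table above.
KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}

def detected_providers(environ=None):
    """Return the sorted list of providers whose key env var is non-empty."""
    env = os.environ if environ is None else environ
    return sorted(p for p, var in KEY_VARS.items() if env.get(var))
```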