llmhut/gateway


A blazing-fast, zero-dependency LLM proxy gateway. Drop it in front of any LLM provider and get caching, metrics, retries, and observability — without changing your application code.



Quick start

# 1. Clone
git clone https://github.com/llmhut/gateway && cd gateway

# 2. Configure (copy template, add your keys)
cp .env.example .env
# Edit .env — add at least one provider key

# 3. Run
go run ./cmd/gateway

# 4. Test
curl http://localhost:4000/health
curl http://localhost:4000/ready

The banner shows exactly which providers were detected:

██╗     ██╗     ███╗   ███╗███████╗ ██████╗  ██████╗██╗  ██╗███████╗████████╗
██║     ██║     ████╗ ████║██╔════╝██╔═══██╗██╔════╝██║ ██╔╝██╔════╝╚══██╔══╝
██║     ██║     ██╔████╔██║███████╗██║   ██║██║     █████╔╝ █████╗     ██║
██║     ██║     ██║╚██╔╝██║╚════██║██║   ██║██║     ██╔═██╗ ██╔══╝     ██║
███████╗███████╗██║ ╚═╝ ██║███████║╚██████╔╝╚██████╗██║  ██╗███████╗   ██║
╚══════╝╚══════╝╚═╝     ╚═╝╚══════╝ ╚═════╝  ╚═════╝╚═╝  ╚═╝╚══════╝   ╚═╝

AI Gateway · vdev · development
─────────────────────────────────────
Gateway   → http://localhost:4000
Metrics   → http://localhost:4000/metrics
Dashboard → http://localhost:4000/ui
Config    → http://localhost:4000/config

Providers → anthropic, groq, openai
Cache     → true (TTL: 1h0m0s, max: 10000)
Retry     → true (max 2 retries on [429 502 503 504])

Gateway UI

(Screenshot: the live dashboard served at /ui.)

Configuration

All config lives in .env (or real environment variables — env wins over .env).

cp .env.example .env
# Edit .env

Core settings

| Variable | Default | Description |
| --- | --- | --- |
| `GATEWAY_PORT` | `4000` | HTTP listen port |
| `GATEWAY_ENV` | `development` | `development` or `production` (JSON logs) |
| `GATEWAY_DEBUG` | `false` | Verbose debug logging |
| `GATEWAY_TIMEOUT_MS` | `120000` | Upstream timeout in milliseconds |

Provider keys

Add whichever providers you use. Unset providers are simply not registered.

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=AIza...
GROQ_API_KEY=gsk_...
MISTRAL_API_KEY=...
TOGETHER_API_KEY=...
DEEPSEEK_API_KEY=...
XAI_API_KEY=xai-...
FIREWORKS_API_KEY=...
PERPLEXITY_API_KEY=pplx-...
COHERE_API_KEY=...
OPENROUTER_API_KEY=sk-or-...

# Azure OpenAI
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://YOUR_RESOURCE.openai.azure.com
AZURE_OPENAI_DEPLOYMENT=gpt-4o

# Ollama (local, no key needed)
OLLAMA_BASE_URL=http://localhost:11434

Custom base URLs

Override the base URL for any provider (useful for proxies, private deployments):

GATEWAY_OPENAI_BASE_URL=https://my-proxy.example.com
GATEWAY_ANTHROPIC_BASE_URL=https://anthropic.corp.internal

Cache, rate limiting, retry

GATEWAY_CACHE_ENABLED=true
GATEWAY_CACHE_TTL_SECONDS=3600
GATEWAY_CACHE_MAX_SIZE=10000

GATEWAY_RATELIMIT_ENABLED=false
GATEWAY_RATELIMIT_RPM=1000

GATEWAY_RETRY_ENABLED=true
GATEWAY_RETRY_MAX=2
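
The retry settings above map to a simple schedule: up to GATEWAY_RETRY_MAX attempts on the retryable status codes, with exponential backoff between them. Here is a sketch in Python; the 500 ms base delay and doubling factor are illustrative assumptions, not the gateway's actual timings:

```python
# Retryable statuses from the banner: [429 502 503 504].
RETRYABLE = {429, 502, 503, 504}

def backoff_delays(max_retries: int, base_ms: int = 500) -> list[int]:
    """Delay (ms) before each retry: base, 2*base, 4*base, ...
    The base and factor are assumed values for illustration."""
    return [base_ms * (2 ** i) for i in range(max_retries)]

def should_retry(status: int, attempt: int, max_retries: int) -> bool:
    """Retry only on retryable statuses and while attempts remain."""
    return status in RETRYABLE and attempt < max_retries
```

With `GATEWAY_RETRY_MAX=2`, a failing request is tried at most three times in total (one original attempt plus two retries).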

Routing

Per-provider paths

Each provider gets a dedicated path prefix. Just set base_url in your client:

| Provider | Gateway path | Example |
| --- | --- | --- |
| OpenAI | `/openai/v1` | `http://localhost:4000/openai/v1` |
| Anthropic | `/anthropic/v1` | `http://localhost:4000/anthropic/v1` |
| Groq | `/groq/openai/v1` | `http://localhost:4000/groq/openai/v1` |
| Mistral | `/mistral/v1` | `http://localhost:4000/mistral/v1` |
| DeepSeek | `/deepseek/v1` | `http://localhost:4000/deepseek/v1` |
| Gemini | `/gemini/...` | `http://localhost:4000/gemini` |
| Ollama | `/ollama/v1` | `http://localhost:4000/ollama/v1` |
| OpenRouter | `/openrouter/api/v1` | `http://localhost:4000/openrouter/api/v1` |

OpenAI-compat catch-all (/v1)

Route to any provider using a header — your app uses a single base URL:

curl http://localhost:4000/v1/chat/completions \
  -H "x-gateway-provider: groq" \
  -d '{"model":"llama3-70b-8192","messages":[...]}'
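
The same catch-all request can be built with nothing but the Python standard library. Everything here comes from the curl example above; the request is only constructed, not sent:

```python
import json
import urllib.request

# Build a catch-all request that routes to Groq via the
# x-gateway-provider header (model name is just an example payload).
body = json.dumps({
    "model": "llama3-70b-8192",
    "messages": [{"role": "user", "content": "hello"}],
}).encode()

req = urllib.request.Request(
    "http://localhost:4000/v1/chat/completions",
    data=body,
    headers={
        "Content-Type": "application/json",
        "x-gateway-provider": "groq",
    },
    method="POST",
)

# urllib.request.urlopen(req) sends it once the gateway is running.
```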

Per-request API key override

Override the server-configured key for a single request:

curl http://localhost:4000/openai/v1/chat/completions \
  -H "x-gateway-api-key: sk-SPECIFIC-KEY" \
  -d '...'

Drop-in usage — one line change

# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...")

# After — everything else stays identical
from openai import OpenAI
client = OpenAI(base_url="http://localhost:4000/openai")

// TypeScript
const client = new OpenAI({ baseURL: "http://localhost:4000/openai" });

# curl
curl http://localhost:4000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"hello"}]}'



Observability

Endpoints

| Endpoint | Description |
| --- | --- |
| `GET /health` | Always returns 200 if the server is running |
| `GET /ready` | 200 if ≥1 provider is configured, 503 otherwise |
| `GET /config` | Active config (no secrets) |
| `GET /metrics` | Prometheus text format |
| `GET /metrics/json` | Full metrics as JSON |
| `GET /ui` | Live dashboard (HTML) |
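
A minimal probe of the two health endpoints, using only the standard library (the 200/503 semantics are described above):

```python
import urllib.error
import urllib.request

def probe(base_url: str, path: str) -> int:
    """Return the HTTP status for base_url + path (e.g. /health, /ready)."""
    try:
        with urllib.request.urlopen(base_url + path, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        # /ready answers 503 when no provider is configured; that is a
        # meaningful response, not a transport failure.
        return e.code

# With a running gateway:
#   probe("http://localhost:4000", "/health")  -> 200
#   probe("http://localhost:4000", "/ready")   -> 200 or 503
```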

Prometheus

# prometheus.yml
scrape_configs:
  - job_name: gateway
    static_configs:
      - targets: ['localhost:4000']
    metrics_path: /metrics

Key metrics exposed:

  • gateway_requests_total — total proxied requests
  • gateway_errors_total — total 4xx/5xx responses
  • gateway_cache_hits_total — cache hit count
  • gateway_tokens_total{provider, model, type} — token usage
  • gateway_cost_usd_total{provider, model} — estimated USD cost
  • gateway_upstream_latency_p99_ms{provider, model} — P99 upstream latency
  • gateway_gateway_overhead_ms{provider, model} — gateway-added latency

Request log (JSONL)

Each request is appended to requests.jsonl (configurable):

{"ts":"2025-01-15T10:23:45Z","request_id":"abc123","provider":"openai","model":"gpt-4o-mini","status":200,"latency_ms":342,"tokens_in":120,"tokens_out":80,"cache_hit":false}
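
A log in this shape is easy to post-process. Here is a sketch that tallies per-model usage from requests.jsonl, assuming only the fields shown in the sample line:

```python
import json
from collections import defaultdict

def summarize(lines):
    """Tally request counts and token usage per model from JSONL lines."""
    totals = defaultdict(lambda: {"requests": 0, "tokens_in": 0, "tokens_out": 0})
    for line in lines:
        rec = json.loads(line)
        t = totals[rec["model"]]
        t["requests"] += 1
        t["tokens_in"] += rec.get("tokens_in", 0)
        t["tokens_out"] += rec.get("tokens_out", 0)
    return dict(totals)

# Usage:
#   with open("requests.jsonl") as f:
#       print(summarize(f))
```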

Error format

All gateway errors follow a consistent structure:

{
  "error": {
    "code": "missing_api_key",
    "message": "no API key configured for provider \"groq\"",
    "provider": "groq",
    "request_id": "req_abc123",
    "hint": "set GROQ_API_KEY in your .env file or environment, or pass the x-gateway-api-key header"
  }
}

| Error code | Cause |
| --- | --- |
| `missing_api_key` | No key for the requested provider |
| `unknown_provider` | Provider name not recognized |
| `upstream_error` | Network/timeout error calling the provider |
| `invalid_provider_url` | Malformed base URL in config |
| `request_body_read_error` | Could not read the request body |

Config errors (startup) include field, got, and hint:

╔═══════════════════════════════════════╗
║       Configuration Error             ║
╚═══════════════════════════════════════╝

  ✗  must be a valid integer (field: GATEWAY_PORT, got: "abc")

  💡 example: GATEWAY_PORT=4000

Examples

See examples/ directory:

  • curl_examples.sh — bash/curl quick tests
  • python_openai.py — Python (OpenAI SDK, streaming, per-key override, Ollama)
  • node_example.js — Node.js (OpenAI SDK)

# Python
pip install openai
python examples/python_openai.py openai
python examples/python_openai.py stream
python examples/python_openai.py health

# curl
bash examples/curl_examples.sh

Docker

docker-compose up

# Or with your .env file
docker run --env-file .env -p 4000:4000 gateway:latest

Architecture

Client → GATEWAY → Provider (OpenAI / Anthropic / Groq / ...)
          │
          ├── .env loader (with line-level error reporting)
          ├── LRU cache (in-memory, zero deps)
          ├── Retry with exponential backoff
          ├── Token extraction (all provider formats)
          ├── Cost estimation
          ├── Prometheus metrics
          └── JSONL request log
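
The "LRU cache (in-memory, zero deps)" box can be sketched in a few lines. This Python version illustrates the idea (TTL plus least-recently-used eviction, with defaults from the config section); it is not the gateway's Go implementation:

```python
import time
from collections import OrderedDict

class LRUCache:
    """Illustrative TTL-bounded LRU, mirroring the cache box above."""

    def __init__(self, max_size=10000, ttl_seconds=3600):
        self.max_size, self.ttl = max_size, ttl_seconds
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None or entry[0] < time.monotonic():
            self._items.pop(key, None)  # drop expired entries lazily
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return entry[1]

    def put(self, key, value):
        self._items[key] = (time.monotonic() + self.ttl, value)
        self._items.move_to_end(key)
        if len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict least recently used
```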

Supported providers

| Provider | Path | Key env var |
| --- | --- | --- |
| OpenAI | `/openai` | `OPENAI_API_KEY` |
| Anthropic | `/anthropic` | `ANTHROPIC_API_KEY` |
| Google Gemini | `/gemini` | `GEMINI_API_KEY` |
| Groq | `/groq` | `GROQ_API_KEY` |
| Mistral | `/mistral` | `MISTRAL_API_KEY` |
| Together AI | `/together` | `TOGETHER_API_KEY` |
| DeepSeek | `/deepseek` | `DEEPSEEK_API_KEY` |
| xAI / Grok | `/xai` | `XAI_API_KEY` |
| Fireworks | `/fireworks` | `FIREWORKS_API_KEY` |
| Perplexity | `/perplexity` | `PERPLEXITY_API_KEY` |
| Cohere | `/cohere` | `COHERE_API_KEY` |
| OpenRouter | `/openrouter` | `OPENROUTER_API_KEY` |
| Azure OpenAI | `/azure` | `AZURE_OPENAI_API_KEY` |
| AWS Bedrock | `/bedrock` | `AWS_ACCESS_KEY_ID` |
| Anyscale | `/anyscale` | `ANYSCALE_API_KEY` |
| Ollama (local) | `/ollama` | `OLLAMA_BASE_URL` |
