
feat: Gemini embedding provider + LAT_LLM_ENDPOINT for custom servers#41

Open
kgray-wasteology wants to merge 1 commit into 1st1:main from kgray-wasteology:feat/gemini-and-custom-endpoint

Conversation

@kgray-wasteology

Closes #40

Summary

  • Gemini provider: Auto-detects AIza key prefix → routes to Google's OpenAI-compatible embedding endpoint (gemini-embedding-001, free tier)
  • LAT_LLM_ENDPOINT: Generic escape hatch for any OpenAI-compatible embedding server (Ollama, LM Studio, vLLM). Takes highest priority when set. Pair with LAT_LLM_MODEL to override model name.
  • dimensions in request body: Ensures providers that default to higher dimensions (Gemini → 3072) truncate to the expected 1536. Also correct for OpenAI.
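The priority order described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual code: `detectProvider()` is named in the PR, but the return shape, default endpoints, and default model names here are assumptions.

```typescript
// Hypothetical sketch of the detection order: endpoint override > key prefix > default.
// The real detectProvider() lives in src/search/provider.ts and may differ.
type Provider = { name: string; endpoint: string; model: string };

function detectProvider(env: Record<string, string | undefined>): Provider {
  // 1. LAT_LLM_ENDPOINT takes highest priority when set.
  const endpoint = env.LAT_LLM_ENDPOINT;
  if (endpoint) {
    return {
      name: "custom",
      // Strip trailing slashes so path joining stays predictable.
      endpoint: endpoint.replace(/\/+$/, ""),
      // LAT_LLM_MODEL overrides the model name; fallback is an assumption.
      model: env.LAT_LLM_MODEL ?? "text-embedding-3-small",
    };
  }
  // 2. Keys with the AIza prefix route to Google's OpenAI-compatible endpoint.
  if (env.LAT_LLM_KEY?.startsWith("AIza")) {
    return {
      name: "gemini",
      endpoint: "https://generativelanguage.googleapis.com/v1beta/openai",
      model: "gemini-embedding-001",
    };
  }
  // 3. Default: OpenAI.
  return {
    name: "openai",
    endpoint: "https://api.openai.com/v1",
    model: "text-embedding-3-small",
  };
}
```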

Changes

| File | Change |
| --- | --- |
| `src/search/provider.ts` | Add `gemini` provider, `customProvider()` function, `LAT_LLM_ENDPOINT`/`LAT_LLM_MODEL` env var support in `detectProvider()` |
| `src/search/embeddings.ts` | Pass `dimensions` in embedding request body |
| `tests/search.test.ts` | 6 new tests: Gemini detection, endpoint override, trailing-slash handling, custom provider |

Test plan

  • `npm run typecheck` passes
  • `npx vitest run tests/search.test.ts` — all 14 tests pass (8 existing + 6 new)
  • Full test suite: 158 passed; 2 pre-existing `pnpm not found` failures unchanged
  • Manually verified Gemini semantic search works: `LAT_LLM_KEY=$GEMINI_API_KEY lat search "query"`
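For the custom-endpoint path, a local server can be exercised along the same lines. The endpoint URL and model name below are illustrative (Ollama's default port and a common embedding model), not values taken from the PR:

```shell
# Illustrative: point lat at any OpenAI-compatible embedding server.
export LAT_LLM_ENDPOINT="http://localhost:11434/v1"   # e.g. a local Ollama instance
export LAT_LLM_MODEL="nomic-embed-text"               # whatever model the server exposes
lat search "query"
```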

…penAI-compatible servers

Adds two new ways to configure embedding providers:

1. Gemini auto-detection: API keys starting with `AIza` automatically
   route to Google's OpenAI-compatible embedding endpoint using
   gemini-embedding-001. Gemini embeddings are free-tier (1,500 req/min),
   lowering the barrier for users without an OpenAI key.

2. LAT_LLM_ENDPOINT: A generic escape hatch for any OpenAI-compatible
   embedding server (LM Studio, Ollama, vLLM, etc.). When set, it takes
   highest priority over key-prefix detection. Pair with LAT_LLM_MODEL
   to override the model name.

Also passes `dimensions` in the embedding request body so providers that
default to higher dimensions (e.g. Gemini's 3072) truncate to the
expected 1536.
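A minimal sketch of that request body, assuming the standard OpenAI embeddings API shape (which Gemini's compatibility endpoint also accepts). `buildEmbeddingRequest` is a hypothetical helper; the actual change is in `src/search/embeddings.ts`:

```typescript
// Field names follow the OpenAI embeddings API; the helper name is illustrative.
interface EmbeddingRequest {
  model: string;
  input: string[];
  // Providers that default higher (e.g. Gemini's 3072) truncate to this size.
  dimensions?: number;
}

function buildEmbeddingRequest(model: string, input: string[]): EmbeddingRequest {
  return { model, input, dimensions: 1536 };
}
```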

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel

vercel bot commented Mar 25, 2026

@kgray-wasteology is attempting to deploy a commit to the Yury Selivanov's projects Team on Vercel.

A member of the Team first needs to authorize it.
