feat: Gemini embedding provider + LAT_LLM_ENDPOINT for custom servers #41

Open · kgray-wasteology wants to merge 1 commit into 1st1:main
Conversation
…penAI-compatible servers

Adds two new ways to configure embedding providers:

1. Gemini auto-detection: API keys starting with `AIza` automatically route to Google's OpenAI-compatible embedding endpoint using gemini-embedding-001. Gemini embeddings are free-tier (1,500 req/min), lowering the barrier for users without an OpenAI key.
2. LAT_LLM_ENDPOINT: a generic escape hatch for any OpenAI-compatible embedding server (LM Studio, Ollama, vLLM, etc.). When set, it takes highest priority over key-prefix detection. Pair with LAT_LLM_MODEL to override the model name.

Also passes `dimensions` in the embedding request body so providers that default to higher dimensions (e.g. Gemini's 3072) truncate to the expected 1536.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
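The detection order described above (explicit endpoint first, then key-prefix routing, then the default) can be sketched roughly as below. This is an illustrative sketch, not the PR's actual `detectProvider()` code; the `Provider` shape, the default model name, and the fallback endpoint are assumptions.

```typescript
// Hypothetical sketch of the provider-detection priority; names and
// defaults are illustrative, not the PR's actual implementation.
type Provider = { endpoint: string; model: string };

function detectProvider(env: Record<string, string | undefined>): Provider {
  // 1. An explicit endpoint takes highest priority over key-prefix detection.
  if (env.LAT_LLM_ENDPOINT) {
    return {
      endpoint: env.LAT_LLM_ENDPOINT,
      // LAT_LLM_MODEL overrides the model name; default is an assumption.
      model: env.LAT_LLM_MODEL ?? "text-embedding-3-small",
    };
  }
  // 2. Gemini keys start with "AIza": route to Google's
  //    OpenAI-compatible endpoint with gemini-embedding-001.
  if (env.LAT_LLM_KEY?.startsWith("AIza")) {
    return {
      endpoint: "https://generativelanguage.googleapis.com/v1beta/openai",
      model: "gemini-embedding-001",
    };
  }
  // 3. Otherwise fall back to OpenAI.
  return {
    endpoint: "https://api.openai.com/v1",
    model: "text-embedding-3-small",
  };
}
```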
@kgray-wasteology is attempting to deploy a commit to the Yury Selivanov's projects Team on Vercel. A member of the Team first needs to authorize it.
Closes #40
Summary
- `AIza` key prefix → routes to Google's OpenAI-compatible embedding endpoint (gemini-embedding-001, free tier)
- `LAT_LLM_ENDPOINT`: generic escape hatch for any OpenAI-compatible embedding server (Ollama, LM Studio, vLLM). Takes highest priority when set. Pair with `LAT_LLM_MODEL` to override the model name.
- `dimensions` in request body: ensures providers that default to higher dimensions (Gemini → 3072) truncate to the expected 1536. Also correct for OpenAI.

Changes

- `src/search/provider.ts`: `gemini` provider, `customProvider()` function, `LAT_LLM_ENDPOINT`/`LAT_LLM_MODEL` env var support in `detectProvider()`
- `src/search/embeddings.ts`: `dimensions` in embedding request body
- `tests/search.test.ts`

Design notes
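The `dimensions` change in `src/search/embeddings.ts` amounts to passing the expected vector width in the OpenAI-style request body, so a provider that would otherwise return wider vectors (Gemini's 3072) truncates them to 1536. A minimal sketch, assuming a hypothetical `buildEmbeddingRequest` helper (not the PR's actual code):

```typescript
// Illustrative request-body builder; the helper name and shape are
// assumptions, only the `dimensions` field reflects the PR's change.
interface EmbeddingRequest {
  model: string;
  input: string[];
  dimensions: number;
}

function buildEmbeddingRequest(model: string, input: string[]): EmbeddingRequest {
  return {
    model,
    input,
    // Providers that default to wider vectors (e.g. Gemini's 3072)
    // truncate to the index's expected width when `dimensions` is set.
    // This is also a correct no-op for OpenAI's 1536-dim default.
    dimensions: 1536,
  };
}
```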
Test plan

- `npm run typecheck` passes
- `npx vitest run tests/search.test.ts` — all 14 tests pass (8 existing + 6 new)
- `pnpm not found` failures unchanged
- `LAT_LLM_KEY=$GEMINI_API_KEY lat search "query"`
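For the local-server path, usage would look roughly like the following. The env var names come from the PR; the port and model name are illustrative (Ollama shown), not tested values from the PR.

```shell
# Hypothetical usage against a local OpenAI-compatible embedding server.
# LAT_LLM_ENDPOINT takes highest priority over key-prefix detection;
# LAT_LLM_MODEL overrides the model name.
export LAT_LLM_ENDPOINT=http://localhost:11434/v1   # e.g. Ollama
export LAT_LLM_MODEL=nomic-embed-text               # illustrative model
lat search "query"
```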