Pure Go word2vec/hugot embedding upgrade path #373
Labels: component:llm (LLM provider layer), enhancement (New feature or request)
Description
Summary
Research and prototype higher-quality pure Go embedding options that require neither ONNX Runtime nor any shared library, preserving the single-binary, zero-dependency promise.
Motivation
The embedding quality ladder for mnemonic:
- bow-128 (current) — zero deps, fast, coarse
- ONNX MiniLM (Embedded ONNX embedding provider (MiniLM-L6-v2 INT8) #370) — great quality, requires shared library
- This issue — middle ground: better than bow, no shared library
Two promising approaches emerged from research:
Option A: Word2Vec (50K words, 100d)
- Library: `github.com/sajari/word2vec` (pure Go, loads binary word2vec models)
- Ship a pruned 50K-word, 100d model (~20MB, `go:embed` feasible)
- Sentence embedding: average word vectors
- Quality: ~58-65 STS Spearman (vs bow ~40, transformer ~84)
- Speed: 10K-50K embeddings/sec
Option B: hugot pure Go transformers
- Library: `github.com/knights-analytics/hugot`
- Pure Go ONNX runtime backend (no CGo, no shared library)
- Can run MiniLM-L6-v2 entirely in Go
- Quality: transformer-level (~84 STS)
- Speed: slower than C ONNX Runtime, but functional
- Maturity: newer, less battle-tested
Research Tasks
- Benchmark `sajari/word2vec` with a pruned GloVe/fastText model on mnemonic's retrieval benchmark
- Benchmark the `knights-analytics/hugot` pure Go backend with MiniLM-L6-v2
- Measure: latency, memory footprint, binary size impact, retrieval quality (nDCG@5)
- Compare both against bow-128 baseline and ONNX MiniLM
- Determine if either is production-ready for mnemonic
Decision Criteria
Pick the approach that best satisfies:
- Single binary (no shared libraries)
- <10ms embedding latency on CPU
- <50MB binary size increase
- Measurable retrieval quality improvement over bow-128
- Cross-platform (Linux, macOS ARM, Windows)
References
- sajari/word2vec: https://github.com/sajari/word2vec
- knights-analytics/hugot: https://github.com/knights-analytics/hugot
- GloVe pretrained: https://nlp.stanford.edu/projects/glove/
- Parent: Remove LLM dependency from cognitive pipeline — heuristic-first architecture #369