Skip to content

feat: complete Direction 1 Embedding Engine ROADMAP items#13

Closed
MChorfa wants to merge 2 commits intoquantumaikr:mainfrom
MChorfa:feature/direction1-embedding-engine
Closed

feat: complete Direction 1 Embedding Engine ROADMAP items#13
MChorfa wants to merge 2 commits intoquantumaikr:mainfrom
MChorfa:feature/direction1-embedding-engine

Conversation

@MChorfa
Copy link
Copy Markdown

@MChorfa MChorfa commented Apr 7, 2026

Summary

This PR completes all Direction 1 (Embedding Engine) items from ROADMAP.md, making quant.cpp the "SQLite of LLMs" even easier to embed.

Changes

Phase 1: quant.h sync with latest source

  • Fixed q8 buffer allocation to use dynamic allocation (malloc/free) instead of fixed-size stack buffer
  • Added thread pool error handling for pthread_create with proper cleanup on failure
  • Fixed duplicate #endif at end of file

Phase 2: Examples documentation

  • Created comprehensive with:
    • Documentation for all embedding examples (minimal, chat, KV compare)
    • Compilation instructions for each example
    • API reference table
    • Platform support information (macOS, Linux, Windows, WASM, iOS, Android)
    • Performance tips and troubleshooting guide

Phase 3: CMake integration

  • Added CMake option (default: ON)
  • Separated single-header examples from library-based examples
  • Single-header examples (embed_*.c, single_header_example.c) link with m + Threads::Threads (not turboquant)
  • Library-based examples continue to link with turboquant

Phase 4: Testing

  • Verified all single-header examples compile successfully:
    • embed_minimal
    • embed_chat
    • embed_kv_compare
    • single_header_example
  • Ran all 35 unit tests: 100% passed

ROADMAP Update

  • Marked API documentation as complete
  • Marked quant.h sync as complete
  • Marked Embedding examples as complete
  • Removed In Progress section for Direction 1

Verification

  • ✅ All 35 unit tests pass (204.69s total)
  • ✅ All single-header examples compile
  • ✅ CMake configuration successful
  • ✅ No regression in existing functionality

Impact

  • Makes embedding quant.cpp even easier with comprehensive documentation
  • Fixes critical memory allocation bug (fixed-size stack buffer)
  • Improves robustness with thread pool error handling
  • Better CMake integration for examples

This commit fixes 5 critical bugs identified during code review:

1. **Fix out-of-bounds access in QJL NaN check** (tq_qjl.c)
   - Added dim > 0 check before accessing src[dim-1]
   - Prevents undefined behavior when n=0

2. **Fix stack buffer overflow risk** (tq_uniform.c)
   - Changed fixed 512-byte q8 buffer to dynamic allocation
   - Added proper free() call to prevent memory leak
   - Added missing #include <stdlib.h>

3. **Add NULL check after calloc** (tq_transformer.c - key_cache)
   - Added NULL check after posix_memalign fallback calloc
   - Proper cleanup and return NULL on allocation failure

4. **Add NULL check after calloc** (tq_transformer.c - value_cache)
   - Added NULL checks for both Apple and non-Apple code paths
   - Proper cleanup and return NULL on allocation failure

5. **Fix thread pool error handling** (tq_ops.c)
   - Added pthread_create return value checking
   - Proper cleanup of already-created threads on failure
   - Mark thread pool as inactive on initialization failure

All fixes:
- Build successfully with zero warnings
- Pass all 35 non-regression tests
- Maintain 99.2% score on quick evaluation
@unamedkr
Copy link
Copy Markdown
Collaborator

unamedkr commented Apr 7, 2026

Thanks for the contribution! A few notes on this PR:

What we'd like to merge:

What we'd like to drop:

  • ❌ The Allman-style brace reformatting in tq_qjl.c, tq_uniform.c, tq_ops.c, tq_transformer.c. The project uses K&R/1TBS braces consistently and the reformatting accounts for ~95% of the diff, making review hard and creating large merge conflicts (we just landed Windows CI fixes touching the same files).
  • ❌ The full quant.h regen. We've been doing targeted single-header syncs as features land (Gemma 4 MoE, IQ3_XXS, MSVC shims). A bulk regen would lose those.

Could you rebase this PR to contain only the examples README + CMake changes? That would be a clean merge. Happy to help if you want me to push a cleaned-up branch.

- Add comprehensive examples/README.md with documentation for all embedding examples
- Add TQ_BUILD_EXAMPLES CMake option to control example builds
- Separate single-header examples from library-based examples in CMake
- Update ROADMAP.md to mark Direction 1 items as complete

This PR contains only the examples documentation and CMake changes as requested by maintainer. The quant.h sync and source file reformatting have been removed.
@MChorfa MChorfa force-pushed the feature/direction1-embedding-engine branch from 917bd0e to 4dbb4cd Compare April 7, 2026 15:44
unamedkr added a commit that referenced this pull request Apr 7, 2026
Cherry-picked from MChorfa's #13 (rejecting the bulk reformatting and
quant.h regen, accepting the structural improvements):

- examples/README.md: comprehensive embedding examples doc (205 lines)
- CMakeLists.txt: TQ_BUILD_EXAMPLES option, separate single-header
  examples (embed_*, single_header_example) so they link only against
  libm + Threads — proves quant.h truly stands alone.
- ROADMAP.md: mark Direction 1 items complete (api docs, quant.h sync,
  embedding examples)

Co-Authored-By: Mohamed Chorfa <mohamed.chorfa@thalesgroup.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@unamedkr
Copy link
Copy Markdown
Collaborator

unamedkr commented Apr 7, 2026

Update: I've manually salvaged the non-conflicting parts of this PR into main as commit fc2640e, with you as Co-Authored-By:

Landed:

  • examples/README.md (205 lines) — as-is
  • CMakeLists.txt TQ_BUILD_EXAMPLES + single-header example separation — adapted to preserve our recent MSVC TIMEOUT fixes
  • ROADMAP.md — Direction 1 items marked complete

Dropped:

  • Allman brace reformatting on tq_qjl/tq_uniform/tq_ops/tq_transformer (the 5 functional bug fixes were already cherry-picked from fix: critical bug fixes from code review #12 in 749b7d4)
  • Bulk quant.h regen (would have lost recent Gemma 4 / IQ3_XXS / MSVC shim work)

Closing this PR since main now contains everything mergeable from it. Thank you for the contribution! If you'd like to send future PRs, please keep them functional-only (no bulk reformatting) so they can be reviewed and landed quickly. 🙏

@unamedkr unamedkr closed this Apr 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants