
[LEADS-232] Embedding caching handling with ragas 0.4 #199

Open
bsatapat-jpg wants to merge 1 commit into lightspeed-core:main from bsatapat-jpg:dev_1

Conversation


@bsatapat-jpg bsatapat-jpg commented Mar 27, 2026

Description

The previous implementation used a custom CachedEmbedding class that wrapped LangChain embeddings. That approach is incompatible with Ragas 0.4+, which uses the BaseRagasEmbedding interface and embedding_factory().

What this PR adds:

  • New CachedRagasEmbedding class that implements BaseRagasEmbedding interface
  • Wraps the embeddings created by embedding_factory("litellm", ...)
  • Uses diskcache for persistent disk caching (same library used elsewhere in the project)
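
A minimal sketch of what such a wrapper could look like (the class and method names follow this PR's description and the review comments below; the real code uses diskcache.Cache, for which a plain dict with the same get/set-style interface stands in here so the sketch is self-contained):

```python
import hashlib


class CachedRagasEmbedding:
    """Sketch of a caching wrapper around a Ragas-style embedding object.

    The real implementation stores vectors in a diskcache.Cache(cache_dir);
    a dict stands in here so the example has no third-party dependencies.
    """

    def __init__(self, embeddings, model_name: str):
        self._embeddings = embeddings
        self._model_name = model_name
        self._cache: dict = {}  # stand-in for diskcache.Cache(cache_dir)

    def _get_cache_key(self, text: str) -> str:
        # Include the model name so different models never share cache entries.
        payload = f"{self._model_name}:{text}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def embed_text(self, text: str, **kwargs):
        key = self._get_cache_key(text)
        cached = self._cache.get(key)
        if cached is not None:
            return cached  # cache hit: skip the embedding API call
        result = self._embeddings.embed_text(text, **kwargs)
        self._cache[key] = result
        return result
```

With this shape, the wrapped object returned by embedding_factory("litellm", ...) is only called on a cache miss; repeated texts are served from disk.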

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude-4.5-opus-high
  • Generated by: Cursor

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced cache initialization to independently enable caching mechanisms based on separate configuration flags, ensuring proper support for both language model and embedding cache types.


coderabbitai bot commented Mar 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3974071c-1366-4a0f-9e74-290b1606d774

📥 Commits

Reviewing files that changed from the base of the PR and between c3ab5ec and 20a1b65.

📒 Files selected for processing (1)
  • src/lightspeed_evaluation/core/metrics/ragas.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lightspeed_evaluation/core/metrics/ragas.py

Walkthrough

The cache initialization logic for litellm.cache now depends on two independent feature flags (LLM and embedding caching enablement) instead of one. DISK cache is enabled when either flag is active. The code additionally computes and passes supported_call_types to the cache based on which cache types are enabled.

Changes

Cache initialization refactoring — src/lightspeed_evaluation/core/metrics/ragas.py:
Modified cache initialization to respect both LLM and embedding cache feature flags independently. Added logic to compute supported_call_types based on enabled cache features (completion/acompletion for LLM caching, embedding/aembedding for embedding caching).
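
The flag logic the walkthrough describes can be sketched as a small helper (the function name is illustrative; per the walkthrough, the resulting list is passed to litellm's disk cache as supported_call_types, and the cache is only created when the list is non-empty):

```python
def compute_supported_call_types(
    llm_cache_enabled: bool, embedding_cache_enabled: bool
) -> list[str]:
    """Return the litellm call types to cache, based on the two feature flags."""
    call_types: list[str] = []
    if llm_cache_enabled:
        # Sync and async chat-completion calls.
        call_types += ["completion", "acompletion"]
    if embedding_cache_enabled:
        # Sync and async embedding calls.
        call_types += ["embedding", "aembedding"]
    return call_types


# Guard sketch: only construct the cache when at least one flag is set, e.g.
#   if call_types:
#       litellm.cache = Cache(type=LiteLLMCacheType.DISK,
#                             supported_call_types=call_types,
#                             disk_cache_dir=cache_dir)
```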

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description check — Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check — Passed: the title clearly and specifically addresses the main change: implementing embedding caching support compatible with Ragas 0.4, which is directly reflected in the PR's core objective of replacing an incompatible wrapper with a new CachedRagasEmbedding class.
  • Docstring coverage — Passed: docstring coverage is 100.00%, above the required 80.00% threshold.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
src/lightspeed_evaluation/core/embedding/ragas.py (3)

33-33: Consider adding cache cleanup support.

The diskcache.Cache instance is created but never explicitly closed. While diskcache handles this gracefully on garbage collection, long-running processes may benefit from explicit cleanup. Consider implementing __del__ or providing a close() method that downstream code can call.

♻️ Optional: Add cleanup method
     def __init__(self, embeddings: BaseRagasEmbedding, cache_dir: str, model_name: str):
         ...
         self._cache: Cache = Cache(cache_dir)
         ...
+
+    def close(self) -> None:
+        """Close the disk cache to release resources."""
+        self._cache.close()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/embedding/ragas.py` at line 33, The Cache
instance assigned to self._cache is never explicitly closed; add explicit
cleanup by implementing a public close() method that calls self._cache.close()
(and handles repeated calls idempotently), and optionally implement __del__ to
call close() so long-running processes can release resources; update the class
that creates self._cache (the one in ragas.py where "self._cache: Cache =
Cache(cache_dir)" appears) to include these methods and ensure any external
callers can call close() when finished.

67-89: Blocking cache operations in async method may impact event loop performance.

The aembed_text method uses synchronous self._cache.get and self._cache.set operations, which perform blocking I/O. In high-concurrency async workloads, this could briefly block the event loop. For typical usage with fast cache lookups, this is likely acceptable, but consider wrapping in asyncio.to_thread if performance issues arise.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/embedding/ragas.py` around lines 67 - 89, The
async method aembed_text currently calls the synchronous cache methods
self._cache.get and self._cache.set which can block the event loop; change those
calls to run in a thread (e.g., await asyncio.to_thread(self._cache.get,
cache_key) and await asyncio.to_thread(self._cache.set, cache_key, result)) and
ensure asyncio is imported so aembed_text awaits the threaded calls around cache
access while leaving the actual await self._embeddings.aembed_text call
unchanged.
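
The suggested change can be sketched as follows (a dict-backed cache stands in for diskcache.Cache so the example is self-contained; asyncio.to_thread moves the blocking get/set off the event loop, while the embedding call itself is awaited directly):

```python
import asyncio
import hashlib


class AsyncCachedEmbedding:
    """Sketch: async embedding lookup with cache I/O pushed to a worker thread."""

    def __init__(self, embeddings, model_name: str):
        self._embeddings = embeddings
        self._model_name = model_name
        self._cache: dict = {}  # stand-in for diskcache.Cache; real code calls .get/.set

    def _get_cache_key(self, text: str) -> str:
        return hashlib.sha256(f"{self._model_name}:{text}".encode()).hexdigest()

    async def aembed_text(self, text: str, **kwargs):
        key = self._get_cache_key(text)
        # Blocking cache reads run in a thread so the event loop stays free.
        cached = await asyncio.to_thread(self._cache.get, key)
        if cached is not None:
            return cached
        result = await self._embeddings.aembed_text(text, **kwargs)
        # __setitem__ here; diskcache.Cache.set in the real implementation.
        await asyncio.to_thread(self._cache.__setitem__, key, result)
        return result
```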

59-62: Inconsistent log levels between cache hit and miss.

Cache hit uses debug level (line 59) while cache miss uses info level (line 62). This inconsistency makes it harder to track cache behavior at a single log level. Consider using the same level for both, or using debug for hits and info for misses consistently across both sync and async methods (note: async method at line 81 uses info for hits).

♻️ Align log levels
         if cached_result is not None:
-            logger.debug("Embedding CACHE HIT for text (len=%d)", len(text))
+            logger.info("Embedding CACHE HIT for text (len=%d)", len(text))
             return cached_result
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/embedding/ragas.py` around lines 59 - 62, The
cache logging is inconsistent: the sync method logs "Embedding CACHE HIT for
text (len=%d)" at logger.debug while the miss logs "Embedding CACHE MISS for
text (len=%d) - calling API" at logger.info, and the async method logs hits at
info; make these consistent by choosing a single level (recommended:
logger.debug for hits and logger.info for misses) and update all occurrences of
the strings "Embedding CACHE HIT for text (len=%d)" and "Embedding CACHE MISS
for text (len=%d) - calling API" (both sync and async embedding methods that
reference logger and cached_result/text) so hits and misses use the agreed
levels across the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/embedding/ragas.py`:
- Around line 37-65: The cache key must include kwargs to avoid collisions:
change _get_cache_key to accept an additional kwargs: Dict[str, Any] parameter
and incorporate a deterministic serialization of kwargs (e.g.,
json.dumps(kwargs, sort_keys=True, separators=(',',':'))), falling back to a
stable repr for non-serializable values, then combine/sha256-hash that
serialization with self._model_name and the text to produce the final key;
update embed_text to call _get_cache_key(text, kwargs) before cache lookup and
ensure the same key generation is used wherever _get_cache_key/_cache are used
(symbols: _get_cache_key, embed_text, _model_name, _embeddings, _cache).

---

Nitpick comments:
In `@src/lightspeed_evaluation/core/embedding/ragas.py`:
- Line 33: The Cache instance assigned to self._cache is never explicitly
closed; add explicit cleanup by implementing a public close() method that calls
self._cache.close() (and handles repeated calls idempotently), and optionally
implement __del__ to call close() so long-running processes can release
resources; update the class that creates self._cache (the one in ragas.py where
"self._cache: Cache = Cache(cache_dir)" appears) to include these methods and
ensure any external callers can call close() when finished.
- Around line 67-89: The async method aembed_text currently calls the
synchronous cache methods self._cache.get and self._cache.set which can block
the event loop; change those calls to run in a thread (e.g., await
asyncio.to_thread(self._cache.get, cache_key) and await
asyncio.to_thread(self._cache.set, cache_key, result)) and ensure asyncio is
imported so aembed_text awaits the threaded calls around cache access while
leaving the actual await self._embeddings.aembed_text call unchanged.
- Around line 59-62: The cache logging is inconsistent: the sync method logs
"Embedding CACHE HIT for text (len=%d)" at logger.debug while the miss logs
"Embedding CACHE MISS for text (len=%d) - calling API" at logger.info, and the
async method logs hits at info; make these consistent by choosing a single level
(recommended: logger.debug for hits and logger.info for misses) and update all
occurrences of the strings "Embedding CACHE HIT for text (len=%d)" and
"Embedding CACHE MISS for text (len=%d) - calling API" (both sync and async
embedding methods that reference logger and cached_result/text) so hits and
misses use the agreed levels across the file.
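
The kwargs-aware key generation described in the inline comment could look roughly like this (json.dumps with sort_keys gives a deterministic serialization; the function name and fallback behavior are taken from the comment, the rest is a sketch):

```python
import hashlib
import json
from typing import Any, Dict


def get_cache_key(model_name: str, text: str, kwargs: Dict[str, Any]) -> str:
    """Build a cache key that includes the embedding kwargs to avoid collisions."""
    try:
        # Deterministic serialization: sorted keys, compact separators.
        kwargs_repr = json.dumps(kwargs, sort_keys=True, separators=(",", ":"))
    except TypeError:
        # Fall back to a stable repr for non-JSON-serializable values.
        kwargs_repr = repr(sorted(kwargs.items()))
    payload = f"{model_name}:{kwargs_repr}:{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```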
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f7d5cb5d-085a-4cb3-9b60-b3e491554952

📥 Commits

Reviewing files that changed from the base of the PR and between e978f08 and 03b081b.

📒 Files selected for processing (1)
  • src/lightspeed_evaluation/core/embedding/ragas.py

@asamal4 added the draft and bug labels Mar 27, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/metrics/ragas.py`:
- Around line 55-66: When building litellm.cache, respect the per-feature flags:
only include completion types ("completion","acompletion") in
supported_call_types when llm_cache_enabled is true, and only include embedding
types ("embedding","aembedding") when embedding_cache_enabled is true; if
neither flag is set, do not create the Cache. Also choose the disk_cache_dir
based on which feature is enabled (use llm_manager.get_config().cache_dir when
llm_cache_enabled, embedding_manager.get_config().cache_dir when only
embedding_cache_enabled, and prefer the LLM dir or a sensible common dir when
both are enabled). Update the Cache construction (Cache, LiteLLMCacheType.DISK,
supported_call_types, disk_cache_dir) accordingly and keep the existing
litellm.cache guard.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4981aacc-23f7-48d5-8c01-d2e7cbfecc45

📥 Commits

Reviewing files that changed from the base of the PR and between 9fb2439 and 64f2f2b.

📒 Files selected for processing (2)
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/metrics/ragas.py
✅ Files skipped from review due to trivial changes (1)
  • src/lightspeed_evaluation/core/embedding/ragas.py


xmican10 commented Mar 30, 2026

@bsatapat-jpg this seems like a robust solution.

