
[LEADS-232] Embedding caching handling with ragas 0.4 #199

Open
bsatapat-jpg wants to merge 1 commit into lightspeed-core:main from bsatapat-jpg:dev_1

Conversation


@bsatapat-jpg bsatapat-jpg commented Mar 27, 2026

Description

The previous implementation used a custom CachedEmbedding class that wrapped LangChain embeddings. That approach is incompatible with Ragas 0.4+, which uses the BaseRagasEmbedding interface and embedding_factory().

What this PR adds:

  • New CachedRagasEmbedding class that implements BaseRagasEmbedding interface
  • Wraps the embeddings created by embedding_factory("litellm", ...)
  • Uses diskcache for persistent disk caching (same library used elsewhere in the project)
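
A minimal sketch of what such a wrapper could look like (the class and method names follow this PR's description and the review comments below; the real code uses diskcache.Cache, for which a plain dict with the same get/set-style interface stands in here so the sketch is self-contained):

```python
import hashlib


class CachedRagasEmbedding:
    """Sketch of a caching wrapper around a Ragas-style embedding object.

    The real implementation stores vectors in a diskcache.Cache(cache_dir);
    a dict stands in here so the example has no third-party dependencies.
    """

    def __init__(self, embeddings, model_name: str):
        self._embeddings = embeddings
        self._model_name = model_name
        self._cache: dict = {}  # stand-in for diskcache.Cache(cache_dir)

    def _get_cache_key(self, text: str) -> str:
        # Include the model name so different models never share cache entries.
        payload = f"{self._model_name}:{text}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def embed_text(self, text: str, **kwargs):
        key = self._get_cache_key(text)
        cached = self._cache.get(key)
        if cached is not None:
            return cached  # cache hit: skip the embedding API call
        result = self._embeddings.embed_text(text, **kwargs)
        self._cache[key] = result
        return result
```

With this shape, the wrapped object returned by embedding_factory("litellm", ...) is only called on a cache miss; repeated texts are served from disk.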

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude-4.5-opus-high
  • Generated by: Cursor

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced cache initialization to independently enable caching mechanisms based on separate configuration flags, ensuring proper support for both language model and embedding cache types.


coderabbitai bot commented Mar 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3974071c-1366-4a0f-9e74-290b1606d774

📥 Commits

Reviewing files that changed from the base of the PR and between c3ab5ec and 20a1b65.

📒 Files selected for processing (1)
  • src/lightspeed_evaluation/core/metrics/ragas.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/lightspeed_evaluation/core/metrics/ragas.py

Walkthrough

The cache initialization logic for litellm.cache now depends on two independent feature flags (LLM and embedding caching enablement) instead of one. DISK cache is enabled when either flag is active. The code additionally computes and passes supported_call_types to the cache based on which cache types are enabled.

Changes

Cache initialization refactoring — src/lightspeed_evaluation/core/metrics/ragas.py:
Modified cache initialization to respect both LLM and embedding cache feature flags independently. Added logic to compute supported_call_types based on enabled cache features (completion/acompletion for LLM caching, embedding/aembedding for embedding caching).
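
The flag logic the walkthrough describes can be sketched as a small helper (the function name is illustrative; per the walkthrough, the resulting list is passed to litellm's disk cache as supported_call_types, and the cache is only created when the list is non-empty):

```python
def compute_supported_call_types(
    llm_cache_enabled: bool, embedding_cache_enabled: bool
) -> list[str]:
    """Return the litellm call types to cache, based on the two feature flags."""
    call_types: list[str] = []
    if llm_cache_enabled:
        # Sync and async chat-completion calls.
        call_types += ["completion", "acompletion"]
    if embedding_cache_enabled:
        # Sync and async embedding calls.
        call_types += ["embedding", "aembedding"]
    return call_types


# Guard sketch: only construct the cache when at least one flag is set, e.g.
#   if call_types:
#       litellm.cache = Cache(type=LiteLLMCacheType.DISK,
#                             supported_call_types=call_types,
#                             disk_cache_dir=cache_dir)
```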

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description check — Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check — Passed: the title clearly and specifically addresses the main change: implementing embedding caching support compatible with Ragas 0.4, which is directly reflected in the PR's core objective of replacing an incompatible wrapper with a new CachedRagasEmbedding class.
  • Docstring coverage — Passed: docstring coverage is 100.00%, above the required 80.00% threshold.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
src/lightspeed_evaluation/core/embedding/ragas.py (3)

33-33: Consider adding cache cleanup support.

The diskcache.Cache instance is created but never explicitly closed. While diskcache handles this gracefully on garbage collection, long-running processes may benefit from explicit cleanup. Consider implementing __del__ or providing a close() method that downstream code can call.

♻️ Optional: Add cleanup method
     def __init__(self, embeddings: BaseRagasEmbedding, cache_dir: str, model_name: str):
         ...
         self._cache: Cache = Cache(cache_dir)
         ...
+
+    def close(self) -> None:
+        """Close the disk cache to release resources."""
+        self._cache.close()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/embedding/ragas.py` at line 33, The Cache
instance assigned to self._cache is never explicitly closed; add explicit
cleanup by implementing a public close() method that calls self._cache.close()
(and handles repeated calls idempotently), and optionally implement __del__ to
call close() so long-running processes can release resources; update the class
that creates self._cache (the one in ragas.py where "self._cache: Cache =
Cache(cache_dir)" appears) to include these methods and ensure any external
callers can call close() when finished.

67-89: Blocking cache operations in async method may impact event loop performance.

The aembed_text method uses synchronous self._cache.get and self._cache.set operations, which perform blocking I/O. In high-concurrency async workloads, this could briefly block the event loop. For typical usage with fast cache lookups, this is likely acceptable, but consider wrapping in asyncio.to_thread if performance issues arise.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/embedding/ragas.py` around lines 67 - 89, The
async method aembed_text currently calls the synchronous cache methods
self._cache.get and self._cache.set which can block the event loop; change those
calls to run in a thread (e.g., await asyncio.to_thread(self._cache.get,
cache_key) and await asyncio.to_thread(self._cache.set, cache_key, result)) and
ensure asyncio is imported so aembed_text awaits the threaded calls around cache
access while leaving the actual await self._embeddings.aembed_text call
unchanged.
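
The suggested change can be sketched as follows (a dict-backed cache stands in for diskcache.Cache so the example is self-contained; asyncio.to_thread moves the blocking get/set off the event loop, while the embedding call itself is awaited directly):

```python
import asyncio
import hashlib


class AsyncCachedEmbedding:
    """Sketch: async embedding lookup with cache I/O pushed to a worker thread."""

    def __init__(self, embeddings, model_name: str):
        self._embeddings = embeddings
        self._model_name = model_name
        self._cache: dict = {}  # stand-in for diskcache.Cache; real code calls .get/.set

    def _get_cache_key(self, text: str) -> str:
        return hashlib.sha256(f"{self._model_name}:{text}".encode()).hexdigest()

    async def aembed_text(self, text: str, **kwargs):
        key = self._get_cache_key(text)
        # Blocking cache reads run in a thread so the event loop stays free.
        cached = await asyncio.to_thread(self._cache.get, key)
        if cached is not None:
            return cached
        result = await self._embeddings.aembed_text(text, **kwargs)
        # __setitem__ here; diskcache.Cache.set in the real implementation.
        await asyncio.to_thread(self._cache.__setitem__, key, result)
        return result
```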

59-62: Inconsistent log levels between cache hit and miss.

Cache hit uses debug level (line 59) while cache miss uses info level (line 62). This inconsistency makes it harder to track cache behavior at a single log level. Consider using the same level for both, or using debug for hits and info for misses consistently across both sync and async methods (note: async method at line 81 uses info for hits).

♻️ Align log levels
         if cached_result is not None:
-            logger.debug("Embedding CACHE HIT for text (len=%d)", len(text))
+            logger.info("Embedding CACHE HIT for text (len=%d)", len(text))
             return cached_result
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lightspeed_evaluation/core/embedding/ragas.py` around lines 59 - 62, The
cache logging is inconsistent: the sync method logs "Embedding CACHE HIT for
text (len=%d)" at logger.debug while the miss logs "Embedding CACHE MISS for
text (len=%d) - calling API" at logger.info, and the async method logs hits at
info; make these consistent by choosing a single level (recommended:
logger.debug for hits and logger.info for misses) and update all occurrences of
the strings "Embedding CACHE HIT for text (len=%d)" and "Embedding CACHE MISS
for text (len=%d) - calling API" (both sync and async embedding methods that
reference logger and cached_result/text) so hits and misses use the agreed
levels across the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/embedding/ragas.py`:
- Around line 37-65: The cache key must include kwargs to avoid collisions:
change _get_cache_key to accept an additional kwargs: Dict[str, Any] parameter
and incorporate a deterministic serialization of kwargs (e.g.,
json.dumps(kwargs, sort_keys=True, separators=(',',':'))), falling back to a
stable repr for non-serializable values, then combine/sha256-hash that
serialization with self._model_name and the text to produce the final key;
update embed_text to call _get_cache_key(text, kwargs) before cache lookup and
ensure the same key generation is used wherever _get_cache_key/_cache are used
(symbols: _get_cache_key, embed_text, _model_name, _embeddings, _cache).

---

Nitpick comments:
In `@src/lightspeed_evaluation/core/embedding/ragas.py`:
- Line 33: The Cache instance assigned to self._cache is never explicitly
closed; add explicit cleanup by implementing a public close() method that calls
self._cache.close() (and handles repeated calls idempotently), and optionally
implement __del__ to call close() so long-running processes can release
resources; update the class that creates self._cache (the one in ragas.py where
"self._cache: Cache = Cache(cache_dir)" appears) to include these methods and
ensure any external callers can call close() when finished.
- Around line 67-89: The async method aembed_text currently calls the
synchronous cache methods self._cache.get and self._cache.set which can block
the event loop; change those calls to run in a thread (e.g., await
asyncio.to_thread(self._cache.get, cache_key) and await
asyncio.to_thread(self._cache.set, cache_key, result)) and ensure asyncio is
imported so aembed_text awaits the threaded calls around cache access while
leaving the actual await self._embeddings.aembed_text call unchanged.
- Around line 59-62: The cache logging is inconsistent: the sync method logs
"Embedding CACHE HIT for text (len=%d)" at logger.debug while the miss logs
"Embedding CACHE MISS for text (len=%d) - calling API" at logger.info, and the
async method logs hits at info; make these consistent by choosing a single level
(recommended: logger.debug for hits and logger.info for misses) and update all
occurrences of the strings "Embedding CACHE HIT for text (len=%d)" and
"Embedding CACHE MISS for text (len=%d) - calling API" (both sync and async
embedding methods that reference logger and cached_result/text) so hits and
misses use the agreed levels across the file.
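
The kwargs-aware key generation described in the inline comment could look roughly like this (json.dumps with sort_keys gives a deterministic serialization; the function name and fallback behavior are taken from the comment, the rest is a sketch):

```python
import hashlib
import json
from typing import Any, Dict


def get_cache_key(model_name: str, text: str, kwargs: Dict[str, Any]) -> str:
    """Build a cache key that includes the embedding kwargs to avoid collisions."""
    try:
        # Deterministic serialization: sorted keys, compact separators.
        kwargs_repr = json.dumps(kwargs, sort_keys=True, separators=(",", ":"))
    except TypeError:
        # Fall back to a stable repr for non-JSON-serializable values.
        kwargs_repr = repr(sorted(kwargs.items()))
    payload = f"{model_name}:{kwargs_repr}:{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```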
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f7d5cb5d-085a-4cb3-9b60-b3e491554952

📥 Commits

Reviewing files that changed from the base of the PR and between e978f08 and 03b081b.

📒 Files selected for processing (1)
  • src/lightspeed_evaluation/core/embedding/ragas.py

@asamal4 added the draft and bug labels Mar 27, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lightspeed_evaluation/core/metrics/ragas.py`:
- Around line 55-66: When building litellm.cache, respect the per-feature flags:
only include completion types ("completion","acompletion") in
supported_call_types when llm_cache_enabled is true, and only include embedding
types ("embedding","aembedding") when embedding_cache_enabled is true; if
neither flag is set, do not create the Cache. Also choose the disk_cache_dir
based on which feature is enabled (use llm_manager.get_config().cache_dir when
llm_cache_enabled, embedding_manager.get_config().cache_dir when only
embedding_cache_enabled, and prefer the LLM dir or a sensible common dir when
both are enabled). Update the Cache construction (Cache, LiteLLMCacheType.DISK,
supported_call_types, disk_cache_dir) accordingly and keep the existing
litellm.cache guard.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4981aacc-23f7-48d5-8c01-d2e7cbfecc45

📥 Commits

Reviewing files that changed from the base of the PR and between 9fb2439 and 64f2f2b.

📒 Files selected for processing (2)
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • src/lightspeed_evaluation/core/metrics/ragas.py
✅ Files skipped from review due to trivial changes (1)
  • src/lightspeed_evaluation/core/embedding/ragas.py


xmican10 commented Mar 30, 2026

@bsatapat-jpg this seems like a robust solution.

