[LEADS-232] Embedding caching handling with ragas 0.4 #199
bsatapat-jpg wants to merge 1 commit into lightspeed-core:main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough: The cache initialization logic for
Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/lightspeed_evaluation/core/embedding/ragas.py (3)
33-33: Consider adding cache cleanup support.

The `diskcache.Cache` instance is created but never explicitly closed. While `diskcache` handles this gracefully on garbage collection, long-running processes may benefit from explicit cleanup. Consider implementing `__del__` or providing a `close()` method that downstream code can call.

♻️ Optional: Add cleanup method

```diff
 def __init__(self, embeddings: BaseRagasEmbedding, cache_dir: str, model_name: str):
     ...
     self._cache: Cache = Cache(cache_dir)
     ...
+
+def close(self) -> None:
+    """Close the disk cache to release resources."""
+    self._cache.close()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lightspeed_evaluation/core/embedding/ragas.py` at line 33, The Cache instance assigned to self._cache is never explicitly closed; add explicit cleanup by implementing a public close() method that calls self._cache.close() (and handles repeated calls idempotently), and optionally implement __del__ to call close() so long-running processes can release resources; update the class that creates self._cache (the one in ragas.py where "self._cache: Cache = Cache(cache_dir)" appears) to include these methods and ensure any external callers can call close() when finished.
67-89: Blocking cache operations in async method may impact event loop performance.

The `aembed_text` method uses synchronous `self._cache.get` and `self._cache.set` operations, which perform blocking I/O. In high-concurrency async workloads, this could briefly block the event loop. For typical usage with fast cache lookups, this is likely acceptable, but consider wrapping in `asyncio.to_thread` if performance issues arise.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lightspeed_evaluation/core/embedding/ragas.py` around lines 67 - 89, The async method aembed_text currently calls the synchronous cache methods self._cache.get and self._cache.set which can block the event loop; change those calls to run in a thread (e.g., await asyncio.to_thread(self._cache.get, cache_key) and await asyncio.to_thread(self._cache.set, cache_key, result)) and ensure asyncio is imported so aembed_text awaits the threaded calls around cache access while leaving the actual await self._embeddings.aembed_text call unchanged.
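A self-contained sketch of the suggested change — a plain in-memory class stands in for `diskcache.Cache`, and the free function mirrors the method described in the review:

```python
import asyncio


class BlockingCache:
    """Stand-in for diskcache.Cache: synchronous get/set, like the real one."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


async def aembed_text(cache: BlockingCache, cache_key: str, compute):
    """Cache-aware async embedding lookup with blocking I/O moved off the loop."""
    cached = await asyncio.to_thread(cache.get, cache_key)
    if cached is not None:
        return cached
    result = await compute()  # the real code would await self._embeddings.aembed_text
    await asyncio.to_thread(cache.set, cache_key, result)
    return result


async def _demo():
    cache = BlockingCache()
    calls = 0

    async def fake_embed():
        nonlocal calls
        calls += 1
        return [0.1, 0.2]

    first = await aembed_text(cache, "k", fake_embed)
    second = await aembed_text(cache, "k", fake_embed)
    return first, second, calls


results = asyncio.run(_demo())  # the second lookup is served from the cache
```

`asyncio.to_thread` (Python 3.9+) runs the blocking call in the default thread pool and awaits its result, so the event loop stays responsive while the disk cache is touched.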
59-62: Inconsistent log levels between cache hit and miss.

Cache hit uses `debug` level (line 59) while cache miss uses `info` level (line 62). This inconsistency makes it harder to track cache behavior at a single log level. Consider using the same level for both, or using `debug` for hits and `info` for misses consistently across both sync and async methods (note: the async method at line 81 uses `info` for hits).

♻️ Align log levels

```diff
 if cached_result is not None:
-    logger.debug("Embedding CACHE HIT for text (len=%d)", len(text))
+    logger.info("Embedding CACHE HIT for text (len=%d)", len(text))
     return cached_result
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lightspeed_evaluation/core/embedding/ragas.py` around lines 59 - 62, The cache logging is inconsistent: the sync method logs "Embedding CACHE HIT for text (len=%d)" at logger.debug while the miss logs "Embedding CACHE MISS for text (len=%d) - calling API" at logger.info, and the async method logs hits at info; make these consistent by choosing a single level (recommended: logger.debug for hits and logger.info for misses) and update all occurrences of the strings "Embedding CACHE HIT for text (len=%d)" and "Embedding CACHE MISS for text (len=%d) - calling API" (both sync and async embedding methods that reference logger and cached_result/text) so hits and misses use the agreed levels across the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lightspeed_evaluation/core/embedding/ragas.py`:
- Around line 37-65: The cache key must include kwargs to avoid collisions:
change _get_cache_key to accept an additional kwargs: Dict[str, Any] parameter
and incorporate a deterministic serialization of kwargs (e.g.,
json.dumps(kwargs, sort_keys=True, separators=(',',':'))), falling back to a
stable repr for non-serializable values, then combine/sha256-hash that
serialization with self._model_name and the text to produce the final key;
update embed_text to call _get_cache_key(text, kwargs) before cache lookup and
ensure the same key generation is used wherever _get_cache_key/_cache are used
(symbols: _get_cache_key, embed_text, _model_name, _embeddings, _cache).
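The inline comment above can be sketched as follows; the function and parameter names are illustrative, not necessarily the project's actual ones:

```python
import hashlib
import json
from typing import Any, Dict


def get_cache_key(model_name: str, text: str, kwargs: Dict[str, Any]) -> str:
    """Deterministic cache key that includes the call kwargs."""
    try:
        serialized = json.dumps(kwargs, sort_keys=True, separators=(",", ":"))
    except TypeError:
        # Fall back to a stable repr for non-JSON-serializable values.
        serialized = repr(sorted(kwargs.items()))
    payload = f"{model_name}|{serialized}|{text}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Different kwargs no longer collide, and kwarg order does not matter.
base = get_cache_key("text-embed", "hello", {"dimensions": 256})
other = get_cache_key("text-embed", "hello", {"dimensions": 512})
repeat = get_cache_key("text-embed", "hello", {"dimensions": 256})
```

`json.dumps(..., sort_keys=True)` gives a canonical serialization, so two calls with the same kwargs in different order hash to the same key, while any change in kwargs changes the key.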
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f7d5cb5d-085a-4cb3-9b60-b3e491554952
📒 Files selected for processing (1)
src/lightspeed_evaluation/core/embedding/ragas.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lightspeed_evaluation/core/metrics/ragas.py`:
- Around line 55-66: When building litellm.cache, respect the per-feature flags:
only include completion types ("completion","acompletion") in
supported_call_types when llm_cache_enabled is true, and only include embedding
types ("embedding","aembedding") when embedding_cache_enabled is true; if
neither flag is set, do not create the Cache. Also choose the disk_cache_dir
based on which feature is enabled (use llm_manager.get_config().cache_dir when
llm_cache_enabled, embedding_manager.get_config().cache_dir when only
embedding_cache_enabled, and prefer the LLM dir or a sensible common dir when
both are enabled). Update the Cache construction (Cache, LiteLLMCacheType.DISK,
supported_call_types, disk_cache_dir) accordingly and keep the existing
litellm.cache guard.
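The flag logic described in this comment could be factored out roughly as below. This is a sketch: `litellm` itself is omitted so the example is self-contained, and the flag and directory names are the ones quoted in the review prompt:

```python
from typing import List, Optional, Tuple


def build_cache_settings(
    llm_cache_enabled: bool,
    embedding_cache_enabled: bool,
    llm_cache_dir: str,
    embedding_cache_dir: str,
) -> Optional[Tuple[List[str], str]]:
    """Decide the litellm supported_call_types and disk dir from feature flags."""
    call_types: List[str] = []
    if llm_cache_enabled:
        call_types += ["completion", "acompletion"]
    if embedding_cache_enabled:
        call_types += ["embedding", "aembedding"]
    if not call_types:
        return None  # neither flag set: do not create the Cache at all
    # Prefer the LLM dir when LLM caching is on (covers the both-enabled case).
    disk_cache_dir = llm_cache_dir if llm_cache_enabled else embedding_cache_dir
    return call_types, disk_cache_dir
```

The caller would only construct the litellm `Cache` when this returns a value, passing the resulting call types and directory, which keeps the existing `litellm.cache` guard intact.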
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4981aacc-23f7-48d5-8c01-d2e7cbfecc45
📒 Files selected for processing (2)
src/lightspeed_evaluation/core/embedding/ragas.py
src/lightspeed_evaluation/core/metrics/ragas.py
✅ Files skipped from review due to trivial changes (1)
- src/lightspeed_evaluation/core/embedding/ragas.py
@bsatapat-jpg this seems like a robust solution.
Description
The previous implementation used a custom CachedEmbedding class that wrapped LangChain embeddings. This approach was incompatible with Ragas 0.4+, which uses the BaseRagasEmbedding interface and embedding_factory().
What this PR adds:
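Pieced together from the review excerpts, the new caching wrapper might look roughly like the sketch below. The `BaseRagasEmbedding` interface and exact method names are assumptions taken from the review comments, and a plain dict stands in for the disk cache:

```python
import hashlib
from typing import Any, Dict, List


class CachedEmbedding:
    """Sketch: cache-through wrapper around a ragas-style embedding object."""

    def __init__(self, embeddings: Any, cache: Dict[str, Any], model_name: str) -> None:
        self._embeddings = embeddings
        self._cache = cache  # a dict here; diskcache.Cache in the real code
        self._model_name = model_name

    def _get_cache_key(self, text: str) -> str:
        payload = f"{self._model_name}|{text}"
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def embed_text(self, text: str) -> List[float]:
        key = self._get_cache_key(text)
        cached = self._cache.get(key)
        if cached is not None:
            return cached
        result = self._embeddings.embed_text(text)
        self._cache[key] = result
        return result


class FakeEmbeddings:
    """Counts calls so cache hits are observable."""

    def __init__(self) -> None:
        self.calls = 0

    def embed_text(self, text: str) -> List[float]:
        self.calls += 1
        return [float(len(text))]


fake = FakeEmbeddings()
wrapper = CachedEmbedding(fake, {}, "demo-model")
wrapper.embed_text("hi")
wrapper.embed_text("hi")  # served from the cache; no second embedding call
```

Keying on the model name plus the text means the same string embedded under a different model gets a fresh entry, which matches the collision concern raised in the inline comment.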
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit