fix: scan full session files for keyword hits #64

Merged
awsl233777 merged 1 commit into SakuraByteCore:main from SurviveM:fix/session-search-tail-query-scan
Apr 2, 2026

Conversation

@SurviveM
Collaborator

@SurviveM SurviveM commented Apr 2, 2026

Summary

  • scan session files line by line so keyword search can reach tail messages
  • stop treating the default content scan as a head-only byte slice
  • add an e2e regression for tail-only keyword hits that remain visible in session detail
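
To illustrate the kind of fixture a tail-only regression needs, here is a minimal sketch that writes a JSONL session file whose only keyword hit sits in the final record. The function name `writeLateKeywordFixture`, the file name, and the `{ role, content }` record shape are assumptions made for illustration; the PR's actual fixture code in tests/e2e/test-setup.js is not shown in this conversation and may differ.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not the PR's code): writes a JSONL session file in which
// the keyword appears only in the last record, after `paddingCount` filler records.
function writeLateKeywordFixture(dir, keyword, paddingCount = 32) {
  const records = [];
  for (let i = 0; i < paddingCount; i += 1) {
    records.push({
      role: i % 2 === 0 ? 'user' : 'assistant',
      // Filler text pushes the real hit far away from the head of the file.
      content: `padding message ${i} ` + 'x'.repeat(256),
    });
  }
  records.push({ role: 'assistant', content: `final answer mentions ${keyword}` });

  const filePath = path.join(dir, 'late-keyword.jsonl');
  const body = records.map((record) => JSON.stringify(record)).join('\n') + '\n';
  fs.writeFileSync(filePath, body, 'utf8');
  return filePath;
}
```

A search that only inspects the first few kilobytes of such a file would miss the keyword, which is the situation the regression is meant to guard against.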

Tests

  • npm run test:unit
  • npm run test:e2e

Summary by CodeRabbit

  • New Features

    • Session content searches now use an efficient streaming approach, improving performance when searching large session files
    • Sessions can now be searched throughout their entire content, not limited to initial segments
    • Better byte-budget control options for content scan operations
  • Bug Fixes

    • Enhanced error handling and recovery during session content search operations

@github-actions

github-actions bot commented Apr 2, 2026

@coderabbitai re-review
Stop making breaking changes, do a proper review!

@coderabbitai
Contributor

coderabbitai bot commented Apr 2, 2026

@github-actions[bot]: Understood. I'll conduct a proper re-review of PR #64, focusing strictly on correctness, safety, and quality within the existing design — no breaking changes, no refactoring suggestions, no API/interface/behavior alterations.

Re-review PR #64 (fix: scan full session files for keyword hits). Strictly limit all feedback to correctness, safety, and quality only within the existing design. Do not suggest any breaking changes, refactoring, or alterations to existing APIs, interfaces, or behavior. The fix involves scanning session files line by line instead of using a head-only byte slice, and adds an e2e regression test for tail-only keyword hits.


🧠 Learnings used
Learnt from: github-actions[bot]
Repo: SakuraByteCore/codexmate PR: 0
File: :0-0
Timestamp: 2026-03-30T04:47:15.339Z
Learning: In PR `#55` for SakuraByteCore/codexmate, github-actions[bot] repeatedly flagged that CodeRabbit was introducing breaking changes. The maintainer's absolute constraint is: no breaking changes in any review suggestion; limit all feedback to correctness, safety, and quality only within the existing design, without refactoring or altering APIs, interfaces, or behavior.

Learnt from: github-actions[bot]
Repo: SakuraByteCore/codexmate PR: 0
File: :0-0
Timestamp: 2026-03-31T12:37:50.988Z
Learning: In PR `#59` for SakuraByteCore/codexmate, github-actions[bot] repeatedly flagged that CodeRabbit was introducing breaking changes. The maintainer's absolute constraint is: no breaking changes in any review suggestion; limit all feedback to correctness, safety, and quality only within the existing design, without refactoring or altering APIs, interfaces, or behavior.

Learnt from: SurviveM
Repo: SakuraByteCore/codexmate PR: 0
File: :0-0
Timestamp: 2026-03-29T02:42:32.761Z
Learning: The repository maintainer SurviveM (GitHub: SurviveM, repo: SakuraByteCore/codexmate) prefers that CodeRabbit review suggestions do not introduce breaking changes. Reviews should focus on correctness, safety, and quality without recommending refactors or changes that would break existing APIs, interfaces, or behavior.
✅ Actions performed

Full review triggered.

@coderabbitai
Contributor

coderabbitai bot commented Apr 2, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Session content scanning refactored from synchronous fixed-head-segment loading to asynchronous streaming line-by-line JSONL parsing. New helper functions manage scan state, result building, and message filtering with byte budget control. Default byte budget behavior changed to 0. Test coverage added for late-appearing content queries in sessions.
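
The streaming approach described above can be sketched roughly as follows. This is not the PR's implementation: `scanFileForQuery`, its parameters, and the `{ role, content }` record shape are hypothetical, and only `fs.createReadStream` and `readline` come from the walkthrough. The sketch treats a byte budget of 0 as "scan the whole file", mirroring the changed default.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical sketch: stream a JSONL file line by line and report the first
// message whose content contains `query` (case-insensitive).
// maxBytes === 0 means no byte budget, so the entire file is scanned.
async function scanFileForQuery(filePath, query, maxBytes = 0) {
  const needle = query.toLowerCase();
  const streamOptions = { encoding: 'utf8' };
  if (maxBytes > 0) {
    streamOptions.end = maxBytes - 1; // `end` is an inclusive byte offset
  }
  const input = fs.createReadStream(filePath, streamOptions);
  const lines = readline.createInterface({ input, crlfDelay: Infinity });

  try {
    for await (const line of lines) {
      if (!line.trim()) continue;
      let record;
      try {
        record = JSON.parse(line);
      } catch {
        continue; // skip malformed lines, e.g. one cut off by the byte budget
      }
      const text = typeof record.content === 'string' ? record.content : '';
      const index = text.toLowerCase().indexOf(needle);
      if (index !== -1) {
        const start = Math.max(0, index - 20);
        return { hit: true, snippet: text.slice(start, index + needle.length + 20) };
      }
    }
    return { hit: false, snippet: '' };
  } finally {
    lines.close();
    input.destroy();
  }
}
```

Because lines are consumed one at a time, memory use stays bounded by the longest line rather than the file size, and the scan can stop as soon as a hit is found.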

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core Scanning Logic: cli.js | Replaced synchronous scanSessionContentForQuery with async streaming implementation using fs.createReadStream and readline. Introduced state management helpers (createSessionQueryScanState, consumeSessionQueryMessage, buildSessionQueryScanResult) for handling role filtering, token matching, and snippet accumulation. Updated callers (applySessionQueryFilter, listAllSessions, listAllSessionsData) to use async/await. Changed default contentScanBytes from SESSION_CONTENT_READ_BYTES to 0; added fallback for stream errors using scanSessionContentForQueryInRecords. |
| Test Fixtures & Coverage: tests/e2e/test-setup.js, tests/e2e/test-session-search.js | Extended e2e test setup to create "late-keyword" session fixture with 32 padded message records followed by target Chinese keywords. Added late content query test block asserting session-detail retrieval, list-sessions search with queryScope: 'content', and validation of match metadata with hit status and snippet extraction. |
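
The create/consume/build split mentioned in the summary can be pictured with a small sketch. The helper names below (`createScanState`, `consumeMessage`, `buildScanResult`) are deliberately different from the PR's and are hypothetical, as are the matching rule (every whitespace-separated token must appear somewhere among the accepted messages) and the snippet cap; the real helpers in cli.js may behave differently.

```javascript
// Hypothetical sketch of a scan-state trio; not the PR's actual helpers.

// Prepare lowercase tokens and an optional role allow-list.
function createScanState(query, roles = []) {
  const tokens = query.toLowerCase().split(/\s+/).filter(Boolean);
  return { tokens, roles: new Set(roles), matched: new Set(), snippets: [], scanned: 0 };
}

// Feed one parsed message into the state: apply the role filter, record which
// tokens it contains, and keep a short snippet when it contributes a match.
function consumeMessage(state, message) {
  state.scanned += 1;
  if (state.roles.size > 0 && !state.roles.has(message.role)) return;
  const text = String(message.content || '');
  const lower = text.toLowerCase();
  let matchedHere = false;
  for (const token of state.tokens) {
    if (lower.includes(token)) {
      state.matched.add(token);
      matchedHere = true;
    }
  }
  if (matchedHere && state.snippets.length < 3) {
    state.snippets.push(text.slice(0, 80)); // assumed cap: 3 snippets of 80 chars
  }
}

// Summarize the scan: a hit requires every token to have been seen.
function buildScanResult(state) {
  const hit = state.tokens.length > 0 && state.matched.size === state.tokens.length;
  return { hit, scanned: state.scanned, snippets: hit ? state.snippets : [] };
}
```

Keeping the state separate from the I/O loop means the same three functions could serve both a streaming reader and an in-memory fallback over already-parsed records.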

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Streams now dance where once we read,
Async bytes flow at perfect speed,
Late-keyword sessions finally found,
With snippets spinning all around!
A rabbit's hop through JSON land—
Line by line, oh isn't it grand!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and concisely summarizes the main change: switching session file scanning to process full files line-by-line instead of just reading a head segment, enabling keyword detection in tail messages. |

Collaborator

@awsl233777 awsl233777 left a comment

Reviewed the current fix. CI is green and the change stays within the intended behavior: it scans full session files for keyword hits and includes regression coverage for tail-only matches. Approving.

@awsl233777 awsl233777 merged commit f10af72 into SakuraByteCore:main Apr 2, 2026
4 checks passed