## Summary
Add intelligent conversation summarization as an alternative to simple trimming for long sessions.
**Priority:** Low. Only implement if users report losing important context with trimming.
## Context
PR #35 adds token trimming, which caps conversation history at 6K tokens. This works well for most cases but may lose important context in very long sessions.
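For reference, the current trimming behavior amounts to dropping the oldest messages until the history fits the budget. Here is a minimal, dependency-free sketch of that idea (the helper names and the 4-characters-per-token heuristic are illustrative assumptions, not the actual PR #35 code):

```python
def count_tokens_approximately(messages):
    # Rough heuristic: ~1 token per 4 characters of content.
    return sum(len(m["content"]) // 4 for m in messages)


def trim_to_budget(messages, max_tokens=6000):
    """Drop oldest messages until the history fits the token budget."""
    trimmed = list(messages)
    while len(trimmed) > 1 and count_tokens_approximately(trimmed) > max_tokens:
        trimmed.pop(0)  # discard the oldest message first
    return trimmed


history = [
    {"role": "user", "content": "x" * 16000},       # ~4000 tokens
    {"role": "assistant", "content": "y" * 16000},  # ~4000 tokens
    {"role": "user", "content": "latest question"},
]
kept = trim_to_budget(history)
print(len(kept))  # the oldest ~4000-token message is dropped -> 2 remain
```

The failure mode motivating this issue is visible here: whatever was in the dropped message is simply gone, no matter how important it was.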
## Proposed Implementation
Add a `SummarizationNode` that triggers when history exceeds a threshold:

```python
from langchain_core.messages import SystemMessage
from langchain_core.messages.utils import count_tokens_approximately


def summarize_if_needed(state: BaseAgentState) -> dict:
    messages = state.get("messages", [])
    token_count = count_tokens_approximately(messages)
    if token_count > 10000:  # threshold before summarizing
        # Keep the last 4 messages verbatim
        recent = messages[-4:]
        old = messages[:-4]
        # Summarize old messages (hidden from the user)
        summary_prompt = f"Summarize this conversation context concisely:\n{old}"
        summary = llm.invoke(summary_prompt)
        return {
            "messages": [
                SystemMessage(content=f"Previous context: {summary.content}"),
                *recent,
            ],
            "context_summary": summary.content,
        }
    return {}
```
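To make the control flow concrete without pulling in LangChain, here is a dependency-free sketch of the same logic with a stubbed LLM. All names here (`Message`, `StubLLM`, `approx_tokens`) are hypothetical stand-ins for the real LangChain objects:

```python
class Message:
    def __init__(self, role, content):
        self.role, self.content = role, content


def approx_tokens(messages):
    # Rough heuristic: ~1 token per 4 characters of content.
    return sum(len(m.content) // 4 for m in messages)


class StubLLM:
    def invoke(self, prompt):
        # Real code would call the model; here we return a canned summary.
        return Message("assistant", "User discussed project setup and deadlines.")


llm = StubLLM()


def summarize_if_needed(state):
    messages = state.get("messages", [])
    if approx_tokens(messages) > 10000:
        recent, old = messages[-4:], messages[:-4]
        summary = llm.invoke(f"Summarize this conversation context concisely:\n{old}")
        return {
            "messages": [
                Message("system", f"Previous context: {summary.content}"),
                *recent,
            ],
            "context_summary": summary.content,
        }
    return {}


# A long history (~12500 approx tokens) triggers summarization:
long_history = [Message("user", "x" * 10000) for _ in range(5)]
result = summarize_if_needed({"messages": long_history})
print(len(result["messages"]))  # 1 summary message + 4 recent -> 5
```

A short history returns `{}`, leaving the state untouched, so the node is a no-op until the threshold is crossed.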
Trade-offs
| Approach |
Pros |
Cons |
| Trimming (current) |
Zero extra LLM calls, fast |
May lose older context |
| Summarization |
Preserves key info |
Extra LLM call per summary |
## When to Implement
Only implement if users report losing important context with the current trimming approach.
## Related