Problem
AgentConfig::streaming exists as a field but is not implemented. Long LLM responses block for 10-30 seconds with no feedback. Applications need token-by-token streaming for responsive UIs.
Current State
- `streaming` field exists in `AgentConfig` (default: `false`)
- Not wired to httplib; the field is ignored
- No SSE (Server-Sent Events) parsing
- No streaming callback
Proposed Solution
SSE parser
Parse `text/event-stream` responses from OpenAI-compatible endpoints. Extract `data:` lines and handle the `[DONE]` sentinel.
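A minimal parser sketch for the above (class and member names are illustrative, not from the codebase): it buffers raw chunks, splits on newlines, forwards `data:` payloads to a handler, and stops at the `[DONE]` sentinel.

```cpp
#include <functional>
#include <string>

// Illustrative SSE parser: feed() raw chunks, get "data:" payloads back.
class SseParser {
public:
    using DataHandler = std::function<void(const std::string& payload)>;

    explicit SseParser(DataHandler onData) : onData_(std::move(onData)) {}

    // Feed a raw chunk; returns false once [DONE] has been seen.
    bool feed(const std::string& chunk) {
        buffer_ += chunk;
        size_t pos;
        while ((pos = buffer_.find('\n')) != std::string::npos) {
            std::string line = buffer_.substr(0, pos);
            buffer_.erase(0, pos + 1);
            if (!line.empty() && line.back() == '\r') line.pop_back();
            if (line.rfind("data:", 0) != 0) continue;  // skip non-data lines
            std::string payload = line.substr(5);
            if (!payload.empty() && payload.front() == ' ') payload.erase(0, 1);
            if (payload == "[DONE]") { done_ = true; return false; }
            onData_(payload);
        }
        return !done_;
    }

    bool done() const { return done_; }

private:
    DataHandler onData_;
    std::string buffer_;  // holds partial lines split across chunks
    bool done_ = false;
};
```

Buffering partial lines matters because chunked transfer gives no guarantee that an SSE event arrives in one piece.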
Streaming HTTP
Use httplib's streaming response support (a content-receiver callback on the request) with chunked transfer encoding. Feed each chunk to the SSE parser.
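A sketch of the glue, assuming cpp-httplib's content-receiver signature `bool(const char* data, size_t length)` (return `false` to cancel the transfer); the commented-out call shape is illustrative, not copied from the codebase.

```cpp
#include <functional>
#include <string>

// Matches cpp-httplib's ContentReceiver shape.
using ContentReceiver = std::function<bool(const char* data, size_t length)>;

// Wrap a string-based consumer (e.g. an SSE parser's feed()) as a receiver.
// The real call site would look roughly like:
//   httplib::Client cli("http://localhost:8080");
//   cli.Post("/v1/chat/completions", headers, body, "application/json",
//            makeReceiver([&](const std::string& c) { return parser.feed(c); }));
ContentReceiver makeReceiver(std::function<bool(const std::string&)> consume) {
    return [consume = std::move(consume)](const char* data, size_t length) {
        return consume(std::string(data, length));  // false stops the stream
    };
}
```

Returning `false` from the receiver is how the transfer gets cancelled once the SSE parser sees `[DONE]`.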
Token callback
```cpp
using StreamCallback = std::function<void(const std::string& token)>;
```
Called for each token as it arrives during LLM inference.
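Between the SSE payload and the callback sits one step: pulling the token text out of the OpenAI-style chunk JSON (`{"choices":[{"delta":{"content":"Hi"}}]}`). The extractor below is illustrative only — a naive string scan that ignores escaped quotes; a real implementation should use a proper JSON parser.

```cpp
#include <functional>
#include <string>

using StreamCallback = std::function<void(const std::string& token)>;

// Illustrative only: naive extraction of the "content" delta from an
// OpenAI-style streaming chunk. Does not handle escaped quotes.
std::string extractDelta(const std::string& payload) {
    const std::string key = "\"content\":\"";
    size_t start = payload.find(key);
    if (start == std::string::npos) return "";
    start += key.size();
    size_t end = payload.find('"', start);
    if (end == std::string::npos) return "";
    return payload.substr(start, end - start);
}
```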
OutputHandler integration
New virtual method:
```cpp
virtual void printStreamToken(const std::string& token);
```
TerminalConsole prints the token immediately; SilentConsole buffers it silently.
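A sketch of the hook (class names follow the issue; the classes' existing members are assumed, not copied from the codebase). Giving the base method an empty body means non-streaming handlers need no changes.

```cpp
#include <iostream>
#include <string>

class OutputHandler {
public:
    virtual ~OutputHandler() = default;
    // Default no-op: existing handlers keep working unmodified.
    virtual void printStreamToken(const std::string& /*token*/) {}
};

class TerminalConsole : public OutputHandler {
public:
    void printStreamToken(const std::string& token) override {
        std::cout << token << std::flush;  // print immediately, no newline
    }
};

class SilentConsole : public OutputHandler {
public:
    void printStreamToken(const std::string& token) override {
        buffered_ += token;  // accumulate silently
    }
    const std::string& buffered() const { return buffered_; }
private:
    std::string buffered_;
};
```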
Response accumulation
Collect all tokens into a complete response string so the existing multi-strategy JSON parser can operate on the full text once the stream completes.
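The accumulation step can be sketched as follows (names are illustrative): each token is forwarded to the live callback and appended to the full response.

```cpp
#include <functional>
#include <string>

// Illustrative: fan each token out to the live UI callback while also
// building the complete response for post-stream JSON parsing.
struct StreamAccumulator {
    std::function<void(const std::string&)> onToken;  // e.g. console printing
    std::string full;                                 // complete response

    void consume(const std::string& token) {
        if (onToken) onToken(token);  // live UI update
        full += token;                // keep full text for the JSON parser
    }
};
```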