
C++ Framework: Streaming response support #360

@kovtcharov

Description

Problem

AgentConfig::streaming exists as a field but is not implemented. Long LLM responses block for 10-30 seconds with no feedback. Applications need token-by-token streaming for responsive UIs.

Current State

  • streaming field exists in AgentConfig (default: false)
  • Not wired to httplib — the field is ignored
  • No SSE (Server-Sent Events) parsing
  • No streaming callback

Proposed Solution

SSE parser

Parse text/event-stream responses from OpenAI-compatible endpoints. Extract "data:" lines and handle the [DONE] sentinel that marks the end of the stream.
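A minimal sketch of the incremental parser described above (the class name SSEParser and handler shape are illustrative, not from the codebase). Since HTTP chunks can split an event at any byte boundary, input must be buffered until a complete line is available:

```cpp
#include <functional>
#include <string>

// Incremental SSE parser sketch. Chunks may split events at arbitrary
// byte boundaries, so input is buffered until a full line is available.
class SSEParser {
public:
    using DataHandler = std::function<void(const std::string& data)>;

    explicit SSEParser(DataHandler onData) : onData_(std::move(onData)) {}

    // Feed one transport chunk. Returns false once [DONE] is seen.
    bool feed(const std::string& chunk) {
        buffer_ += chunk;
        size_t pos;
        while ((pos = buffer_.find('\n')) != std::string::npos) {
            std::string line = buffer_.substr(0, pos);
            buffer_.erase(0, pos + 1);
            if (!line.empty() && line.back() == '\r') line.pop_back();
            if (line.rfind("data: ", 0) != 0) continue;  // skip blanks, comments
            std::string data = line.substr(6);
            if (data == "[DONE]") return false;          // end-of-stream sentinel
            onData_(data);
        }
        return true;  // partial line (if any) stays buffered
    }

private:
    DataHandler onData_;
    std::string buffer_;
};
```

Each complete "data:" payload is handed to the callback; anything after the last newline is retained for the next chunk.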

Streaming HTTP

Use httplib's ContentReceiver callback (accepted by the streaming Get/Post overloads) to receive chunked transfer-encoded data as it arrives. Feed each chunk to the SSE parser.
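httplib's ContentReceiver has the signature bool(const char* data, size_t length); returning false aborts the transfer. A rough sketch of the wiring (the endpoint, headers, and feedChunk helper are assumptions for illustration), with the receiver shape modeled standalone so the transport-to-parser glue can be shown without a network:

```cpp
#include <functional>
#include <string>

// The real call would look roughly like this (sketch, names assumed):
//
//   httplib::Client cli("https://api.openai.com");
//   cli.Post("/v1/chat/completions", headers, requestJson, "application/json",
//            [&](const char* data, size_t len) {
//                return feedChunk(std::string(data, len));  // -> SSE parser
//            });
//
// Same callback shape as httplib's ContentReceiver:
using ContentReceiver = std::function<bool(const char* data, size_t length)>;

// Hypothetical glue: append raw chunks to a sink; a real implementation
// would hand each chunk to the SSE parser instead.
ContentReceiver makeReceiver(std::string& sink) {
    return [&sink](const char* data, size_t length) {
        sink.append(data, length);
        return true;  // keep the transfer alive
    };
}
```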

Token callback

using StreamCallback = std::function<void(const std::string& token)>;

Called for each token as it arrives during LLM inference.

OutputHandler integration

New virtual method:

virtual void printStreamToken(const std::string& token);

TerminalConsole prints immediately; SilentConsole buffers silently.
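A sketch of that hook (TerminalConsole and SilentConsole are named in the issue; their member layout here is assumed). Giving the virtual a no-op default keeps existing OutputHandler subclasses source-compatible:

```cpp
#include <iostream>
#include <string>

class OutputHandler {
public:
    virtual ~OutputHandler() = default;
    // Default: do nothing, so existing handlers need no changes.
    virtual void printStreamToken(const std::string& token) { (void)token; }
};

class TerminalConsole : public OutputHandler {
public:
    void printStreamToken(const std::string& token) override {
        std::cout << token << std::flush;  // show each token as it arrives
    }
};

class SilentConsole : public OutputHandler {
public:
    void printStreamToken(const std::string& token) override {
        buffer_ += token;  // accumulate without printing
    }
    const std::string& buffered() const { return buffer_; }
private:
    std::string buffer_;
};
```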

Response accumulation

Collect all tokens into a complete response string for JSON parsing after the stream completes. The existing multi-strategy JSON parser operates on the complete response.
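The accumulation step can be sketched as follows (ResponseAccumulator is a hypothetical name): each token is forwarded to the caller's StreamCallback for the live UI while also being appended to the full response, which is handed to the existing multi-strategy JSON parser after the stream completes.

```cpp
#include <functional>
#include <string>

using StreamCallback = std::function<void(const std::string& token)>;

// Hypothetical accumulator: forwards each token to the caller's callback
// while building the complete response for post-stream JSON parsing.
class ResponseAccumulator {
public:
    explicit ResponseAccumulator(StreamCallback cb) : cb_(std::move(cb)) {}

    void onToken(const std::string& token) {
        full_ += token;        // keep the complete response
        if (cb_) cb_(token);   // live token for the UI
    }

    // Handed to the existing JSON parser once the stream ends.
    const std::string& complete() const { return full_; }

private:
    StreamCallback cb_;
    std::string full_;
};
```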

Metadata

Labels

cpp, enhancement (New feature or request), p1 (medium priority)
