
Fix SSE streams stalling by using dedicated OTP actors #62

Merged
dcrockwell merged 3 commits into develop from feature/sse-mist-upgrade on Mar 11, 2026

Conversation

@dcrockwell
Contributor

Why

The existing sse_response function used chunked transfer encoding under the hood. Mist converted Stream(yielder) to Chunked(yielder), so the yielder was iterated inside the Mist handler process. That process's mailbox also receives TCP messages (ACKs, window updates), so after 2–3 events the two would contend and the SSE stream would stall indefinitely.

This is the same class of problem Mist solved for WebSockets — long-lived connections need their own OTP actor with a dedicated mailbox.

What

New module: dream/servers/mist/sse

A complete SSE API that mirrors the WebSocket module's design:

  • upgrade_to_sse — upgrades an HTTP request to an SSE connection backed by a dedicated OTP actor (uses the same stash-and-upgrade pattern as websocket.gleam)
  • send_event — sends structured events to the client
  • Event builders — event, event_name, event_id, event_retry for constructing SSE events with a pipeline syntax
  • Action helpers — continue_connection, continue_connection_with_selector, stop_connection for controlling the actor lifecycle
  • Opaque types (SSEConnection, Event, Action) that hide Mist internals from user code
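
For readers unfamiliar with what the event builders ultimately produce, here is a minimal sketch of the SSE wire format in Python. It is not Dream's implementation: the `Event` class and `encode` function are hypothetical stand-ins, and the builder names are borrowed from the list above purely for illustration. The field layout (`event:`, `id:`, `retry:`, one `data:` line per payload line, blank line as terminator) follows the standard SSE format.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for an SSE event; not Dream's actual type."""
    data: str
    name: Optional[str] = None
    id: Optional[str] = None
    retry_ms: Optional[int] = None


def event(data: str) -> Event:
    return Event(data=data)


def event_name(ev: Event, name: str) -> Event:
    return replace(ev, name=name)


def event_id(ev: Event, id_: str) -> Event:
    return replace(ev, id=id_)


def event_retry(ev: Event, retry_ms: int) -> Event:
    return replace(ev, retry_ms=retry_ms)


def encode(ev: Event) -> str:
    """Serialize an event to the text/event-stream wire format."""
    lines = []
    if ev.name is not None:
        lines.append(f"event: {ev.name}")
    if ev.id is not None:
        lines.append(f"id: {ev.id}")
    if ev.retry_ms is not None:
        lines.append(f"retry: {ev.retry_ms}")
    # Each line of the payload gets its own "data:" field.
    for line in ev.data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"
```

Each builder returns a new value rather than mutating its input, which is what makes the pipeline style possible.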

Deprecation

response.sse_response is deprecated with a doc comment explaining the stalling bug and directing users to the new module. It is not removed — existing code continues to compile.

Testing

  • Unit tests for event builders and action wrappers
  • Tested documentation snippets (every code example from the docs compiles and runs)
  • Full example app (examples/sse/) with Cucumber integration tests that verify events stream without stalling

Documentation

  • New docs/guides/sse.md — comprehensive guide covering concepts, lifecycle, event builders, broadcasting, client-side EventSource, and testing
  • Updated docs/reference/streaming-api.md with the new API reference
  • Updated docs/guides/streaming.md to point to the new SSE guide

Version bump to 2.4.0 with CHANGELOG and release notes.

How

The new module follows the established stash-and-upgrade pattern from websocket.gleam:

  1. Handler stashes the raw Mist request in the process dictionary before calling the controller
  2. Controller calls upgrade_to_sse, which retrieves the stashed Mist request
  3. upgrade_to_sse calls mist.server_sent_events with wrapped on_init and on_message callbacks
  4. Mist spawns a dedicated OTP actor for the connection — SSE events and TCP messages no longer share a mailbox
  5. The Mist response is stashed back, and the handler returns it to Mist instead of the dummy Dream response
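
The five steps above can be sketched in Python as an analogy. This is not the Gleam code from this PR: thread-local storage stands in for the BEAM process dictionary, the dictionaries stand in for request and response types, and every function name other than `upgrade_to_sse` is hypothetical.

```python
import threading

# Thread-local storage plays the role of the process dictionary (analogy only).
_stash = threading.local()

DUMMY_RESPONSE = {"status": 200, "body": "placeholder"}


def upgrade_to_sse():
    """Hypothetical controller-side helper mirroring steps 2 to 4."""
    raw_request = _stash.raw_request  # step 2: retrieve the stashed raw request
    # Steps 3 and 4: a real server would start a dedicated connection actor here.
    _stash.upgraded_response = {
        "status": 200,
        "headers": {"content-type": "text/event-stream"},
        "path": raw_request["path"],
    }
    # The controller still returns an ordinary-looking response value.
    return DUMMY_RESPONSE


def sse_controller(request):
    return upgrade_to_sse()


def plain_controller(request):
    return {"status": 200, "body": "ok"}


def handler(raw_request, controller):
    _stash.raw_request = raw_request  # step 1: stash before calling the controller
    _stash.upgraded_response = None
    framework_response = controller({"path": raw_request["path"]})
    # Step 5: if an upgrade happened, return it instead of the dummy response.
    upgraded = _stash.upgraded_response
    return upgraded if upgraded is not None else framework_response
```

The controller never sees the raw server request, so the server's types stay out of user code while the handler can still hand the upgraded response back.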

User-facing callbacks (on_init, on_message) follow Dream's no-closures pattern — dependencies are passed explicitly rather than captured.

Test plan

  • Unit tests pass (gleam test)
  • Documentation snippets compile and run
  • Example app builds successfully
  • Integration tests (make test-integration in examples/sse/)
  • Manual verification: connect with EventSource and confirm events stream continuously
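
Manual verification could also be scripted. The following is a simplified sketch of an SSE stream parser in Python, assuming the stream arrives as an iterable of text lines. `parse_sse` is a hypothetical helper, not part of this PR, and it skips details a browser's EventSource handles (such as remembering the last event ID across events).

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE event found in a sequence of text lines."""
    fields: Dict[str, str] = {}
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events with no data.
            if data:
                fields["data"] = "\n".join(data)
                yield fields
            fields, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if name == "data":
            data.append(value)  # multiple data lines are joined with newlines
        else:
            fields[name] = value
```

Counting the events this yields over a fixed time window is one way to confirm the stream keeps flowing past the 2–3 events where the old implementation stalled.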

## Why This Change Was Made
- The existing `sse_response` function used chunked transfer encoding: Mist converted `Stream(yielder)` to `Chunked(yielder)`. The yielder blocked inside the Mist handler process, whose mailbox also receives TCP messages (ACKs, window updates), so SSE streams stalled after 2-3 events.
- A proper SSE implementation requires a dedicated OTP actor per connection with its own mailbox, matching how Mist handles WebSockets.

## What Was Changed
- Added `src/dream/servers/mist/sse.gleam` module with `upgrade_to_sse`, `send_event`, event builders (`event`, `event_name`, `event_id`, `event_retry`), and action helpers (`continue_connection`, `continue_connection_with_selector`, `stop_connection`)
- Follows the same stash-and-upgrade pattern as `websocket.gleam` — retrieves the raw Mist request from the process dictionary, calls `mist.server_sent_events`, stashes the resulting response
- Deprecated `response.sse_response` with doc comment directing users to the new module
- Added unit tests, tested documentation snippets, and a full `examples/sse/` example app with Cucumber integration tests
- Added `docs/guides/sse.md` guide, updated `docs/reference/streaming-api.md` and `docs/guides/streaming.md`
- Bumped version to 2.4.0, updated CHANGELOG.md, created release notes

## Note to Future Engineer
- Dream's `SSEConnection` wraps Mist's opaque `SSEConnection`, which in turn wraps an internal `Connection(body, socket, transport)`. If you ever need raw socket access (e.g., for `send_raw`), capture socket/transport from `mist_request.body` at upgrade time rather than cracking open the opaque type later. Your future self will thank you when Mist bumps a minor version.
- The old `sse_response` is deprecated but not removed. If you're reading this in 2028 wondering why it's still here — congratulations, you found the tech debt. The `data:` prefix adds itself, like a clingy coworker who cc's themselves on every email.
dcrockwell self-assigned this Mar 11, 2026
dcrockwell added the bug and enhancement labels Mar 11, 2026
## Why This Change Was Made
- CI was hanging indefinitely during SSE integration tests because the readiness check curled the `/events` SSE endpoint, which is a long-lived stream that never closes
- `curl -s http://localhost:8081/events` connects, receives the 200 + headers, then blocks forever waiting for the stream to end — so the readiness loop never advances

## What Was Changed
- Added a `/health` endpoint to the SSE example router that returns a plain 200 "ok" response
- Changed the Makefile readiness check from `/events` to `/health`

## Note to Future Engineer
- Every other example curls a normal HTTP endpoint for readiness. SSE and WebSocket endpoints are long-lived — never use them for health checks unless you enjoy watching CI spin like a loading screen from 2005.
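
A readiness check along these lines can be sketched in Python. This is not the Makefile from this PR: `wait_until_ready` and its parameters are hypothetical, and the only details taken from the commit are the `/health` path and port 8081. The point is that the probe targets an endpoint that finishes immediately and gives up after a bounded number of attempts.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url: str, attempts: int = 30, delay: float = 1.0,
                     timeout: float = 2.0) -> bool:
    """Poll a short-lived endpoint until it answers 200, or give up."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or a non-2xx status: retry.
            pass
        time.sleep(delay)
    return False
```

For example, `wait_until_ready("http://localhost:8081/health")` returns `False` after the attempts run out instead of blocking forever.
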
## Why This Change Was Made
- CI was hanging because the "SSE endpoint returns correct headers" scenario did a synchronous `HTTPoison.get` to the `/events` SSE endpoint
- SSE streams never end, and the server sends events every second, so `recv_timeout` never fires (data IS being received) — the call hangs forever

## What Was Changed
- Merged the header assertions into the "SSE endpoint streams events" scenario, which already uses async streaming
- The SSE connect step now captures response headers from `AsyncHeaders` instead of discarding them
- Added a new step definition `the SSE response header {string} should contain {string}` for header assertions on async SSE connections
- Removed the standalone headers scenario that used synchronous HTTP

## Note to Future Engineer
- Never use synchronous HTTP requests to test SSE or WebSocket endpoints. They stream forever. Use async connections and check headers from the handshake phase.
- If you're tempted to add `recv_timeout: 1_000` as a "fix" — it won't work. The server keeps sending data, so recv_timeout resets with every event. It's not a timeout problem, it's a "this stream literally never ends" problem.
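
The same idea, checking headers from the handshake and then hanging up, can be sketched in Python with the standard library. This is not the Cucumber step from this PR: `sse_headers` is a hypothetical helper, and the `/events` default path is the only detail borrowed from the example app.

```python
import http.client


def sse_headers(host: str, port: int, path: str = "/events",
                timeout: float = 5.0) -> dict:
    """Return the status and headers of a streaming response without reading its body."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Accept": "text/event-stream"})
        # getresponse() returns once the status line and headers are parsed.
        resp = conn.getresponse()
        headers = {name.lower(): value for name, value in resp.getheaders()}
        return {"status": resp.status, **headers}
    finally:
        # Close the connection; never call resp.read() on an endless stream.
        conn.close()
```

Because the body is never read, the call returns as soon as the headers arrive, regardless of how long the server keeps sending events.
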
dcrockwell merged commit 216ed17 into develop Mar 11, 2026
2 checks passed

Labels

bug, enhancement

1 participant