Memory leak in Next.js plugin: WeakMap in #requestsBySpanId retains IncomingMessage objects #7876
Description
Summary
PR #7000 introduced a memory leak in the Next.js plugin (packages/datadog-plugin-next/src/index.js). HTTP IncomingMessage objects are retained indefinitely because the WeakMap key (_spanId Identifier object) lives as long as the span context, which is held by the profiler and analytics pipeline.
Root Cause
PR #7000 changed request storage from a WeakMap keyed by span objects to a WeakMap keyed by span.context()._spanId (an Identifier object). The original implementation used a Map with explicit cleanup in finish(), but during review (commit 5035e06) this was changed to a WeakMap and the cleanup was removed under the assumption that WeakMap garbage collection would handle it.
The assumption is incorrect because:
- The WeakMap key is `span.context()._spanId`, an `Identifier` object that lives inside the span context.
- The span context is retained after `span.finish()` by the profiler (`ProfilingContext` cache), the analytics pipeline (`analytics: true`), and the trace exporter.
- As long as the span context is alive, `_spanId` is alive, and the WeakMap entry retaining the full `IncomingMessage` (with all headers, URL, body, Next.js request metadata) cannot be garbage collected.
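The retention chain described above can be modelled without dd-trace. The sketch below is a minimal illustration, not the plugin's code: `Identifier`, `retainedContexts`, and `startRequest` are hypothetical stand-ins for the span ID type, the profiler/analytics/exporter references, and the plugin's request hook. It shows that every request stays reachable through the WeakMap for as long as something else holds the key.

```javascript
// Hypothetical stand-in for dd-trace's span ID type.
class Identifier {}

// Mirrors the storage shape described in this issue: a WeakMap keyed by the span ID object.
const requestsBySpanId = new WeakMap()

// Hypothetical stand-in for whatever keeps span contexts alive after
// finish() (profiler cache, analytics pipeline, trace exporter).
const retainedContexts = []

function startRequest (req) {
  const spanContext = { _spanId: new Identifier() }
  requestsBySpanId.set(spanContext._spanId, req)
  retainedContexts.push(spanContext)
  return spanContext
}

for (let i = 0; i < 1000; i++) {
  startRequest({ url: `/page/${i}`, headers: { host: 'example.test' } })
}

// A WeakMap entry only becomes collectable once its key is unreachable.
// Every key is still reachable via retainedContexts, so every request is too.
const pinned = retainedContexts
  .filter(ctx => requestsBySpanId.has(ctx._spanId))
  .length
console.log(`${pinned} requests still reachable through the WeakMap`)
```

In this model the WeakMap behaves like a plain Map: nothing is released until the code holding the contexts lets them go.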
In the previous version (v5.58), the WeakMap was keyed by the span object itself, which had a shorter lifetime and was released when it left the async storage after finish().
Before (v5.58, no leak):

    this._requests = new WeakMap() // keyed by span object
    this._requests.set(span, req) // span is short-lived
    // No cleanup needed: span leaves async store after finish(), WeakMap drops entry

After (v5.87+, leak):

    #requestsBySpanId = new WeakMap() // keyed by _spanId Identifier
    this.#requestsBySpanId.set(spanId, req) // spanId lives as long as span context
    // No cleanup: spanId retained by profiler/analytics, req pinned indefinitely

Original version in PR #7000 (had working cleanup):

    this._requestsBySpanId = new Map() // Map with string keys
    this._requestsBySpanId.set(spanId, req)
    // In finish():
    this._requestsBySpanId.delete(spanId) // explicit cleanup, removed in review

Evidence
Heap snapshot comparison from a production Next.js portal (before vs after process restart):
| Allocator | Before Restart | After Restart | Leaked |
|---|---|---|---|
| `get originalRequest` (next/base-http/node.js) | 19 MiB | <1 MiB | -18 MiB |
| URL objects (node:internal/url) | 60 MiB | 6 MiB | -55 MiB |
| `structuredClone` | 18 MiB | 1 MiB | -17 MiB |
| `addRequestMeta` (next/request-meta.js) | 10 MiB | 1 MiB | -9 MiB |
| `requestHandlerImpl` (next/router-server.js) | 12 MiB | 4 MiB | -8 MiB |
| `_createContext` (dd-trace/span.js) | 14 MiB | 5 MiB | -8 MiB |
| HTTP incoming headers | 16 MiB | 5 MiB | -12 MiB |
| External (native) | 318 MiB | 160 MiB | -158 MiB |
The retained objects are all reachable through the IncomingMessage stored in the #requestsBySpanId WeakMap.
Suggested Fix
Add explicit cleanup in finish(), same as the original implementation had before it was removed:
    finish ({ req, res, nextRequest = {} }) {
      const store = storage('legacy').getStore()
      if (!store) return
      const span = store.span
      // ... existing error handling and tag setting ...
      span.finish()
      // Cleanup: remove the request reference so the IncomingMessage can be GC'd
      const spanId = span.context()._spanId
      this.#requestsBySpanId.delete(spanId)
    }

Environment
- dd-trace: Regression introduced after v5.58, confirmed present in v5.87 and v5.93
- Next.js: 16.1.6 (App Router)
- Node.js: 22.x
- Tracer config: `analytics: true`, `runtimeMetrics: true`, profiling enabled
Reproduction
- Deploy a Next.js App Router application with dd-trace >= 5.87
- Enable `analytics: true` or profiling in tracer config
- Send sustained traffic over several hours
- Compare heap snapshots: `IncomingMessage` objects accumulate proportionally to request count
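For the snapshot comparison step, one possible way to capture the data is sketched below. It uses only Node.js built-ins (`process.memoryUsage()` and `v8.writeHeapSnapshot()`); the helper names `sampleMemory`, `startMemoryLog`, and `installSnapshotSignal` are made up for this example, and the choice of `SIGUSR2` is an assumption (it is not available on Windows).

```javascript
const v8 = require('node:v8')

const MIB = 1024 * 1024

// Returns current heap, external (native), and resident memory in whole MiB.
function sampleMemory () {
  const { heapUsed, external, rss } = process.memoryUsage()
  return {
    heapUsedMiB: Math.round(heapUsed / MIB),
    externalMiB: Math.round(external / MIB),
    rssMiB: Math.round(rss / MIB)
  }
}

// Logs a sample every intervalMs. unref() keeps the timer from
// holding the process open on shutdown.
function startMemoryLog (intervalMs = 60_000) {
  const timer = setInterval(() => {
    console.log(new Date().toISOString(), sampleMemory())
  }, intervalMs)
  timer.unref()
  return timer
}

// Writes a .heapsnapshot file to the working directory on SIGUSR2.
// Writing a snapshot is synchronous and blocks the event loop, so
// trigger it sparingly on a production process.
function installSnapshotSignal () {
  process.on('SIGUSR2', () => {
    const file = v8.writeHeapSnapshot()
    console.log(`heap snapshot written to ${file}`)
  })
}
```

Calling `startMemoryLog()` and `installSnapshotSignal()` at startup, then sending `kill -USR2 <pid>` once early and once after several hours of traffic, yields two snapshot files that can be loaded into the Chrome DevTools Memory tab and compared.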
The leak does not occur with dd-trace 5.58, which used the span object (not the span ID) as the WeakMap key.
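Outside a full deployment, the effect of the suggested cleanup can be checked with a small model. This is a sketch under stated assumptions, not the real plugin: `PluginModel`, `makeSpan`, `holdsRequestFor`, and the `cleanup` flag are hypothetical, and the span context is deliberately kept alive to imitate the profiler and analytics pipeline.

```javascript
// Hypothetical stand-in for dd-trace's span ID type.
class Identifier {}

// Hypothetical, stripped-down model of the plugin's request bookkeeping.
class PluginModel {
  #requestsBySpanId = new WeakMap()

  constructor ({ cleanup }) {
    this.cleanup = cleanup
  }

  start (span, req) {
    this.#requestsBySpanId.set(span.context()._spanId, req)
  }

  finish (span) {
    span.finish()
    if (this.cleanup) {
      // The explicit delete proposed in the suggested fix.
      this.#requestsBySpanId.delete(span.context()._spanId)
    }
  }

  // Test helper: is a request still stored for this span's ID?
  holdsRequestFor (span) {
    return this.#requestsBySpanId.has(span.context()._spanId)
  }
}

// Fake span whose context (and therefore _spanId) outlives finish().
function makeSpan () {
  const context = { _spanId: new Identifier() }
  return { context: () => context, finish () {} }
}

const leaky = new PluginModel({ cleanup: false })
const fixed = new PluginModel({ cleanup: true })
const spanA = makeSpan()
const spanB = makeSpan()

leaky.start(spanA, { url: '/a' })
fixed.start(spanB, { url: '/b' })
leaky.finish(spanA)
fixed.finish(spanB)

console.log(leaky.holdsRequestFor(spanA)) // true: request still pinned
console.log(fixed.holdsRequestFor(spanB)) // false: entry removed in finish()
```

With the delete in place, the request is released at `finish()` regardless of how long the span context is held elsewhere.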