
Memory leak in Next.js plugin: WeakMap in #requestsBySpanId retains IncomingMessage objects #7876

@krazykeith

Summary

PR #7000 introduced a memory leak in the Next.js plugin (packages/datadog-plugin-next/src/index.js). HTTP IncomingMessage objects are retained indefinitely because the WeakMap key (_spanId Identifier object) lives as long as the span context, which is held by the profiler and analytics pipeline.

Root Cause

PR #7000 changed request storage from a WeakMap keyed by span objects to a WeakMap keyed by span.context()._spanId (an Identifier object). The first revision of that PR used a Map with explicit cleanup in finish(), but during review (commit 5035e06) the Map was replaced with a WeakMap and the cleanup was removed, on the assumption that garbage collection of WeakMap entries would make it unnecessary.

The assumption is incorrect because:

  1. The WeakMap key is span.context()._spanId — an Identifier object that lives inside the span context
  2. The span context is retained after span.finish() by the profiler (ProfilingContext cache), the analytics pipeline (analytics: true), and the trace exporter
  3. A WeakMap holds its values strongly for as long as the key is reachable. While the span context is alive, _spanId is alive, so the entry holding the full IncomingMessage (with all headers, URL, body, Next.js request metadata) cannot be garbage collected
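The retention chain described above can be sketched without dd-trace at all. This is a minimal illustration, not the plugin's code: `Identifier`, `SpanContext`, `profilerCache`, and `handleRequest` are hypothetical stand-ins for the real classes and for whatever long-lived component holds the span context.

```javascript
// Hypothetical stand-ins; these are NOT dd-trace's real classes.
class Identifier {
  constructor (value) { this.value = value }
}

class SpanContext {
  constructor (id) { this._spanId = new Identifier(id) }
}

const requestsBySpanId = new WeakMap()

// Simulates a long-lived holder of span contexts (assumed here to behave
// like the profiler cache or analytics pipeline described in the issue).
const profilerCache = []

function handleRequest (id) {
  const context = new SpanContext(id)
  const req = { url: `/page/${id}`, headers: { host: 'example.test' } }

  requestsBySpanId.set(context._spanId, req)
  profilerCache.push(context)
  // The "span" is finished once this function returns. No local reference
  // to req survives, yet the WeakMap entry is still reachable through
  // profilerCache -> context -> _spanId.
}

handleRequest(1)
handleRequest(2)

// Every request is still retrievable because every key is still alive.
const stillReachable = profilerCache.map(
  (context) => requestsBySpanId.get(context._spanId).url
)
console.log(stillReachable)
```

As long as `profilerCache` keeps a context, the matching request stays pinned; the WeakMap only helps once the key itself becomes unreachable.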

In the previous version (v5.58), the WeakMap was keyed by the span object itself, which had a shorter lifetime and was released when it left the async storage after finish().

Before (v5.58, no leak):

this._requests = new WeakMap()        // keyed by span object
this._requests.set(span, req)         // span is short-lived
// No cleanup needed — span leaves async store after finish(), WeakMap drops entry

After (v5.87+, leak):

#requestsBySpanId = new WeakMap()                    // keyed by _spanId Identifier
this.#requestsBySpanId.set(spanId, req)              // spanId lives as long as span context
// No cleanup — spanId retained by profiler/analytics, req pinned indefinitely

Original version in PR #7000 (had working cleanup):

this._requestsBySpanId = new Map()                   // Map with string keys
this._requestsBySpanId.set(spanId, req)
// In finish():
this._requestsBySpanId.delete(spanId)                // explicit cleanup — removed in review

Evidence

Heap snapshot comparison from a production Next.js portal (before vs after process restart):

| Allocator | Before restart | After restart | Change after restart |
| --- | --- | --- | --- |
| get originalRequest (next/base-http/node.js) | 19 MiB | <1 MiB | -18 MiB |
| URL objects (node:internal/url) | 60 MiB | 6 MiB | -55 MiB |
| structuredClone | 18 MiB | 1 MiB | -17 MiB |
| addRequestMeta (next/request-meta.js) | 10 MiB | 1 MiB | -9 MiB |
| requestHandlerImpl (next/router-server.js) | 12 MiB | 4 MiB | -8 MiB |
| _createContext (dd-trace/span.js) | 14 MiB | 5 MiB | -8 MiB |
| HTTP incoming headers | 16 MiB | 5 MiB | -12 MiB |
| External (native) | 318 MiB | 160 MiB | -158 MiB |

The retained objects are all reachable through the IncomingMessage stored in the #requestsBySpanId WeakMap.

Suggested Fix

Add explicit cleanup in finish(), same as the original implementation had before it was removed:

finish ({ req, res, nextRequest = {} }) {
  const store = storage('legacy').getStore()
  if (!store) return
  const span = store.span

  // ... existing error handling and tag setting ...

  span.finish()

  // Cleanup: Remove request reference so IncomingMessage can be GC'd
  const spanId = span.context()._spanId
  this.#requestsBySpanId.delete(spanId)
}
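The effect of the explicit delete can be checked in isolation. The sketch below assumes a simplified plugin shape: `RequestTracker`, `makeSpan`, and the `has()` helper are hypothetical names introduced for illustration, and only the bookkeeping around `#requestsBySpanId` is modelled.

```javascript
// Hypothetical, simplified model of the plugin's request bookkeeping.
class RequestTracker {
  #requestsBySpanId = new WeakMap()

  start (span, req) {
    this.#requestsBySpanId.set(span.context()._spanId, req)
  }

  finish (span) {
    span.finish()
    // Explicit cleanup: drop the entry even if the key object stays alive
    // elsewhere. WeakMap.prototype.delete returns true when an entry existed.
    return this.#requestsBySpanId.delete(span.context()._spanId)
  }

  has (span) {
    return this.#requestsBySpanId.has(span.context()._spanId)
  }
}

// Fake span whose context (and therefore _spanId) outlives finish().
function makeSpan () {
  const context = { _spanId: { value: 1 } }
  return {
    finished: false,
    context: () => context,
    finish () { this.finished = true }
  }
}

const tracker = new RequestTracker()
const span = makeSpan()

tracker.start(span, { url: '/' })
const trackedBefore = tracker.has(span)
const removed = tracker.finish(span)
const trackedAfter = tracker.has(span)

console.log({ trackedBefore, removed, trackedAfter })
```

Because the entry is removed in `finish()`, the request no longer depends on when, or whether, the span context itself is released.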

Environment

  • dd-trace: Regression introduced after v5.58, confirmed present in v5.87 and v5.93
  • Next.js: 16.1.6 (App Router)
  • Node.js: 22.x
  • Tracer config: analytics: true, runtimeMetrics: true, profiling enabled

Reproduction

  1. Deploy a Next.js App Router application with dd-trace >= 5.87
  2. Enable analytics: true or profiling in tracer config
  3. Send sustained traffic over several hours
  4. Compare heap snapshots — IncomingMessage objects accumulate proportionally to request count
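While the traffic in step 3 is running, a coarse growth check can help decide when to capture snapshots. The sketch below is only an assumption about how one might do this; `toMiB` and `diffMemory` are hypothetical helpers, and the only Node.js API used is `process.memoryUsage()`. It does not replace a heap snapshot (which Node.js can write with `v8.writeHeapSnapshot()`), since it cannot show what is retaining the memory.

```javascript
const MIB = 1024 * 1024

// Convert bytes to MiB, rounded to one decimal place.
function toMiB (bytes) {
  return Math.round((bytes / MIB) * 10) / 10
}

// Per-field growth between two process.memoryUsage() samples, in MiB.
function diffMemory (before, after) {
  const delta = {}
  for (const key of Object.keys(after)) {
    delta[key] = toMiB(after[key] - (before[key] ?? 0))
  }
  return delta
}

// Take a baseline at startup, then compare again after sustained traffic.
const baseline = process.memoryUsage()

// ... serve requests for a while ...

const growth = diffMemory(baseline, process.memoryUsage())
console.log('memory growth (MiB):', growth)
```

Steady growth in `heapUsed` and `external` that tracks request count would be consistent with the leak described here, but only a snapshot comparison can confirm that IncomingMessage objects are the retained data.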

The leak does not occur with dd-trace 5.58, which used the span object (not the span ID) as the WeakMap key.

Metadata

Labels: bug, integration-nextjs
