Skip to content

feat: Implement TracingChannel support#2089

Merged
PavelPashov merged 19 commits intoredis:mainfrom
logaretm:feat/tracing-channel
Apr 7, 2026
Merged

feat: Implement TracingChannel support#2089
PavelPashov merged 19 commits intoredis:mainfrom
logaretm:feat/tracing-channel

Conversation

@logaretm
Copy link
Copy Markdown
Contributor

@logaretm logaretm commented Mar 11, 2026

This PR implements tracing channels for command and connect operations. This allows APM libraries (OTEL, Datadog, Sentry) to instrument redis commands without monkey-patching.

It also enables users to setup their own custom instrumentations and analytics.

This is a port of redis/node-redis#3195 to ioredis, based on the proposal in redis/node-redis#2590.

Tracing Channels

All channels in this PR use the Node.js TracingChannel API, which provides start, end, asyncStart, asyncEnd, and error sub-channels automatically.

TracingChannel Tracks Context fields
ioredis:command Individual Redis command execution from queue to reply (standalone, MULTI, and pipeline) command, args, database, serverAddress, serverPort, and optionally batchMode, batchSize for MULTI/pipeline commands
ioredis:connect Client connection lifecycle (initial connections and reconnections) serverAddress, serverPort, connectionEpoch

Context Properties

The payload sent in the trace events closely follows what OTEL currently extracts as attributes in their instrumentations:

Field Source OTEL attribute it enables
command command name db.operation.name
args full args array db.query.text (APM serializes/sanitizes)
database selected DB db.namespace
serverAddress socket options host (or path for IPC) server.address
serverPort socket options port (undefined for IPC) server.port

For batch commands we include these properties:

Field Source OTEL attribute it enables
batchMode 'PIPELINE' or 'MULTI' db.operation.name prefix
batchSize number of user commands (excludes MULTI/EXEC protocol wrappers) db.operation.batch.size

For connect events:

Field Source Description
connectionEpoch internal reconnection counter 0 for initial connect, 1+ for reconnections. APMs can use this to distinguish initial connections, detect flapping, or correlate traces across reconnection cycles.

Backward Compatibility

Zero-cost when no subscribers are registered. Silently skipped on Node.js versions where TracingChannel is unavailable (e.g. Node 18).

const dc = ('getBuiltinModule' in process)
  ? process.getBuiltinModule('node:diagnostics_channel')
  : require('node:diagnostics_channel');

Usage Examples

Consumers will be doing something like this:

  const dc = require('node:diagnostics_channel');

  // Trace individual Redis commands
  dc.tracingChannel('ioredis:command').subscribe({
    start(ctx) {
      console.log(`→ ${ctx.command}`, ctx.args, `db=${ctx.database}`);
      // APM: create span here
      // ctx.span = tracer.startSpan(`redis ${ctx.command}`, { attributes: {
      //   'db.system': 'redis',
      //   'db.operation.name': ctx.command,
      //   'db.namespace': String(ctx.database),
      //   'server.address': ctx.serverAddress,
      //   'server.port': ctx.serverPort,
      // }});
    },
    asyncEnd(ctx) {
      console.log(`✓ ${ctx.command} done`);
      // ctx.span?.end();
    },
    error(ctx) {
      console.error(`✗ ${ctx.command} failed:`, ctx.error?.message);
      // ctx.span?.setStatus({ code: SpanStatusCode.ERROR });
    },
  });

  // Trace connection
  dc.tracingChannel('ioredis:connect').subscribe({
    start(ctx) {
      console.log(`Connecting to ${ctx.serverAddress}:${ctx.serverPort} (epoch: ${ctx.connectionEpoch})`);
    },
    asyncEnd() {
      console.log('Connected');
    },
    error(ctx) {
      console.error('Connection failed:', ctx.error?.message);
    },
  });

  // Works with all ioredis operations — no monkey-patching needed
  const redis = new Redis();
  await redis.set('foo', 'bar');          // ioredis:command { command: 'SET', args: ['foo', 'bar'] }
  await redis.get('foo');                  // ioredis:command { command: 'GET', args: ['foo'] }

  // Pipeline/MULTI commands include batch metadata
  const pipeline = redis.pipeline();
  pipeline.set('a', '1');
  pipeline.set('b', '2');
  await pipeline.exec();                  // ioredis:command { command: 'SET', batchMode: 'PIPELINE', batchSize: 2 }

Future Improvements

  • Offline queue tracing channel: A dedicated ioredis:command:queue channel could emit events when commands enter the offline queue, allowing APMs to measure queue wait time separately from command execution time and track how many commands hit the offline queue (a signal of connection instability). Note that the existing OTEL monkey-patch instrumentation (@opentelemetry/instrumentation-ioredis) has the same gap — it doesn't distinguish queued vs. written commands either.
  • Sentinel-aware server address: _getServerAddress() currently uses configured host/port options. For sentinel setups, the trace context could reflect the actual resolved endpoint rather than the configured defaults.

Note

Medium Risk
Touches core Redis.connect() and sendCommand() paths to wrap operations in tracing helpers; while intended to be zero-overhead when unused, mistakes could affect command execution flow or error handling. Adds argument sanitization rules to avoid leaking secrets, but incorrect redaction logic could still expose or over-redact data.

Overview
Adds Node.js diagnostics_channel/TracingChannel instrumentation for ioredis operations, emitting telemetry for individual commands (ioredis:command), MULTI batches (ioredis:batch), and connection attempts (ioredis:connect) when subscribers are present (silently no-op otherwise).

Redis.connect() is now traced via traceConnect, and sendCommand() wraps the written command promise with traceCommand exactly once per command (using a new Command.isTraced flag) to avoid duplicate events during offline-queue flushes or command resends after reconnect.

Introduces lib/tracing.ts with safe-by-default argument sanitization (redacting sensitive values via sanitizeArgs) and exported context types, updates docs with subscription examples, and adds comprehensive functional tests validating tracing behavior, deduplication, error tracing, and lack of unhandled rejections.

Written by Cursor Bugbot for commit 79d3220. This will update automatically on new commits. Configure here.

Copilot AI review requested due to automatic review settings March 11, 2026 22:44
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Node diagnostics_channel tracing support to ioredis so consumers can observe connect/command lifecycles (with near-zero overhead when unused), along with functional coverage and exported context types.

Changes:

  • Introduce lib/tracing.ts with traceCommand / traceConnect helpers and exported CommandContext/ConnectContext types.
  • Wrap Redis.connect() and Redis.sendCommand() to emit ioredis:connect and ioredis:command tracing events (including pipeline batch metadata).
  • Add functional tests validating tracing behavior (skipped on Node versions without TracingChannel).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
test/functional/tracing.ts New functional test suite for ioredis:command / ioredis:connect tracing channels (Node-gated).
lib/tracing.ts Tracing channel integration + context type definitions and runtime compatibility shim.
lib/Redis.ts Wrap connect/command promises with tracing; build tracing contexts including server/db/batch info.
lib/Pipeline.ts Annotate queued commands with batchMode and batchSize for tracing.
lib/index.ts Export tracing context types for downstream consumers.
lib/Command.ts Add optional batchMode/batchSize fields to Command for pipeline/transaction metadata.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +33 to +40
try {
const redis = new Redis({ lazyConnect: true });
await redis.connect();
await redis.set("tracing-test-key", "tracing-test-value");
const result = await redis.get("tracing-test-key");
expect(result).to.eql("tracing-test-value");
redis.disconnect();

Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redis.disconnect() is called inside the main try block, so if any assertion throws before it runs the test can leak an open socket/handle and make the suite hang or cascade-fail. Consider moving the disconnect/quit into a finally (or adding an afterEach that always disconnects created clients) so cleanup happens even on assertion failures.

Copilot uses AI. Check for mistakes.
Comment on lines +759 to +764
if ("path" in this.options && this.options.path) {
return { address: this.options.path, port: undefined };
}
return {
address: ("host" in this.options && this.options.host) || "localhost",
port: ("port" in this.options && this.options.port) || 6379,
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_getServerAddress() currently falls back to localhost:6379 whenever options.host/port aren’t explicitly set, which can be misleading for non-standalone setups (e.g., sentinel mode where defaults are present but not the actual endpoint). Since tracing contexts rely on this, it can report incorrect serverAddress/serverPort. Consider preferring the actual connected endpoint (this.stream.remoteAddress/remotePort when available) and/or handling sentinel options (e.g., first sentinel endpoint) so traces reflect the real target.

Suggested change
if ("path" in this.options && this.options.path) {
return { address: this.options.path, port: undefined };
}
return {
address: ("host" in this.options && this.options.host) || "localhost",
port: ("port" in this.options && this.options.port) || 6379,
// Unix domain socket: path represents the endpoint.
if ("path" in this.options && this.options.path) {
return { address: this.options.path, port: undefined };
}
// Prefer the actual connected endpoint when available.
const stream = this.stream as NetStream | undefined;
const remoteAddress =
stream && typeof (stream as any).remoteAddress === "string"
? (stream as any).remoteAddress
: undefined;
const remotePort =
stream && typeof (stream as any).remotePort === "number"
? (stream as any).remotePort
: undefined;
if (remoteAddress) {
return { address: remoteAddress, port: remotePort };
}
let host: string | undefined =
"host" in this.options && this.options.host
? this.options.host
: undefined;
let port: number | undefined =
"port" in this.options && this.options.port
? this.options.port
: undefined;
// For sentinel setups, fall back to the first sentinel endpoint
// if an explicit host/port is not set.
if (
(!host || !port) &&
"sentinels" in this.options &&
Array.isArray((this.options as any).sentinels) &&
(this.options as any).sentinels.length > 0
) {
const firstSentinel = (this.options as any).sentinels[0];
if (!host && firstSentinel && typeof firstSentinel.host === "string") {
host = firstSentinel.host;
}
if (!port && firstSentinel && typeof firstSentinel.port === "number") {
port = firstSentinel.port;
}
}
return {
address: host || "localhost",
port: port ?? 6379,

Copilot uses AI. Check for mistakes.
@logaretm
Copy link
Copy Markdown
Contributor Author

@tlhunter @Qard I followed the same patterns here as well.

@PavelPashov
Copy link
Copy Markdown
Contributor

@logaretm Thanks for the PR. I’ll take a look when I get a chance and follow up with feedback.

@logaretm
Copy link
Copy Markdown
Contributor Author

@PavelPashov Thanks! happy to work as much as needed to get this through as long as no one is opposed to the idea 🙏

P.S: I created a PR for node-redis in redis/node-redis#2590 which had the original issue discussion.

@logaretm
Copy link
Copy Markdown
Contributor Author

@PavelPashov No rush, but let me know if we can jump on a call to discuss the PR if that may help move things along. Otherwise, I will be waiting for your 👀

@PavelPashov
Copy link
Copy Markdown
Contributor

@PavelPashov No rush, but let me know if we can jump on a call to discuss the PR if that may help move things along. Otherwise, I will be waiting for your 👀

Hey @logaretm,

I'm on the official Redis discord server, feel free to message me @pavelpashov_16203.

I answered in this thread let's continue the github discussion there.

@logaretm
Copy link
Copy Markdown
Contributor Author

logaretm commented Apr 2, 2026

@PavelPashov i brought this PR to match node-redis one, it includes sanitization and the adjustments to the batch channel. Also added a doc section for it.

return !!channel && channel.hasSubscribers !== false;
}

const noop = () => {};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant local noop duplicates existing utility function

Low Severity

tracing.ts defines its own const noop = () => {}; despite the project already exporting an identical noop from utils/lodash.ts (used in Redis.ts and Pipeline.ts). This creates a redundant definition that could drift or confuse maintainers. Unless there's a circular-dependency concern, reusing the existing utility is cleaner.

Fix in Cursor Fix in Web

@PavelPashov
Copy link
Copy Markdown
Contributor

PavelPashov commented Apr 3, 2026

@logaretm I think that a plain pipeline is mostly a transport/write optimization. It changes how commands are flushed, but it does not create a higher-level semantic operation boundary. So I do not think we should promote pipeline membership into batch semantics. (Likely the pipeline batch type will be removed from node-redis as well)

MULTI is different. In the default pipeline: true path, ioredis is explicitly building a local MULTI ... EXEC operation and flushing it as one unit. That is the one place where a batch span makes sense semantically.

Because of that, I think the batch tracing should live in transaction.ts, not in the generic pipeline executor. For example:

      const batchSize = Math.max(this.length - 2, 0);
      for (let i = 0; i < this._queue.length; ++i) {
        this._queue[i].batchMode = "MULTI";
      }
      const promise = traceBatch(
        () => exec.call(pipeline),
        () => buildBatchContext(this.redis, batchSize)
      );

This keeps the model straightforward:

  • standalone commands: command traces
  • commands inside pipeline(): still command traces
  • multi() / multi({ pipeline: true }): one MULTI batch trace
  • multi({ pipeline: false }): no batch trace unless we explicitly decide to model connection-scoped transactions separately

I think that separation is both cleaner semantically and simpler to reason about than trying to infer batch semantics from generic pipeline execution.

Bottom line: lets remove batch tracing from Pipeline.ts and keep MULTI batch tracing in transaction.ts.

logaretm and others added 5 commits April 3, 2026 11:31
Add two TracingChannel instances (`ioredis:command` and `ioredis:connect`)
that APM libraries (OTEL, Datadog, Sentry) can subscribe to without
monkey-patching. Zero-cost when no subscribers are attached.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use WRONGTYPE error (HSET on string key) and assert that error events
are actually emitted with the expected context fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CommandContext → CommandTraceContext
ConnectContext → ConnectTraceContext
Added BatchCommandTraceContext as separate type extending CommandTraceContext

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PIPELINE and MULTI instead of pipeline and multi.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TracingChannel.hasSubscribers is undefined on Node versions that support
TracingChannel but predate the hasSubscribers property. A truthiness check
skips tracing on those versions. The new helper checks !== false so that
undefined (missing property) traces unconditionally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
logaretm and others added 12 commits April 3, 2026 11:31
For MULTI transactions, batchSize now counts only user commands.
EXEC already has inTransaction=false (exec() decrements _transactions
before sendCommand), and MULTI is excluded by name since multi()
increments _transactions before sendCommand runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Exposes connectionEpoch in the ioredis:connect trace context so APM
consumers can distinguish initial connections (epoch 0) from
reconnections (epoch 1+) and correlate traces across reconnection cycles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a command was queued offline, traceCommand wrapped command.promise
at queue time. When the connection became ready and the command was
re-sent via sendCommand, it was wrapped again — producing duplicate
start/end/error trace events and inflated latency in the first span.

Move tracing to the write path only so offline-queued commands are
traced once when actually sent to Redis.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Node's TracingChannel.tracePromise returns a wrapper promise that
re-rejects on error via PromiseReject(err). When Pipeline discards
the sendCommand return value, this wrapper has no rejection handler,
causing unhandledRejection events that don't exist without tracing.

Silence the wrapper with .catch(noop) so tracing remains transparent.
The error trace event still fires and callers that await the promise
still see the rejection through their own .then() chain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adapts sanitization rules from @opentelemetry/redis-common to redact
sensitive values before emitting them on tracing channels. Commands like
AUTH and HELLO get all args redacted, SET keeps only the key, HSET keeps
key + field, and read-only commands like GET/DEL remain fully visible.
Unlisted/custom commands default to full redaction for safety.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aligns with node-redis channel architecture: MULTI is now traced as a
single batch operation on ioredis:batch instead of per-command events on
ioredis:command. Pipeline commands remain per-command with batchMode and
batchSize annotations.

Also expands sanitizeArgs test coverage to match node-redis: adds tests
for SET with flags, SETEX prefix matching, LPUSH/PUBLISH/MSET/ZADD/XADD
key-only redaction, LSET/LINSERT key+field redaction, and full visibility
for SUBSCRIBE/CONFIG/EVAL/HMGET.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the three TracingChannel channels (ioredis:command,
ioredis:batch, ioredis:connect), argument sanitization behavior, and
exported context types. Also exports BatchOperationContext from the
package entry point.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TracingChannel requires Node.js >= 20.13.0, not 18.19.0. On older
versions the channels are silently unavailable. Also fixes a stale
comment that said "two tracing channels" when there are now three.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
TracingChannel was backported to Node 18.19.0. Fixes the docs and the
test skip logic which incorrectly required Node >= 20.13.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pipeline.redis can be a Cluster instance which doesn't have
_buildBatchContext. Check before calling to prevent runtime errors
when MULTI is used through a Cluster.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@logaretm logaretm force-pushed the feat/tracing-channel branch from 3233177 to f32abc0 Compare April 3, 2026 15:31
…on.ts

Per maintainer feedback, plain pipelines are transport optimizations
and should not carry batch semantics. Only MULTI creates a semantic
operation boundary worth tracing as a batch.

- Remove batchMode/batchSize from Command
- Remove BatchCommandTraceContext type
- Pipeline commands now emit plain ioredis:command traces
- MULTI batch tracing moved from Pipeline.ts to transaction.ts
- Commands within MULTI also appear on ioredis:command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@logaretm logaretm force-pushed the feat/tracing-channel branch from f32abc0 to 3654d9c Compare April 3, 2026 15:33
After a reconnect, commands from prevCommandQueue are re-sent through
sendCommand where writable is true, causing them to be traced a second
time. Add an isTraced flag on Command to ensure each command is only
traced once regardless of how many times it passes through sendCommand.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

return {
address: ("host" in this.options && this.options.host) || "localhost",
port: ("port" in this.options && this.options.port) || 6379,
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Falsy port value incorrectly defaults to 6379 in trace context

Low Severity

_getServerAddress uses || for the port fallback, which coerces any falsy port value (like 0) to 6379. The expression ("port" in this.options && this.options.port) || 6379 evaluates to 6379 when the configured port is 0 because 0 is falsy in JavaScript. Using nullish coalescing (??) instead of || would correctly preserve an explicit port of 0. The same pattern affects the address field with an empty-string host. The existing _getDescription() method in the same class handles this.options.port directly without this fallback issue.

Fix in Cursor Fix in Web

Copy link
Copy Markdown
Contributor

@PavelPashov PavelPashov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@PavelPashov PavelPashov merged commit 4760e0a into redis:main Apr 7, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants