
Anthropic LLMObs plugin crashes on extended thinking (thinking content blocks) in streaming responses #7890

@alejandronanez

Bug Report

Tracer Version(s)

5.93.0 (latest as of 2026-03-30)

Node.js Version(s)

22.x

Description

The Anthropic LLMObs plugin silently crashes when processing streaming responses from Claude models with extended thinking enabled (thinking: { type: 'enabled' }). After the first crash, the plugin is disabled for the entire process lifetime, resulting in all subsequent LLM Observability spans having empty input/output data and zero token metrics.

The Python tracer already has a fix for this: dd-trace-py#16821 ("feat(llmobs): add reasoning/extended thinking support for Anthropic and LiteLLM"), merged 2026-03-11. The JS tracer does not have an equivalent fix.

Root Cause

In packages/dd-trace/src/llmobs/plugins/anthropic.js, the streaming chunk handler accumulates chunks and processes them when done === true. The issue is in three handlers:

1. content_block_start (lines ~41-51) only handles text and tool_use block types:

```js
if (type === 'text') {
  response.content.push({ type, text: contentBlock.text })
} else if (type === 'tool_use') {
  response.content.push({ type, name: contentBlock.name, input: '', id: contentBlock.id })
}
// 'thinking' type falls through: nothing is pushed to response.content
```

2. content_block_delta (lines ~53-64) tries to append to the last content entry:

```js
if (text) response.content[response.content.length - 1].text += text
```

If a thinking block is the first content block, response.content is still empty, so response.content[response.content.length - 1] evaluates to undefined and the .text access throws a TypeError.

3. content_block_stop (lines ~66-72) also accesses the last entry:

```js
const type = response.content[response.content.length - 1].type
```

Same crash: response.content[response.content.length - 1] is undefined when the preceding block was a thinking block.
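The empty-array dereference is easy to demonstrate in isolation. The following is a self-contained sketch of the failure mode, simplified from the handlers above (function names are illustrative, not the plugin's actual code):

```js
// Simplified model of the streaming chunk handlers: a 'thinking'
// content_block_start pushes nothing, so the next content_block_delta
// dereferences undefined.
const response = { content: [] }

function onContentBlockStart (contentBlock) {
  const { type } = contentBlock
  if (type === 'text') {
    response.content.push({ type, text: contentBlock.text })
  } else if (type === 'tool_use') {
    response.content.push({ type, name: contentBlock.name, input: '', id: contentBlock.id })
  }
  // 'thinking' falls through: response.content stays empty
}

function onContentBlockDelta (delta) {
  // Crashes here when the stream opened with a 'thinking' block:
  // response.content[response.content.length - 1] is undefined
  response.content[response.content.length - 1].text += delta.text
}

onContentBlockStart({ type: 'thinking', thinking: '' })
try {
  onContentBlockDelta({ type: 'thinking_delta', thinking: 'Let me compute' })
} catch (err) {
  console.log(err.name) // TypeError
}
```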

The error is caught by the addSub wrapper in plugin.js (lines ~124-132), which calls this.configure(false), disabling the entire AnthropicLLMObsPlugin for the rest of the process. As a result, not just the current request but every subsequent Anthropic request in the process produces empty LLMObs data.

Reproduction

```js
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

// Enable dd-trace with LLMObs:
// DD_LLMOBS_ENABLED=1 DD_LLMOBS_ML_APP=test-app

const stream = client.messages.stream({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  thinking: { type: 'enabled', budget_tokens: 4096 },
  messages: [{ role: 'user', content: 'What is 2+2?' }],
});

for await (const event of stream) {
  // Consume events normally
}

const message = await stream.finalMessage();
console.log(message.content);

// Check Datadog LLM Observability → the span will show:
// - "No input or output messages"
// - "Total Tokens: 0"
```

Evidence from test cassettes

All existing test cassettes in the dd-trace-js repo explicitly use "thinking":{"type":"disabled"} in their request bodies, confirming thinking blocks have never been tested:

  • anthropic_v1_messages_post_a0f05e2e.yaml
  • anthropic_v1_messages_post_bc4df306.yaml
  • anthropic_v1_messages_post_08bab2b6.yaml

Suggested Fix

Port the approach from dd-trace-py#16821:

  1. Handle thinking blocks in content_block_start — push an entry to response.content (or track a separate index to skip)
  2. Handle thinking_delta in content_block_delta
  3. Safely handle content_block_stop when the current block is thinking
  4. Optionally capture thinking content as role: "reasoning" messages (matching the Python implementation)
  5. Add test cassettes with thinking: { type: 'enabled' } for both streaming and non-streaming
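A hedged sketch of steps 1-3 (function names and shapes are illustrative, not the plugin's actual code; delta shapes follow the Anthropic streaming event types text_delta, thinking_delta, and input_json_delta):

```js
// Illustrative handlers: every block type gets an entry, and the delta/stop
// handlers never dereference a missing last element.
function onContentBlockStart (response, contentBlock) {
  const { type } = contentBlock
  if (type === 'text') {
    response.content.push({ type, text: contentBlock.text ?? '' })
  } else if (type === 'tool_use') {
    response.content.push({ type, name: contentBlock.name, input: '', id: contentBlock.id })
  } else if (type === 'thinking') {
    // Track reasoning content in its own entry so later deltas have a target
    response.content.push({ type, thinking: contentBlock.thinking ?? '' })
  }
}

function onContentBlockDelta (response, delta) {
  const last = response.content[response.content.length - 1]
  if (!last) return // defensive: never let a malformed stream disable the plugin
  if (delta.type === 'text_delta') last.text += delta.text
  else if (delta.type === 'thinking_delta') last.thinking += delta.thinking
  else if (delta.type === 'input_json_delta') last.input += delta.partial_json
}

function onContentBlockStop (response) {
  const last = response.content[response.content.length - 1]
  if (last?.type === 'tool_use' && typeof last.input === 'string') {
    try { last.input = JSON.parse(last.input) } catch {} // keep raw string on parse failure
  }
}
```

Replaying a thinking-first stream through these handlers yields a thinking entry followed by a text entry instead of a TypeError.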

Workaround

Selectively disable only the LLMObs sub-plugin while keeping APM tracing active, then use manual llmobs.trace() + llmobs.annotate():

```js
const tracer = require('dd-trace').init();
tracer.use('anthropic', { llmobs: false });
```
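The manual instrumentation could then look roughly like this. This is a hedged sketch against the LLMObs SDK surface (llmobs.trace / llmobs.annotate); option names such as modelName and the metrics keys should be checked against the SDK docs for your tracer version, and tracedCompletion is a hypothetical helper:

```js
const { llmobs } = tracer;

async function tracedCompletion (client, params) {
  return llmobs.trace(
    { kind: 'llm', name: 'anthropic.messages', modelName: params.model, modelProvider: 'anthropic' },
    async () => {
      const message = await client.messages.create(params);
      // Annotate the active span with the data the auto-instrumentation would have captured
      llmobs.annotate({
        inputData: params.messages,
        outputData: message.content,
        metrics: {
          inputTokens: message.usage.input_tokens,
          outputTokens: message.usage.output_tokens,
        },
      });
      return message;
    }
  );
}
```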

Impact

Extended thinking is a key feature of Claude models. Any application using thinking: { type: 'enabled' } with streaming (which is the common case for chat UIs) will get zero LLM Observability data from the Anthropic auto-instrumentation.

Setup

  • dd-trace 5.93.0
  • @anthropic-ai/sdk 0.80.0
  • LLM Observability enabled via DD agent
  • Node.js 22.x
