
Anthropic LLMObs plugin crashes on extended thinking (thinking content blocks) in streaming responses #7890

@alejandronanez

Bug Report

Tracer Version(s)

5.93.0 (latest as of 2026-03-30)

Node.js Version(s)

22.x

Description

The Anthropic LLMObs plugin silently crashes when processing streaming responses from Claude models with extended thinking enabled (thinking: { type: 'enabled' }). After the first crash, the plugin is disabled for the entire process lifetime, resulting in all subsequent LLM Observability spans having empty input/output data and zero token metrics.

The Python tracer already has a fix for this: dd-trace-py#16821 ("feat(llmobs): add reasoning/extended thinking support for Anthropic and LiteLLM"), merged 2026-03-11. The JS tracer does not have an equivalent fix.

Root Cause

In packages/dd-trace/src/llmobs/plugins/anthropic.js, the streaming chunk handler accumulates chunks and processes them when done === true. The issue is in three handlers:

1. content_block_start (lines ~41-51) only handles text and tool_use block types:

```js
if (type === 'text') {
  response.content.push({ type, text: contentBlock.text })
} else if (type === 'tool_use') {
  response.content.push({ type, name: contentBlock.name, input: '', id: contentBlock.id })
}
// 'thinking' type falls through: nothing is pushed to response.content
```

2. content_block_delta (lines ~53-64) tries to append to the last content entry:

```js
if (text) response.content[response.content.length - 1].text += text
```

If a thinking block is the first content block, response.content is still empty, so response.content[response.content.length - 1] evaluates to undefined and the .text access throws a TypeError.

3. content_block_stop (lines ~66-72) also accesses the last entry:

```js
const type = response.content[response.content.length - 1].type
```

Same crash: response.content[response.content.length - 1] is undefined when the preceding block was a thinking block.
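The empty-array dereference is easy to demonstrate in isolation. The following is a self-contained sketch of the failure mode, simplified from the handlers above (function names are illustrative, not the plugin's actual code):

```js
// Simplified model of the streaming chunk handlers: a 'thinking'
// content_block_start pushes nothing, so the next content_block_delta
// dereferences undefined.
const response = { content: [] }

function onContentBlockStart (contentBlock) {
  const { type } = contentBlock
  if (type === 'text') {
    response.content.push({ type, text: contentBlock.text })
  } else if (type === 'tool_use') {
    response.content.push({ type, name: contentBlock.name, input: '', id: contentBlock.id })
  }
  // 'thinking' falls through: response.content stays empty
}

function onContentBlockDelta (delta) {
  // Crashes here when the stream opened with a 'thinking' block:
  // response.content[response.content.length - 1] is undefined
  response.content[response.content.length - 1].text += delta.text
}

onContentBlockStart({ type: 'thinking', thinking: '' })
try {
  onContentBlockDelta({ type: 'thinking_delta', thinking: 'Let me compute' })
} catch (err) {
  console.log(err.name) // TypeError
}
```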

The error is caught by the addSub wrapper in plugin.js (lines ~124-132), which calls this.configure(false), disabling the entire AnthropicLLMObsPlugin for the rest of the process. As a result, not just the current request but every subsequent Anthropic request in the process produces empty LLMObs data.

Reproduction

```js
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

// Enable dd-trace with LLMObs:
// DD_LLMOBS_ENABLED=1 DD_LLMOBS_ML_APP=test-app

const stream = client.messages.stream({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  thinking: { type: 'enabled', budget_tokens: 4096 },
  messages: [{ role: 'user', content: 'What is 2+2?' }],
});

for await (const event of stream) {
  // Consume events normally
}

const message = await stream.finalMessage();
console.log(message.content);

// Check Datadog LLM Observability → the span will show:
// - "No input or output messages"
// - "Total Tokens: 0"
```

Evidence from test cassettes

All existing test cassettes in the dd-trace-js repo explicitly use "thinking":{"type":"disabled"} in their request bodies, confirming thinking blocks have never been tested:

  • anthropic_v1_messages_post_a0f05e2e.yaml
  • anthropic_v1_messages_post_bc4df306.yaml
  • anthropic_v1_messages_post_08bab2b6.yaml

Suggested Fix

Port the approach from dd-trace-py#16821:

  1. Handle thinking blocks in content_block_start — push an entry to response.content (or track a separate index to skip)
  2. Handle thinking_delta in content_block_delta
  3. Safely handle content_block_stop when the current block is thinking
  4. Optionally capture thinking content as role: "reasoning" messages (matching the Python implementation)
  5. Add test cassettes with thinking: { type: 'enabled' } for both streaming and non-streaming
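A hedged sketch of steps 1-3 (function names and shapes are illustrative, not the plugin's actual code; delta shapes follow the Anthropic streaming event types text_delta, thinking_delta, and input_json_delta):

```js
// Illustrative handlers: every block type gets an entry, and the delta/stop
// handlers never dereference a missing last element.
function onContentBlockStart (response, contentBlock) {
  const { type } = contentBlock
  if (type === 'text') {
    response.content.push({ type, text: contentBlock.text ?? '' })
  } else if (type === 'tool_use') {
    response.content.push({ type, name: contentBlock.name, input: '', id: contentBlock.id })
  } else if (type === 'thinking') {
    // Track reasoning content in its own entry so later deltas have a target
    response.content.push({ type, thinking: contentBlock.thinking ?? '' })
  }
}

function onContentBlockDelta (response, delta) {
  const last = response.content[response.content.length - 1]
  if (!last) return // defensive: never let a malformed stream disable the plugin
  if (delta.type === 'text_delta') last.text += delta.text
  else if (delta.type === 'thinking_delta') last.thinking += delta.thinking
  else if (delta.type === 'input_json_delta') last.input += delta.partial_json
}

function onContentBlockStop (response) {
  const last = response.content[response.content.length - 1]
  if (last?.type === 'tool_use' && typeof last.input === 'string') {
    try { last.input = JSON.parse(last.input) } catch {} // keep raw string on parse failure
  }
}
```

Replaying a thinking-first stream through these handlers yields a thinking entry followed by a text entry instead of a TypeError.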

Workaround

Selectively disable only the LLMObs sub-plugin while keeping APM tracing active, then use manual llmobs.trace() + llmobs.annotate():

```js
const tracer = require('dd-trace').init();
tracer.use('anthropic', { llmobs: false });
```
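The manual instrumentation could then look roughly like this. This is a hedged sketch against the LLMObs SDK surface (llmobs.trace / llmobs.annotate); option names such as modelName and the metrics keys should be checked against the SDK docs for your tracer version, and tracedCompletion is a hypothetical helper:

```js
const { llmobs } = tracer;

async function tracedCompletion (client, params) {
  return llmobs.trace(
    { kind: 'llm', name: 'anthropic.messages', modelName: params.model, modelProvider: 'anthropic' },
    async () => {
      const message = await client.messages.create(params);
      // Annotate the active span with the data the auto-instrumentation would have captured
      llmobs.annotate({
        inputData: params.messages,
        outputData: message.content,
        metrics: {
          inputTokens: message.usage.input_tokens,
          outputTokens: message.usage.output_tokens,
        },
      });
      return message;
    }
  );
}
```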

Impact

Extended thinking is a key feature of Claude models. Any application using thinking: { type: 'enabled' } with streaming (which is the common case for chat UIs) will get zero LLM Observability data from the Anthropic auto-instrumentation.

Setup

  • dd-trace 5.93.0
  • @anthropic-ai/sdk 0.80.0
  • LLM Observability enabled via DD agent
  • Node.js 22.x
