CEL input: Add OTel tracing by chrisberkhout · Pull Request #48440 · elastic/beats

chrisberkhout · 2026-01-16T08:50:53Z

Proposed commit message

CEL input: Add OTel tracing (#)

Instruments the CEL Input with OpenTelemetry tracing. Sampling is 100%,
so every operation is covered. By default no exporter is set up and
traces are not exported. Export can be configured to go to the console or
to an OTLP endpoint using the `grpc` (default) or `http/protobuf`
protocols.
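
As a rough illustration of the exporter selection described above, the following sketch reads the two relevant variables and applies the stated defaults (no exporter, `grpc` for OTLP). The `exporterCfg` type and the `exporterCfgFromEnv` function are hypothetical names for this example and are not taken from the PR; the accepted values are limited to the ones mentioned in this description.

```go
package main

import (
	"fmt"
	"strings"
)

// exporterCfg is a hypothetical holder for the settings described above.
type exporterCfg struct {
	Exporter string // "none", "console" or "otlp"
	Protocol string // "grpc" or "http/protobuf"; only meaningful for "otlp"
}

// exporterCfgFromEnv reads the exporter selection through getenv.
// Defaults follow the description: no exporter, and grpc for OTLP.
func exporterCfgFromEnv(getenv func(string) string) (exporterCfg, error) {
	cfg := exporterCfg{Exporter: "none", Protocol: "grpc"}
	if v := strings.TrimSpace(getenv("OTEL_TRACES_EXPORTER")); v != "" {
		cfg.Exporter = strings.ToLower(v)
	}
	switch cfg.Exporter {
	case "none", "console", "otlp":
	default:
		return exporterCfg{}, fmt.Errorf("unsupported OTEL_TRACES_EXPORTER %q", cfg.Exporter)
	}
	if v := strings.TrimSpace(getenv("OTEL_EXPORTER_OTLP_TRACES_PROTOCOL")); v != "" {
		cfg.Protocol = strings.ToLower(v)
	}
	switch cfg.Protocol {
	case "grpc", "http/protobuf":
	default:
		return exporterCfg{}, fmt.Errorf("unsupported OTLP traces protocol %q", cfg.Protocol)
	}
	return cfg, nil
}

func main() {
	env := map[string]string{"OTEL_TRACES_EXPORTER": "otlp"}
	cfg, err := exporterCfgFromEnv(func(k string) string { return env[k] })
	fmt.Println(cfg.Exporter, cfg.Protocol, err)
}
```

Passing `getenv` as a parameter keeps the sketch testable without touching the process environment.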

Typically OTel tracing considers the whole process to be the "resource".
However, in this case the resource is the input instance. For that
reason a trace provider is created specifically for the input instance
and it is not explicitly set as the global tracer provider.

There is an extra environment variable to override any other
configuration and disable export for a specific input:
`BEATS_OTEL_TRACES_DISABLE=cel`.
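
A minimal sketch of how such an override could be checked. The description only shows the single value `cel`; treating the value as a comma-separated, case-insensitive list of input types is an assumption of this example, and `tracesDisabled` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// tracesDisabled reports whether inputType appears in the value of
// BEATS_OTEL_TRACES_DISABLE. Treating the value as a comma-separated,
// case-insensitive list is an assumption made for this sketch.
func tracesDisabled(value, inputType string) bool {
	for _, item := range strings.Split(value, ",") {
		if strings.EqualFold(strings.TrimSpace(item), inputType) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(tracesDisabled("cel", "cel")) // export disabled for the CEL input
	fmt.Println(tracesDisabled("", "cel"))    // variable empty: no override
}
```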

Spans covering HTTP requests are enriched with attributes for request
and response headers, with values automatically (but configurably)
redacted to protect sensitive data.

Normal request logging and Filebeat logs will include span and trace IDs
that allow correlation with the OTel data. This is done in any location
to which we can pass a logger from the trace creation site. Other
Filebeat logging will lack the IDs. Because logger attributes are
append-only we pass around a logger with modified attributes rather than
modify attributes in a global logger.

Normal request logging had unused functionality for including a
`trace.id` field. That has been removed in favor of an OTel-specific
implementation that adds `trace.id` and `span.id` if there is a current,
valid span.

Requests initiated by CEL will have spans added by `otelhttp` and will
identify the correct parent span using trace data from the request
context. Since the relevant eval-time context is not propagated to those
requests by mito, cel-go[1] or oauth2[2], `ContextInjector` is used to
rewrite each request to include the current context as it is processed.

[1]: https://github.com/google/cel-go/issues/557
[2]: https://github.com/golang/oauth2/issues/262

There were a couple of things for which the initial approach changed:

Use of https://pkg.go.dev/go.opentelemetry.io/contrib/exporters/autoexport to interpret OTel environment variables and set up the exporter was removed in favor of manual handling, which seems to be standard when using the Go SDK (unlike implementations in some other languages).
The context with OTel tracing data needs to be propagated to the HTTP client used by CEL so that HTTP spans are attached to the correct parent span. That was initially done with a change in Mito ("Add HTTPWithContextFnOpts so requests can have eval-time context", mito#118). That has been closed to avoid changing Mito. Now it is done in the CEL Input by having ContextInjector rewrite requests in the client used by CEL, which also solves the problem for OAuth2 requests.

There are some differences from the attribute and other names given in the planning document:

cel.periodic.program_count → Changed to cel.periodic.execution_count to match cel.program.execution.
cel.program.batch_count → Removed. It would only indicate whether an execution returned any events or not. Any other batching is internal to the CEL evaluation.
cel.{periodic,program}.success → Removed, in favor of span status.
cel.program.error_message → Not set. Uses SetStatus and RecordError instead.
BEATS_OTEL_TRACING_DISABLE → Changed to BEATS_OTEL_TRACES_DISABLE to match OTEL_TRACES_EXPORTER and OTEL_EXPORTER_OTLP_TRACES_*.

Handling of span-specific context and loggers is somewhat cumbersome. Refactoring to extract separate functions from run for separate stages of processing will help to tidy this up and is planned as follow-up work: #48464.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
I have added tests that prove my fix is effective or that my feature works. Where relevant, I have used the stresstest.sh script to run them under stress conditions and race detector to verify their stability.
I have added an entry in ./changelog/fragments using the changelog tool.

How to test this PR locally

You can use otel-desktop-viewer as a simple receiver and viewer of OTel traces:

# Install it
go install github.com/CtrlSpice/otel-desktop-viewer@latest

# Run it. It will open its web UI
otel-desktop-viewer

# In another terminal, set it as the destination for OTel traces
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4317

In the terminal with those environment variables set, you can run the input with an example that includes OAuth2 and multiple requests per period, like this:

(cd x-pack/filebeat && go build) && ./x-pack/filebeat/filebeat run -c <(echo '
filebeat.inputs:
- type: cel
  enabled: true
  id: cel-1
  interval: 5s
  resource.url: https://api.ipify.org/?format=json&passwd=mysecretword
  program: |
    get(state.url).Body.as(body, state.with({
        "events": [body.decode_json()],
        "want_more": int(state.?runcount.orValue(1)) % 3 != 0,
        "runcount": int(state.?runcount.orValue(1)) + 1,
    }))
  resource.tracer.enable: true
  resource.tracer.filename: "x-pack/filebeat/logs/cel/http-request-trace-cel-*.ndjson"
  auth.oauth2.enabled: true
  auth.oauth2.client.id: someclientid
  auth.oauth2.client.secret: someclientsecret
  auth.oauth2.scopes: scope.me
  auth.oauth2.token_url: https://oauth-mock.mock.beeceptor.com/oauth/token/github
  auth.oauth2.endpoint_params:
    grant_type: client_credentials
  otel.trace.redacted:
    - User-Agent
  otel.trace.unredacted:
    - Authorization
output.elasticsearch:
  hosts: ["https://elasticsearch:9200"]
  username: "elastic"
  password: "changeme"
  protocol: "https"
  ssl.verification_mode: "none"
  preset: balanced
logging.level: debug
logging.to_stderr: true
')

You can also use Elastic Observability to receive and view OTel traces, but it involves a bit more setup.

Bring up the Elastic Stack:

elastic-package stack up -v

In Kibana, go to "Management > Integrations" and go to the "APM" integration page. Click "Manage APM integration in Fleet", then "Add Elastic APM". Under "Configure integration > Integration settings > General > Server configuration", change the Host and URL settings to use '0.0.0.0' instead of 'localhost'. Under "Where to add this integration?", choose "Existing hosts > Elastic Agent (elastic-package)". Then click "Save and continue".

Now, back in the terminal, find the IP address of the agent container.

docker ps # confirm the agent container name is elastic-package-stack-elastic-agent-1
AGENT="elastic-package-stack-elastic-agent-1"
AGENT_IP=$(docker inspect "$AGENT" \
  --format '{{ (index .NetworkSettings.Networks "elastic-package-stack_default").IPAddress }}')
echo "$AGENT_IP" # confirm the IP was found

Use that as the destination for OTel traces:

export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_TRACES_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="http://$AGENT_IP:8200"

Then from the terminal with those settings you can run the input using example Filebeat configuration as above.

To view the exported traces in Kibana, go to "Observability > Applications > Traces".

Use cases

This tracing is to be used for troubleshooting, particularly for Agentless.

Screenshots

OTel traces for the CEL Input in Elastic Observability:

github-actions · 2026-01-16T08:51:02Z

🤖 GitHub comments

Just comment with:

run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

mergify · 2026-01-16T08:51:51Z

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b cel-otel-tracing upstream/cel-otel-tracing
git merge upstream/main
git push upstream cel-otel-tracing

mergify · 2026-01-16T08:51:51Z

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @chrisberkhout? 🙏.
For such, you'll need to label your PR with:

The upcoming major version of the Elastic Stack
The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
backport-active-all is the label that automatically backports to all active branches.
backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

mergify · 2026-01-23T19:39:49Z

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b cel-otel-tracing upstream/cel-otel-tracing
git merge upstream/main
git push upstream cel-otel-tracing

github-actions · 2026-01-30T16:14:41Z

🔍 Preview links for changed docs

docs/reference/filebeat/filebeat-input-cel.md

elasticmachine · 2026-02-02T16:38:58Z

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)

x-pack/filebeat/input/cel/input.go

x-pack/filebeat/input/internal/httplog/roundtripper.go

… GetExporterTypeFromEnv() for the case of metrics off, traces on". The change in "Tidy up GetExporterTypeFromEnv() for the case of metrics off, traces on" was to require OTEL_METRICS_EXPORTER to be set, rather than defaulting it to otlp because an OTEL_EXPORTER_OTLP_ENDPOINT was set. This is important because otherwise configuration to send traces to an OTLP endpoint would activate the metrics export, which the endpoint may not be able to handle.
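
The rule described in that commit message might look roughly like the sketch below. `metricsExporterType` is a hypothetical stand-in and not the real `GetExporterTypeFromEnv()`, whose signature is not shown here; returning an empty string to mean "metrics export off" is also an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// metricsExporterType returns the metrics exporter type, or "" when
// metrics export is off. It looks only at OTEL_METRICS_EXPORTER: an
// OTLP endpoint being configured is deliberately not enough to turn
// metrics on, since that endpoint may be intended for traces only.
func metricsExporterType(getenv func(string) string) string {
	return strings.ToLower(strings.TrimSpace(getenv("OTEL_METRICS_EXPORTER")))
}

func main() {
	// Traces are sent to an OTLP endpoint; metrics are not requested.
	env := map[string]string{
		"OTEL_TRACES_EXPORTER":        "otlp",
		"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
	}
	got := metricsExporterType(func(k string) string { return env[k] })
	fmt.Printf("metrics exporter: %q\n", got)
}
```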

Make the tracer provider injectable via an optional field on the input struct so tests can capture spans with a SpanRecorder. Add three tests covering the happy path (single execution), the want_more loop (two executions per period), and an evaluation error, asserting on span names, parent-child relationships, attributes, and status codes.

The trace span tests were waiting for a fixed 5s timeout even when the first periodic run had already completed. This updates the tests to cancel when the root periodic run span (cel.periodic.run) is observed as ended in the span recorder. The timeout remains as a safety guard for regressions, but no longer drives normal test completion.

Why this is safe:
- assertions in these tests target spans and attributes produced inside that completed periodic run
- cancellation now happens after the root run trace is complete, avoiding the earlier cancellation race during publish

Result: tests keep the same tracing assertions while running in ~0.01s per test instead of ~5s.

x-pack/filebeat/otel/exporter_factory.go

x-pack/filebeat/input/cel/input.go

belimawr · 2026-03-17T13:58:05Z

x-pack/filebeat/otel/trace.go

+	if n < 0 {
+		return 0
+	}
+	return time.Duration(n) * time.Millisecond


This looks a bit odd to me. Why are you converting n to time.Duration if n is the number of milliseconds?

You can multiply a number directly by time.Millisecond and you'll have a time.Duration

Suggested change

-	return time.Duration(n) * time.Millisecond
+	return n * time.Millisecond

Thanks. Yeah, there's no hidden reason.

Actually, it's required to make the types work, so I'm reverting to leave it as it was.

Something like 55 * time.Millisecond would work because 55 is an untyped constant.

Apparently time.Duration(n) * time.Millisecond is idiomatic and consistent with the library design.

This is language behaviour; n is not a time.Duration while time.Millisecond is, and Go does not have C-like implicit arithmetic type coercion.

x-pack/filebeat/otel/trace.go

belimawr · 2026-03-17T14:01:52Z

x-pack/filebeat/otel/trace.go

+		found = true
+		b, err := strconv.ParseBool(strings.TrimSpace(raw))
+		if err != nil {
+			return false, true, fmt.Errorf("%s must be boolean: %w", k, err)


[Question]

Does it make sense to return found = true on an error? The key is present, but the value is invalid... Could you elaborate more on the intended behaviour here?

I think not. I'll change that.
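
The behaviour agreed on here (do not report `found` when the value is invalid) could be sketched as follows. `boolFromEnv` and its lookup-function parameter are hypothetical; only the error message format is taken from the quoted diff.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// boolFromEnv returns the boolean value of the first key that is set.
// found is true only when a key was present and its value parsed
// cleanly; a present but invalid value yields found == false and an error.
func boolFromEnv(lookup func(string) (string, bool), keys ...string) (val, found bool, err error) {
	for _, k := range keys {
		raw, ok := lookup(k)
		if !ok {
			continue
		}
		b, err := strconv.ParseBool(strings.TrimSpace(raw))
		if err != nil {
			return false, false, fmt.Errorf("%s must be boolean: %w", k, err)
		}
		return b, true, nil
	}
	return false, false, nil
}

func main() {
	env := map[string]string{"OTEL_EXPORTER_OTLP_INSECURE": "maybe"}
	lookup := func(k string) (string, bool) { v, ok := env[k]; return v, ok }
	val, found, err := boolFromEnv(lookup, "OTEL_EXPORTER_OTLP_INSECURE")
	fmt.Println(val, found, err != nil)
}
```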

belimawr · 2026-03-17T14:32:32Z

x-pack/filebeat/otel/trace_test.go

+}
+
+func TestNewExporterCfgFromEnv_ExporterDefaultsToNone(t *testing.T) {
+	// unset BEATS_OTEL_TRACES_DISABLE


I don't get this comment, should it be removed?

Suggested change

-	// unset BEATS_OTEL_TRACES_DISABLE

The tests above and below set environment variables and make assertions about how the configuration is read from them.

For this test no setup is necessary. The comment is meant to indicate that the test makes assertions about the behavior when this variable is not set.

I don't mind removing it. What do you think?

x-pack/filebeat/otel/trace_test.go

belimawr · 2026-03-17T14:40:24Z

x-pack/filebeat/otel/trace_test.go

+
+func TestNewExporterCfgFromEnv_NotInsecureByDefault(t *testing.T) {
+	t.Setenv("OTEL_EXPORTER_OTLP_ENDPOINT", "otlp-receiver.example.com:4317")
+	// unset OTEL_EXPORTER_OTLP_INSECURE


Again, I'm confused by this comment.

Same again. Could make it look less magical, like this:

// With OTEL_EXPORTER_OTLP_INSECURE not set.

chrisberkhout self-assigned this Jan 16, 2026

chrisberkhout added enhancement Filebeat Filebeat Team:Security-Service Integrations Security Service Integrations Team labels Jan 16, 2026

botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Jan 16, 2026

chrisberkhout mentioned this pull request Jan 19, 2026

x-pack/filebeat/input/cel: misc refactoring #48464

Open

efd6 mentioned this pull request Jan 19, 2026

Add HTTPWithContextFnOpts so requests can have eval-time context elastic/mito#118

Closed

chrisberkhout force-pushed the cel-otel-tracing branch 2 times, most recently from 0adb194 to 41ebd88 Compare January 22, 2026 15:26

chrisberkhout force-pushed the cel-otel-tracing branch from 41ebd88 to 04a16c2 Compare January 30, 2026 16:13

github-actions bot deployed to docs-preview January 30, 2026 16:13 View deployment

chrisberkhout force-pushed the cel-otel-tracing branch from 04a16c2 to 14673f7 Compare February 2, 2026 15:39

github-actions bot deployed to docs-preview February 2, 2026 15:39 View deployment

github-actions bot deployed to docs-preview February 2, 2026 16:37 View deployment

chrisberkhout marked this pull request as ready for review February 2, 2026 16:38

chrisberkhout requested review from a team as code owners February 2, 2026 16:38

chrisberkhout requested review from faec and leehinman February 2, 2026 16:38

efd6 reviewed Feb 2, 2026

View reviewed changes

pierrehilbert added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Feb 3, 2026

chrisberkhout and others added 10 commits March 13, 2026 16:00

Guard to avoid an empty endpoint URL being transformed to a non-empty…

e9d04ef

… URL.

Fix test for single string valued url.full.

b46cf8a

Make update.

ad09afe

Tidy changelog entry.

940ace2

Update documentated release type and stack version.

8b94992

Emphasize the requirement of OTEL_TRACES_EXPORTER=otlp in doc.

3705c91

Make product lifecycle state be beta.

966d11f

chrisberkhout force-pushed the cel-otel-tracing branch from 48e1662 to 966d11f Compare March 13, 2026 15:09

github-actions bot deployed to docs-preview March 13, 2026 15:09 View deployment

andrewkroh reviewed Mar 17, 2026

View reviewed changes

x-pack/filebeat/otel/exporter_factory.go Outdated Show resolved Hide resolved

x-pack/filebeat/input/cel/input.go Outdated Show resolved Hide resolved

chrisberkhout added 2 commits March 17, 2026 11:07

Handle OTEL_EXPORTER_OTLP_METRICS_ENDPOINT as well as OTEL_EXPORTER_O…

b30aff6

…TLP_ENDPOINT.

Use a detached context for tracer provider shutdown.

1f2e20d

chrisberkhout requested a review from andrewkroh March 17, 2026 10:22

github-actions bot deployed to docs-preview March 17, 2026 10:22 View deployment

Change endpoint URL handling to match OTel spec.

0ef7a89

github-actions bot deployed to docs-preview March 17, 2026 10:57 View deployment

andrewkroh approved these changes Mar 17, 2026

View reviewed changes

belimawr reviewed Mar 17, 2026

View reviewed changes

Update x-pack/filebeat/otel/trace.go

116a0fc

Co-authored-by: Tiago Queiroz <github@queiroz.life>

github-actions bot had a problem deploying to docs-preview March 17, 2026 15:04 Failure

Apply suggestion from @belimawr

677e724

Co-authored-by: Tiago Queiroz <github@queiroz.life>

github-actions bot deployed to docs-preview March 17, 2026 15:05 View deployment

Update x-pack/filebeat/otel/trace_test.go

f0bd7ef

Co-authored-by: Tiago Queiroz <github@queiroz.life>

github-actions bot deployed to docs-preview March 17, 2026 15:11 View deployment

Revert "Update x-pack/filebeat/otel/trace.go"

ebfcce1

This reverts commit 116a0fc.

github-actions bot had a problem deploying to docs-preview March 17, 2026 15:33 Failure

Don't say 'found' if it's an error.

9808e12


11 participants
