[APPSEC-61865] (WIP) Add inferred span and AppSec calls for API Gateway#136
[APPSEC-61865] (WIP) Add inferred span and AppSec calls for API Gateway#136
Conversation
Remove all AppSec lifecycle management (context creation, Event.record, export_metrics, Context.deactivate) — now owned by dd-trace-rb watcher. Listener becomes a dumb data provider: just pushes raw event/response hashes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lambda::AppSec owns full context lifecycle (create, activate, record, export, deactivate). Lambda::Trace::InferredSpan owns span creation and finishing. Listener becomes thin orchestrator. AppSec context now targets the inferred span (service-entry), so tags land there directly — tag propagation hack removed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove [DEBUG:inject_appsec] debug logging from inject_appsec_data - Restore removed comments in send_end_invocation_request - Revert unnecessary ::Datadog:: namespace prefixes in lambda.rb Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…vices check F29: Move `instrument(:aws_lambda)` from per-invocation `AppSec.on_start` to `configure_apm` after yield block. The `instrument` method is a no-op when appsec is disabled (checks `enabled` internally), so no guard needed. Matches Python's pattern of one-time initialization. F30: Reuse `Datadog::Lambda.trace_managed_services?` in InferredSpan instead of duplicating the ENV reading logic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
F31: 44 new examples covering: - AppSec.on_start: enabled/disabled, context creation, gateway push, fallback to active trace/span, missing security engine, error handling - AppSec.on_finish: no active context, gateway push, Event.record, metrics/telemetry export, context deactivation, deactivation on error - InferredSpan.create: managed services disabled, non-hash events, missing requestContext/stage/httpMethod, v1 span name/tags/resource_key, v2 span name/tags/resource_key, trace_digest continuation, empty domain, error handling - InferredSpan.finish: nil span, statusCode tagging, non-hash response, nil response, error handling Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
F25: `Datadog::Tracing.trace()` returns a SpanOperation, not a TraceOperation. Rename `@trace` to `@span` in Listener to reflect the actual type. Remove stale class-level `@trace = nil`. AppSec.on_start now always uses `Datadog::Tracing.active_trace` for the TraceOperation (matching Rack's RequestMiddleware pattern) and receives only the target span (inferred span) as an argument, with fallback to `active_span`. F26: Pass `@inferred_span || @span` to `inject_appsec_data` so the extension reads tags from the span where AppSec actually sets them. F32: Variable renamed from `@trace` to `@span`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Rewrite specs to test behavior instead of internals * Remove obsolete appsec files
Split the god method into ApiGatewayV1/V2 parser classes with a uniform interface, and rewire InferredSpan.create as a PARSERS dispatcher + build_span builder. Existing behavior tests pass unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
http.status_code on the inferred span is not in the RFC and not checked by system tests. Without it, finish() just delegates to span.finish — the listener now calls @inferred_span&.finish directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| def domain = @request_context.fetch('domainName', '') | ||
| def api_id = @request_context.fetch('apiId', '') | ||
| def stage = @request_context.fetch('stage', '') | ||
| def request_time_ms = @request_context['requestTimeEpoch'] |
| def self.match?(payload) | ||
| api_gateway?(payload) && payload.key?('httpMethod') | ||
| end | ||
|
|
||
| private_class_method def self.api_gateway?(payload) | ||
| payload.is_a?(Hash) && | ||
| payload.key?('requestContext') && payload['requestContext'].key?('stage') | ||
| end |
There was a problem hiding this comment.
Makes sense to do class << self style
lib/datadog/lambda/trace/listener.rb
Outdated
| @span = nil | ||
| @inferred_span = nil |
There was a problem hiding this comment.
Is it needed within instance?
lib/datadog/lambda/trace/listener.rb
Outdated
| Datadog::Utils.send_end_invocation_request( | ||
| response:, span_id: @span.id, request_context: | ||
| ) |
There was a problem hiding this comment.
Single-line is better, and we can reorder empty kwargs to be last
lib/datadog/lambda/inferred_span.rb
Outdated
| def parser_for(event) | ||
| klass = PARSERS.find { |parser| parser.match?(event) } | ||
| klass&.new(event) | ||
| end |
There was a problem hiding this comment.
Irresponsible method with no benefits
lib/datadog/lambda/inferred_span.rb
Outdated
| } | ||
|
|
||
| arn = request_context.invoked_function_arn.to_s | ||
| region = arn.split(':')[3] if arn.include?(':') |
There was a problem hiding this comment.
No limit on split and magic number 3
lib/datadog/lambda/inferred_span.rb
Outdated
| private | ||
|
|
||
| def parser_for(event) | ||
| klass = PARSERS.find { |parser| parser.match?(event) } |
There was a problem hiding this comment.
We call it parsers here, but in fact it's value object style, and it's better to word it that way in code and docs as they provide object oriented access to the AWS event payload
lib/datadog/lambda/inferred_span.rb
Outdated
| def managed_services_enabled? | ||
| Datadog::Lambda.trace_managed_services? | ||
| end |
Removes hidden dependency on Datadog::Tracing.active_trace and active_span — the listener now passes both explicitly. Span selection policy (prefer inferred span over lambda span) moves to the caller. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
DD_TRACE_MANAGED_SERVICES controls outbound AWS SDK call tracing, not inbound inferred spans. The guard was added by mistake during extraction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| def span_name = 'aws.httpapi' | ||
| def method = @http['method'] | ||
| def path = @payload.fetch('rawPath', '/') | ||
| def resource_path = @payload['routeKey']&.sub(/^[A-Z]+ /, '') || path |
There was a problem hiding this comment.
- Does it makes sense to use
\Ainstead of^(security) - What this sub here is for?
|
|
||
| # Remove Parent ID if it is the same as the Span ID | ||
| request.delete(DD_PARENT_ID_HEADER) if request[DD_PARENT_ID_HEADER] == span_id.to_s | ||
|
|
| return unless context | ||
|
|
||
| Datadog::AppSec::Instrumentation.gateway.push('aws_lambda.response.start', response) | ||
| Datadog::AppSec::Event.record(context, request: context.state[:request]) |
There was a problem hiding this comment.
- Who creates this
requestin the code? - Seems tests are wrong doing here and over-stubbing
lib/datadog/lambda/inferred_span.rb
Outdated
| klass&.new(event) | ||
| end | ||
|
|
||
| def build_span(parser, request_context, trace_digest) |
There was a problem hiding this comment.
Method is massive now, and signature could use some help of kwargs for secondary arguments
lib/datadog/lambda/inferred_span.rb
Outdated
| tags: tags, | ||
| } | ||
| options[:continue_from] = trace_digest if trace_digest | ||
| options[:start_time] = Time.at(parser.request_time_ms / 1000.0) if parser.request_time_ms |
There was a problem hiding this comment.
Feels like small method and a constant
|
|
||
| yield(c) if block_given? | ||
|
|
||
| c.appsec.instrument(:aws_lambda) |
There was a problem hiding this comment.
Makes sense to add comment about instrumentation guard on enabled from AppSec settings class and no auto instrumentation
| { | ||
| 'httpMethod' => 'POST', | ||
| 'requestContext' => {'stage' => 'dev'}, | ||
| } |
| { | ||
| 'routeKey' => 'POST /data', | ||
| 'requestContext' => {'stage' => 'dev'}, | ||
| } |
| parent_id: '797643193680388254' | ||
| } | ||
| end | ||
| let(:ctx) { LambdaContextVersion.new } |
There was a problem hiding this comment.
what is needed from this test? Can we write a clean separate test with good standards?
| allow(Datadog::AppSec).to receive(:enabled?).and_return(true) | ||
| allow(Datadog::AppSec).to receive(:security_engine).and_return(security_engine) | ||
| allow(Datadog::AppSec::Context).to receive(:new).and_return(appsec_context) | ||
| allow(Datadog::AppSec::Context).to receive(:activate) | ||
| allow(Datadog::AppSec::Context).to receive(:active).and_return(appsec_context) |
There was a problem hiding this comment.
So many stubs, are all of them needed?
test/datadog/lambda/appsec.spec.rb
Outdated
|
|
||
| before { allow(Datadog::AppSec::Context).to receive(:active).and_return(nil) } | ||
|
|
||
| it 'does not activate or push' do |
There was a problem hiding this comment.
what doesn't activate, trace?
test/datadog/lambda/appsec.spec.rb
Outdated
| it 'does not activate or push' do | ||
| on_start | ||
|
|
||
| aggregate_failures do |
| end | ||
|
|
||
| context 'when context has a request in state' do | ||
| before { appsec_context.state[:request] = request_data } |
There was a problem hiding this comment.
This stubs stuff, but we miss actual push on the gateway because request data is never given in actual code
| describe '.create' do | ||
| subject(:created_span) { described_class.create(event, request_context, trace_digest) } | ||
|
|
||
| after { created_span&.finish unless created_span&.finished? } |
There was a problem hiding this comment.
Why do we do that here, we are not coding, we controlling environment, we should not have to do safe-navigation in tests at all
|
|
||
| describe Datadog::Lambda::InferredSpan do | ||
| let(:request_context) { LambdaContextVersion.new } | ||
| let(:trace_digest) { nil } |
| end | ||
|
|
||
| context 'when event has no httpMethod or routeKey' do | ||
| let(:event) { {'requestContext' => {'stage' => 'prod'}} } |
There was a problem hiding this comment.
should use let/subject instead, hard to read test
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nstants Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…in continue_from test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
80f3ad5 to
eee4562
Compare
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
What does this PR do?
Compute inferred span for AWS Lambda that suppose to be a service-entry span. In addition - initialize AppSec component (if possible), push events to it and flush captured security events.
Motivation
Part of Endpoint Discovery & Correlation from Inferred Spans RFC
Testing Guidelines
Additional Notes
I would need some assistance to get more context over testing.
Types of changes