Skip to content

A92: xDS ExtAuthz Support#481

Open
markdroth wants to merge 32 commits intogrpc:masterfrom
markdroth:xds_ext_authz
Open

A92: xDS ExtAuthz Support#481
markdroth wants to merge 32 commits intogrpc:masterfrom
markdroth:xds_ext_authz

Conversation

@markdroth
Copy link
Member

No description provided.

@markdroth markdroth marked this pull request as ready for review September 18, 2025 22:50
@markdroth markdroth requested a review from dfawley September 18, 2025 22:50
will be set to the local address of the connection that the request
came in on.
- [principal](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L82):
If TLS is used, this will be set to the server's cert's first URI SAN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tls package in Go does not provide a way to retrieve the local certificate used in the TLS handshake. We had filed an issue for this with the Go team ages ago, but not much progress there: golang/go#24673

If the configured channel credentials contain a certificate provider though, we would be able to retrieve the local certs from the provider. But the provider could return more than one cert (if the server is serving multiple domain names and expecting the client to specify an SNI during the TLS handshake). Do other languages have a mechanism to retrieve the actual cert used for the handshake?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In C-core, we control the TLS code, so I think we'll be able to handle this.

I'm not sure how hard this will be in Java and Go. If we can't support it, I'm open to dropping it from the gRFC.

I'd like input from @ejona86, @dfawley, @matthewstevenson88, and @gtcooke94 on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is open PR to support this in Go: golang/go#75699. The changes seem very straightforward, but we might need some pushing to get this through since we originally asked for this about 6 years ago.

- [disallow_all](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L70)
- [allow_expression](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L75)
- [disallow_expression](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L79)
- [disallow_is_error](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/common/mutation_rules/v3/mutation_rules.proto#L87)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The envoy docs say when this is true, and if the rules in this list cause a header mutation to be disallowed, then the filter using this configuration will terminate the request with a 500 error.

Does this mean Unavailable for us or Unknown?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the normal HTTP-to-gRPC status mapping, an HTTP 500 status would map to UNKNOWN.

I could also see an argument for using INTERNAL here. I don't feel strongly, but maybe @ejona86 or @dfawley have thoughts on this.


The following server-side metrics will be exported:

| Name | Type | Unit | Labels | Description |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to add the data plane RPC method name as a label to the server-side metrics? Or would that be considered a risk for cardinality explosion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that might make sense. But if we do that, then we probably also need to apply the same filtering as described in A66 for per-call metrics, so that we don't have a cardinality explosion for non-generated method names.

This makes me realize that these are really per-call metrics, but we'd be defining them via non-per-call metrics framework. @ctiller, are we okay with that for C-core performance-wise, given the work we're doing to move collection into C-core instead of in the stats plugins, or do we need to consider alternatives here?

@ejona86 ejona86 self-requested a review November 24, 2025 23:29
Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments are essentially all editorial/clarity. So I don't have concerns. I did notice grpc.client_ext_authz is an interesting namespace for the otel metrics, but I think I see why you did it. I didn't want to bikeshed over it.

- [method](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L115):
Will always be "POST".
- [header_map](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L143):
The filter will iterate through each header on the data plane RPC.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have headers at this point. Should we say "metadata entries" for clarity?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean by "we don't have headers at this point". The filter only sends the ext_authz request when it sees the request headers, so I'm not sure why we wouldn't have them yet,

I've attempted to clarify the wording here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After staring at it for a while, I figured out what I meant.

The filter is running before the transport, so we haven't actually computed HTTP/2 headers yet; we aren't going to necessarily have stuff like :path or grpc-timeout set. Instead, should we say this is just the application-provided grpc metadata? Or do we need to emulate certain headers here.

- [method](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L115):
Will always be "POST".
- [header_map](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L143):
The filter will iterate through each header on the data plane RPC.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using raw headers doesn't necessitate HeaderValue.raw_value. The idea is we'd use raw_value for -bin headers? Or we can just use whatever is most convenient (e.g., if an implementation already had it base64-encoded, it could just use that)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below for discussion about value vs. raw_value for -bin headers.

The base64 encoding question is a good one, though -- I hadn't thought about that. If we do always base64-encode the value for -bin headers, then that would probably eliminate the need for the raw_value field entirely. But we might not want the performance overhead of base64 encoding if we're using the true binary metadata extension (gRFC G1). Maybe we could say that implementations would use raw_value only when using that extension, and only for -bin headers?

Honestly, this is getting complicated enough that I almost wonder whether we should simply say that we don't support sending -bin headers at all.

Thoughts...?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd assume Envoy will always use value for -bin headers. So if gRPC uses raw_value, the ext_authz server will need to support both value and raw_value for -bin headers. That means it'd be fine to say "implementations are free to use whichever one they want, whenever they want," because it's no more effort for the receiver. If we want to make it easier for ext_authz servers, we'd probably need to avoid raw_value (again, assuming Envoy's behavior).

Or do we not care about Envoy's behavior? envoy.config.core.v3.HeaderValueOption is requiring raw_value in a way that Envoy probably isn't. Actually... I think we may be misunderstanding raw_value. I think this is to support obs-text (0x80-0xFF); I don't think it is the "treat this as base64 data, but we've not actually encoded it" semantics that we use for binary headers. If that is indeed the case, then we must base64 encode.

I don't really care. I just thought we should define the expectation. In Java the raw_value is actually faster because our API works with the raw bytes. That's probably true for some other languages.

The header name. The entry will be ignored if empty, if not all
lower-case, if length exceeds 16384, if the key is `host`, or if it
starts with `:`.
- [value](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/core/v3/base.proto#L415):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If value is used for a -bin header, what should be the behavior?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my current proposal is that the name of the header dictates which field we look at, so we would never look at the value field for a -bin header.

I'm open to other approaches here if we like something else better. For example, we could instead just say that we always look at both, and raw_value takes precedence over value if both are set. But then we have to define what happens if raw_value is used for a non--bin header and contains illegal characters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with almost anything; I just thought we need to define what should happen. Blindly using the field, even when empty, seems within reason.

The header name. The entry will be ignored if empty, if not all
lower-case, if length exceeds 16384, if the key is `host`, or if it
starts with `:`.
- [value](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/core/v3/base.proto#L415):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proto says only one of value and raw_value may be set. How should gRPC behave if that is not the case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above. With my current proposal, we wouldn't care; the name of the header would dictate which field we look at, so if the other one is also set, we simply wouldn't care. But I'm open to alternatives here if we don't like that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the idea is we won't enforce the proto's documentation. That can be fine.

Copy link
Member Author

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did notice grpc.client_ext_authz is an interesting namespace for the otel metrics, but I think I see why you did it.

I wanted to phrase this in a way that made it clear that the metrics are talking about ext_authz on the gRPC client vs. gRPC server side, as opposed to the client and server of the ext_authz communication itself. (This filter will always be the client side for the ext_authz communication, even if the filter is running on the gRPC server side.)

- [method](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L115):
Will always be "POST".
- [header_map](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L143):
The filter will iterate through each header on the data plane RPC.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what you mean by "we don't have headers at this point". The filter only sends the ext_authz request when it sees the request headers, so I'm not sure why we wouldn't have them yet,

I've attempted to clarify the wording here.

- [method](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L115):
Will always be "POST".
- [header_map](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/service/auth/v3/attribute_context.proto#L143):
The filter will iterate through each header on the data plane RPC.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below for discussion about value vs. raw_value for -bin headers.

The base64 encoding question is a good one, though -- I hadn't thought about that. If we do always base64-encode the value for -bin headers, then that would probably eliminate the need for the raw_value field entirely. But we might not want the performance overhead of base64 encoding if we're using the true binary metadata extension (gRFC G1). Maybe we could say that implementations would use raw_value only when using that extension, and only for -bin headers?

Honestly, this is getting complicated enough that I almost wonder whether we should simply say that we don't support sending -bin headers at all.

Thoughts...?

The header name. The entry will be ignored if empty, if not all
lower-case, if length exceeds 16384, if the key is `host`, or if it
starts with `:`.
- [value](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/core/v3/base.proto#L415):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my current proposal is that the name of the header dictates which field we look at, so we would never look at the value field for a -bin header.

I'm open to other approaches here if we like something else better. For example, we could instead just say that we always look at both, and raw_value takes precedence over value if both are set. But then we have to define what happens if raw_value is used for a non--bin header and contains illegal characters.

The header name. The entry will be ignored if empty, if not all
lower-case, if length exceeds 16384, if the key is `host`, or if it
starts with `:`.
- [value](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/config/core/v3/base.proto#L415):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above. With my current proposal, we wouldn't care; the name of the header would dictate which field we look at, so if the other one is also set, we simply wouldn't care. But I'm open to alternatives here if we don't like that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants