
feat: support user identity forwarding to tasks via TaskAuthContext #19236

Open
jtuglu1 wants to merge 1 commit into apache:master from jtuglu1:task-auth-context

Conversation

@jtuglu1
Contributor

@jtuglu1 jtuglu1 commented Mar 30, 2026

Closes #18957.

Description

Create a mechanism to propagate user identity context from the authenticator to tasks. This gives each task a per-task identity, letting it dynamically access information that would otherwise be hard to set up in the current state of things. It enables auth mechanisms like catalog credential vending (through the Iceberg catalog, etc.) and lets Druid operate more like a standalone engine that can integrate with other systems in a larger data ecosystem.

Design

Constraints

This currently does NOT support task restarts.

Release note

Create a mechanism to propagate user identity context from authenticator to tasks to support things like credential vending, etc.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@jtuglu1 jtuglu1 changed the title Support user identity forwarding to tasks via TaskAuthContext feat: support user identity forwarding to tasks via TaskAuthContext Mar 30, 2026
@github-actions github-actions bot added Area - Batch Ingestion Area - Dependencies Area - Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Mar 30, 2026
@jtuglu1 jtuglu1 force-pushed the task-auth-context branch from f68a5ba to 3255f0b on March 30, 2026 at 23:31
@jtuglu1 jtuglu1 force-pushed the task-auth-context branch 3 times, most recently from f2012ec to 99c006c on March 31, 2026 at 02:13
```java
 * @return TaskAuthContext to inject into the task, or null to skip injection
 */
@Nullable
TaskAuthContext createTaskAuthContext(AuthenticationResult authenticationResult, Task task);
```

Check notice (Code scanning / CodeQL): Useless parameter. The parameter 'authenticationResult' is never used.
Check notice (Code scanning / CodeQL): Useless parameter. The parameter 'task' is never used.
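For orientation, here is a minimal, self-contained sketch of how this extension point might be used. The `AuthenticationResult` and `Task` types below are simplified stand-ins (not Druid's real classes); only the `createTaskAuthContext` shape comes from the snippet above.

```java
public class TaskAuthContextSketch
{
  // Stand-in for Druid's AuthenticationResult (simplified).
  record AuthenticationResult(String identity) {}

  // Stand-in for Druid's Task (simplified).
  record Task(String id) {}

  interface TaskAuthContext
  {
    String getIdentity();
  }

  interface TaskAuthContextProvider
  {
    // Returns a TaskAuthContext to inject into the task, or null to skip injection.
    TaskAuthContext createTaskAuthContext(AuthenticationResult authenticationResult, Task task);
  }

  public static void main(String[] args)
  {
    // The simplest possible provider: forward the authenticated identity as-is.
    TaskAuthContextProvider provider = (authResult, task) -> {
      if (authResult == null) {
        return null; // null means "skip injection" per the Javadoc above
      }
      return authResult::identity;
    };

    AuthenticationResult auth = new AuthenticationResult("alice");
    TaskAuthContext ctx = provider.createTaskAuthContext(auth, new Task("index_parallel_foo"));
    System.out.println(ctx.getIdentity()); // prints "alice"
  }
}
```

A real provider would presumably consult the authentication result's context map rather than just the identity string, which is the substance of the review discussion below.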
@jtuglu1 jtuglu1 force-pushed the task-auth-context branch 4 times, most recently from 1f4e9ff to 09f57a7 on April 1, 2026 at 03:27
@jtuglu1 jtuglu1 added this to the 37.0.0 milestone Apr 1, 2026
@jtuglu1 jtuglu1 force-pushed the task-auth-context branch from 33be042 to 81fae04 on April 2, 2026 at 01:35
@jtuglu1 jtuglu1 force-pushed the task-auth-context branch from 81fae04 to 6ac3139 on April 6, 2026 at 18:25
@cecemei
Contributor

cecemei commented Apr 7, 2026

@jtuglu1 what's the status of this PR? Should we move this to 38?

@jtuglu1
Contributor Author

jtuglu1 commented Apr 7, 2026

@jtuglu1 what's the status of this PR? Should we move this to 38?

This one I wanted to get into v37 – I was hoping to get this merged today.

@jtuglu1 jtuglu1 force-pushed the task-auth-context branch from 6ac3139 to 74d8369 on April 7, 2026 at 18:35
Contributor

@gianm gianm left a comment


I reviewed the core changes only, focusing on trying to understand the security properties of the changes.

I wonder what your thoughts are on an alternate design that keeps the vended credentials inside the input source itself:

  • Add a new method to InputSource like scopeForUser(AuthenticationResult authResult). Default implementation is return this
  • Whenever a task is submitted, call scopeForUser on all of its input sources at whichever service initially accepts the task (either Broker [for SQL DML] or Overlord [for anything else]).
  • The IcebergInputSource would implement scopeForUser to fetch the vended credentials and transform itself into an input source that bakes in the vended credentials. It would use a PasswordProvider so it is redactable.

The idea would be to avoid the need for a new credential vending system in core, putting most of the changes inside the Iceberg extension instead.
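The proposal above could be sketched roughly as follows. All types here are simplified stand-ins for Druid's `InputSource` and `AuthenticationResult`, and the catalog call is faked; it only illustrates the "transform itself at submit time" shape.

```java
public class ScopeForUserSketch
{
  // Stand-in for Druid's AuthenticationResult (simplified).
  record AuthenticationResult(String identity) {}

  interface InputSource
  {
    // Default: the input source is not user-scoped; return itself unchanged.
    default InputSource scopeForUser(AuthenticationResult authResult)
    {
      return this;
    }
  }

  // An input source that bakes vended credentials in when scoped.
  static final class IcebergLikeInputSource implements InputSource
  {
    final String vendedToken; // would be a PasswordProvider in Druid, so it is redactable

    IcebergLikeInputSource(String vendedToken)
    {
      this.vendedToken = vendedToken;
    }

    @Override
    public InputSource scopeForUser(AuthenticationResult authResult)
    {
      // In the real design this would call the Iceberg catalog to vend
      // credentials for authResult.identity(); here we fake the exchange.
      return new IcebergLikeInputSource("token-for-" + authResult.identity());
    }
  }

  public static void main(String[] args)
  {
    InputSource src = new IcebergLikeInputSource(null);
    IcebergLikeInputSource scoped =
        (IcebergLikeInputSource) src.scopeForUser(new AuthenticationResult("alice"));
    System.out.println(scoped.vendedToken); // prints "token-for-alice"
  }
}
```

The design choice being weighed here is vend-at-submit (credentials fixed when the task is accepted) versus the PR's vend-at-runtime approach, which the author addresses in the reply further down.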


```java
/**
 * Returns sensitive credentials (e.g., OAuth tokens, API keys).
 * This method MUST be redacted during serialization via {@link TaskAuthContextRedactionMixIn}.
```
Contributor


This is over-eager, isn't it? The credentials must be serialized in some cases. When we send a task spec from Overlord to the server that will actually run the task, the credentials must be included. In some cases the nonredacted task file will need to end up on disk, so it can be run. Perhaps this should say "MUST be redacted when written to log files, the metadata store, or returned from user-facing APIs".

Contributor Author


Yeah, the comment here is a bit aggressive, but it describes what is currently done. The field is not redacted when being submitted from the Overlord to the MM/Peon/Indexer; it is only redacted on logging and on persistence (to disk or the DB).
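For readers unfamiliar with the mechanism, here is a hedged sketch of how mix-in-based redaction typically works with Jackson, which is what `TaskAuthContextRedactionMixIn` refers to. The `AuthContext` class and the two-mapper split are assumptions for illustration; only the Jackson calls are real API.

```java
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.databind.ObjectMapper;

public class RedactionMixInSketch
{
  public static class AuthContext
  {
    public String identity = "alice";
    public String credentials = "secret-token";
  }

  // Mix-in applied only to the "redacting" mapper (logs, metadata store,
  // user-facing APIs); the mapper used for Overlord -> MM/Peon transport
  // would omit it so the credentials survive that hop.
  abstract static class RedactionMixIn
  {
    @JsonIgnore
    public String credentials;
  }

  public static void main(String[] args) throws Exception
  {
    ObjectMapper transportMapper = new ObjectMapper();
    ObjectMapper redactingMapper = new ObjectMapper()
        .addMixIn(AuthContext.class, RedactionMixIn.class);

    // Transport serialization keeps the secret; redacting serialization drops it.
    System.out.println(transportMapper.writeValueAsString(new AuthContext()));
    System.out.println(redactingMapper.writeValueAsString(new AuthContext()));
  }
}
```

Keeping two mappers (rather than redacting unconditionally) matches the distinction the reviewer is drawing: redaction belongs at logging/persistence/API boundaries, not on the internal task-distribution path.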

```java
 *
 * <p><b>Credential lifetime:</b> Vended credentials (OAuth tokens, STS session tokens) have
 * limited lifetimes, typically 1-12 hours. There is currently no credential refresh mechanism,
 * so long-running tasks may fail when credentials expire. Task restarts are also not supported
```
Contributor


Any thoughts on how to handle long-running tasks? I expect it will be a problem. It's not uncommon for tasks to take hours.

Contributor Author

@jtuglu1 jtuglu1 Apr 8, 2026


I was going to add this in a separate change. Using a task auth context, inputSource could conceivably just use that with some implementable revend() method. The one issue I'd want to avoid is having the subtasks refresh the credentials themselves (rather than the driver task), since that could cause a thundering-herd effect where many subtasks all attempt to refresh at once. Spark handles this better by refreshing credentials on the driver and then propagating them to the executors.
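The driver-side refresh idea mentioned above could look something like this sketch: only the driver calls the (hypothetical, not-in-this-PR) `revend()` method, and subtasks read the latest value, so there is a single refresher rather than a herd.

```java
import java.util.concurrent.atomic.AtomicReference;

public class DriverRefreshSketch
{
  interface CredentialVendor
  {
    // e.g. call the Iceberg catalog for fresh S3 credentials; illustrative name.
    String revend();
  }

  static final class DriverCredentialHolder
  {
    private final AtomicReference<String> current = new AtomicReference<>();
    private final CredentialVendor vendor;

    DriverCredentialHolder(CredentialVendor vendor)
    {
      this.vendor = vendor;
      this.current.set(vendor.revend());
    }

    // Called on a timer by the driver task only; subtasks never call revend().
    void refresh()
    {
      current.set(vendor.revend());
    }

    // Subtasks (or the worker messages sent to them) read the latest credentials.
    String currentCredentials()
    {
      return current.get();
    }
  }

  public static void main(String[] args)
  {
    int[] vendCalls = {0};
    DriverCredentialHolder holder =
        new DriverCredentialHolder(() -> "cred-" + (++vendCalls[0]));

    String before = holder.currentCredentials();
    holder.refresh(); // one refresh on the driver, visible to all readers
    String after = holder.currentCredentials();
    System.out.println(before + " " + after + " " + vendCalls[0]);
    // prints "cred-1 cred-2 2"
  }
}
```

In a distributed setting the refreshed value would still need to be pushed to subtask workers (as Spark does driver-to-executor), which this in-process sketch glosses over.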

```java
 *
 * @return the identity string
 */
String getIdentity();
```
Contributor


The auth context is a @JsonProperty so I believe that means people can set it explicitly when they submit tasks. Does anything bad happen if someone sets a context and sets the identity to someone else? What do you think about clearing the user-provided auth context, if any, in OverlordResource?

Contributor Author


Does anything bad happen if someone sets a context and sets the identity to someone else?

If we want to scope the auth context to Druid-settable only, then yes, we should. IMO that's the safer option, but I could see people wanting to use this apart from the authorization result.


```java
FutureUtils.getUnchecked(overlordClient.runTask(taskId, controllerTask), true);
// Propagate auth context headers to Overlord for consumption
if (plannerContext.getAuthenticationResult() != null && plannerContext.getAuthenticationResult().getContext() != null) {
```
Contributor


What is the purpose of this? Authentication context is meant to be an arbitrary extension-defined in-memory map. It may not in general take well to being stuffed into a header. It may also include sensitive information that shouldn't be sent in a header.

Contributor Author

@jtuglu1 jtuglu1 Apr 8, 2026


We need a way to propagate AuthenticationResult from the broker to the overlord. I could make an overridable serialization method/class for AuthenticationResult to ensure we only pass-through what's needed (and let the user specify that), but propagating the headers here was the least intrusive approach for that change. We still need to pass the authentication result somehow.
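The "overridable serialization" idea mentioned in this reply might look like the following sketch: forward only an operator-whitelisted subset of the authentication context as headers, since (per the reviewer's point) arbitrary extension-defined context values may be sensitive or not even strings. The header prefix and whitelist are illustrative assumptions, not part of this PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class AuthHeaderForwardingSketch
{
  static final String HEADER_PREFIX = "X-Druid-Auth-Ctx-"; // illustrative name

  // Only forward keys the operator has explicitly allowed, and only string values.
  static Map<String, String> toHeaders(Map<String, Object> authContext, Set<String> allowed)
  {
    Map<String, String> headers = new HashMap<>();
    for (Map.Entry<String, Object> e : authContext.entrySet()) {
      if (allowed.contains(e.getKey()) && e.getValue() instanceof String s) {
        headers.put(HEADER_PREFIX + e.getKey(), s);
      }
    }
    return headers;
  }

  public static void main(String[] args)
  {
    Map<String, Object> ctx = Map.of("identityToken", "tok-123", "internalSecret", "shh");
    Map<String, String> headers = toHeaders(ctx, Set.of("identityToken"));
    // Only the whitelisted key crosses the Broker -> Overlord hop.
    System.out.println(headers);
  }
}
```

This keeps the pass-through explicit: anything not whitelisted stays in the Broker's in-memory context and never reaches a header.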


```java
// Inject auth context if provider is configured
if (taskAuthContextProvider != null) {
  final AuthenticationResult authenticationResult = AuthorizationUtils.authenticationResultFromRequest(req);
```
Contributor


How will this work in the SQL DML path, where the user submits a task to /druid/v2/sql/task/ and the Broker then submits the task using its own credentials? The current design is that the Broker authenticates the user, authorizes the DML operation, and then submits it to the Overlord using a service account (not the user's own credentials).

IMO it would make more sense in this case for the Broker to get the vended credentials and pass them along to the Overlord.

Contributor Author


How will this work in the SQL DML path, where the user submits a task to /druid/v2/sql/task/ and the Broker then submits the task using its own credentials?

That's the thing. In our implementation, the broker passes through the user auth context rather than its own credentials, so the Broker credentials aren't used in the task payload (they're only used for validating that the request came from the Broker).

```java
 */
@ExtensionPoint
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
public interface TaskAuthContext
```
Contributor


I didn't see an implementation of TaskAuthContext or TaskAuthContextProvider in this PR. Are they meant to be added later? What would they look like? I was wondering in particular what taskAuthContextProvider.createTaskAuthContext would do exactly with the authenticationResult that is passed in.

Contributor Author


They are meant for users to implement. I have internal versions of these classes which take our internal identity tokens and propagate them through to the Iceberg input source, to be then used for vending S3 credentials to read data from Iceberg.

@jtuglu1
Contributor Author

jtuglu1 commented Apr 8, 2026

I reviewed the core changes only, focusing on trying to understand the security properties of the changes.

I wonder what your thoughts are on an alternate design that keeps the vended credentials inside the input source itself:

  • Add a new method to InputSource like scopeForUser(AuthenticationResult authResult). Default implementation is return this
  • Whenever a task is submitted, call scopeForUser on all of its input sources at whichever service initially accepts the task (either Broker [for SQL DML] or Overlord [for anything else]).
  • The IcebergInputSource would implement scopeForUser to fetch the vended credentials and transform itself into an input source that bakes in the vended credentials. It would use a PasswordProvider so it is redactable.

The idea would be to avoid the need for a new credential vending system in core, putting most of the changes inside the Iceberg extension instead.

Are you proposing pushing the vending of credentials using an identity to the broker/overlord prior to task submission? I'd ideally like to propagate the auth context to the task and have it vend the credentials at runtime, not at submit time.

Currently, the way this works is:

  • We extract user identity
  • We pass to an Iceberg catalog client which uses this identity to then vend S3 credentials
  • These S3 credentials are then attached to the subtasks that spawn from the driver task
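The three steps above can be sketched end to end with simplified stand-in types: extract the identity, exchange it for S3-style credentials via a catalog-like client, and attach the result to each spawned subtask. The catalog client and its `vendCredentials` method are illustrative, not real Iceberg API.

```java
import java.util.List;
import java.util.stream.Collectors;

public class VendingFlowSketch
{
  record Credentials(String accessKey) {}

  record SubTask(String id, Credentials creds) {}

  interface CatalogClient
  {
    // e.g. Iceberg REST catalog credential vending; illustrative signature.
    Credentials vendCredentials(String identity);
  }

  static List<SubTask> spawnSubTasks(String identity, CatalogClient catalog, List<String> ids)
  {
    // Step 1: the identity was extracted by the authenticator upstream.
    // Step 2: the driver exchanges it for scoped storage credentials.
    Credentials creds = catalog.vendCredentials(identity);
    // Step 3: attach the vended credentials to each spawned subtask.
    return ids.stream().map(id -> new SubTask(id, creds)).collect(Collectors.toList());
  }

  public static void main(String[] args)
  {
    CatalogClient catalog = identity -> new Credentials("key-for-" + identity);
    List<SubTask> tasks = spawnSubTasks("alice", catalog, List.of("sub0", "sub1"));
    System.out.println(tasks.get(0).creds().accessKey() + " " + tasks.size());
    // prints "key-for-alice 2"
  }
}
```

Note that vending happens once on the driver and the resulting credentials fan out to subtasks, which is also why the credential-expiry discussion above centers on the driver rather than the subtasks.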

@jtuglu1 jtuglu1 requested review from abhishekrb19 and gianm April 8, 2026 18:27


Successfully merging this pull request may close these issues.

User authentication propagation to ingestion tasks and credential vending

4 participants