fix: harden ingesting autoscalers around task-count boundaries#19269

Open
Fly-Style wants to merge 3 commits into apache:master from Fly-Style:lag-based-autoscaler

Conversation

@Fly-Style Fly-Style commented Apr 7, 2026

This PR:

  • fixes the lag-based autoscaler by using taskCount from ioConfig, instead of activeTaskGroups, for scale action calculations;
  • hardens seekable-stream autoscalers when a supervisor is configured with a manually set taskCount outside the allowed bounds. For both cost-based and lag-based autoscalers, if the current taskCount is below taskCountMin or above taskCountMax, the scaler now returns the nearest valid boundary instead of using the out-of-range value as the scaling baseline. This keeps supervisors within configured limits and avoids inconsistent scaling decisions.
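
The boundary behavior described above can be illustrated with a minimal sketch. This is not the code from this PR: the class `TaskCountBounds` and the method `clampToBounds` are hypothetical names chosen for illustration, and only the parameter names mirror the `taskCount`, `taskCountMin`, and `taskCountMax` settings mentioned in the description.

```java
// Hypothetical helper for illustration only; not part of the Druid codebase.
final class TaskCountBounds {
    private TaskCountBounds() {}

    /**
     * Returns the task count to use as the scaling baseline.
     * A value below taskCountMin maps to taskCountMin, a value above
     * taskCountMax maps to taskCountMax, and an in-range value is unchanged.
     */
    static int clampToBounds(int currentTaskCount, int taskCountMin, int taskCountMax) {
        if (taskCountMin > taskCountMax) {
            // Assumption: a misconfigured range is rejected rather than silently repaired.
            throw new IllegalArgumentException("taskCountMin must not exceed taskCountMax");
        }
        return Math.max(taskCountMin, Math.min(taskCountMax, currentTaskCount));
    }
}
```

Under these assumptions, a supervisor configured with a taskCount of 12 and bounds of 2 and 8 would be treated as running 8 tasks, so later scaling decisions start from a value inside the allowed range.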

This PR has:

  • been self-reviewed.
  • a release note entry in the PR description.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.

@Fly-Style Fly-Style changed the title from "bug: lag-based autoscaler: use taskCount from ioConfig for scale action instead of activeTaskGroups" to "bug: lag-based autoscaler: use taskCount from ioConfig for scale action instead of activeTaskGroups" Apr 7, 2026
@Fly-Style Fly-Style changed the title from "bug: lag-based autoscaler: use taskCount from ioConfig for scale action instead of activeTaskGroups" to "fix: lag-based autoscaler: use taskCount from ioConfig for scale action instead of activeTaskGroups" Apr 7, 2026
@Fly-Style Fly-Style force-pushed the lag-based-autoscaler branch from 4977473 to 746c760 on April 7, 2026 12:36
@Fly-Style Fly-Style requested a review from jtuglu1 April 7, 2026 13:16

@amaechler amaechler left a comment

🐻

@Fly-Style

This comment was marked as outdated.

@Fly-Style Fly-Style changed the title from "fix: lag-based autoscaler: use taskCount from ioConfig for scale action instead of activeTaskGroups" to "fix: harden ingesting autoscalers around task-count boundaries" Apr 8, 2026
@Fly-Style

cc @zhangyue19921010
