Add blog post on AKS Configurable Scheduler Profiles#5505
colinmixonn wants to merge 93 commits into master
This blog post introduces AKS Configurable Scheduler Profiles, highlighting their benefits for optimizing resource utilization and improving scheduling strategies for web-distributed and AI workloads. It covers configuration examples for GPU utilization, pod distribution across topology domains, and memory-optimized scheduling.
Added a new tag for Scheduler with relevant details.
Updated blog post on AKS Configurable Scheduler Profiles to improve clarity and correctness, including sections on GPU utilization, pod distribution, and memory-optimized scheduling.
Corrected typos and improved clarity in the blog post about AKS Configurable Scheduler Profiles.
Updated the blog to clarify the objectives of configuring AKS Configurable Scheduler Profiles, improved section titles, and ensured consistency in terminology.
Clarified the objectives and improved the wording in the blog post about AKS Configurable Scheduler Profiles.
Pull request overview
This pull request adds a new blog post announcing the preview of AKS Configurable Scheduler Profiles, a feature that enables fine-grained control over pod scheduling strategies to optimize resource utilization and improve workload performance.
Key Changes
- Introduces a new "scheduler" tag to categorize blog posts related to pod placement and scheduling optimization
- Adds comprehensive blog post covering three main scheduling use cases: GPU bin-packing for AI workloads, pod distribution across topology domains for resilience, and memory-optimized scheduling with PVC-aware placement
- Provides YAML configuration examples and best practices for implementing custom scheduler profiles
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| website/blog/tags.yml | Adds new "scheduler" tag for categorizing posts about pod placement and scheduling techniques |
| website/blog/2025-12-16-aks-config-scheduler-profiles-preview/index.md | New blog post introducing AKS Configurable Scheduler Profiles with configuration examples for GPU utilization, topology distribution, and memory-optimized scheduling |
…index.md Co-authored-by: Diego Casati <diego.casati@gmail.com>
…index.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…index.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…index.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…index.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    apiVersion: aks.azure.com/v1alpha1
    kind: SchedulerConfiguration
    metadata:
      name: upstream
This example uses metadata.name: upstream. If readers apply multiple examples, they'll overwrite the same SchedulerConfiguration object. Use a profile-specific resource name (or call out that the name must be unique per cluster/namespace).
Suggested change:
-   name: upstream
+   name: gpu-node-binpacking-scheduler-config
…configurable-scheduler-binpack-profile.png
    This blog provides examples of three different scheduler profiles and details the benefits of each to increase node utilization for AKS clusters:

    1. [How to increase AKS cluster GPU utilization](#increase-aks-cluster-gpu-utilization)
    2. [How to increase AKS cluster CPU utilization](#increase-aks-cluster-cpu-utilization)
This says there are "three different scheduler profiles," but only two examples are listed. Either add the third example/section or update the wording to match the actual content.
Updated the blog to reflect the change from three to two scheduler profiles and added an FAQ section addressing common questions about the configurable scheduler profiles.
Expanded the introduction to Configurable Scheduler Profiles on AKS, detailing its benefits and providing examples of two different scheduler profiles to increase node utilization.
Removed redundant sentence and improved clarity in the introduction.
Removed introductory phrase from note about resource weights and parameters.
    apiVersion: aks.azure.com/v1alpha1
    kind: SchedulerConfiguration
    metadata:
      name: upstream
    spec:
      rawConfig: |
This example also uses metadata.name: upstream, which would conflict with the earlier example’s resource name. Please use a unique name here as well (or clarify that the earlier resource should be replaced).
    [concepts-scheduler-configuration]: https://learn.microsoft.com/azure/aks/concepts-scheduler-configuration
    [kueue-overview]: https://learn.microsoft.com/azure/aks/kueue-overview
    [best-practices-advanced-scheduler]: https://learn.microsoft.com/azure/aks/operator-best-practices-advanced-scheduler
    [scheduling-framework/#interfaces]: https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/#interfaces
    [supported-in-tree-scheduling-plugins]: https://learn.microsoft.com/azure/aks/concepts-scheduler-configuration#supported-in-tree-scheduling-plugins
The reference label [scheduling-framework/#interfaces] includes / and #, which can be brittle across Markdown processors. Consider renaming the reference label to a simpler identifier (for example, scheduling-framework-interfaces) while keeping the same URL target.
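A check like the one this comment describes could be automated. The following is a minimal sketch, assuming that a "simple" label means letters, digits, and hyphens only; that rule and the helper name `brittle_reference_labels` are illustrative assumptions, not part of any Markdown specification or of this repository's tooling.

```python
import re

# Matches a Markdown reference definition such as "[label]: https://example.com".
REF_DEF_RE = re.compile(r"^\[([^\]]+)\]:[ \t]*(\S+)", re.MULTILINE)

# Assumption: a "simple" label uses only letters, digits, and hyphens.
SIMPLE_LABEL_RE = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]*$")


def brittle_reference_labels(markdown):
    """Return reference labels that contain characters outside the simple set."""
    return [
        label
        for label, _url in REF_DEF_RE.findall(markdown)
        if not SIMPLE_LABEL_RE.match(label)
    ]


sample = """\
[concepts-scheduler-configuration]: https://learn.microsoft.com/azure/aks/concepts-scheduler-configuration
[scheduling-framework/#interfaces]: https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/#interfaces
"""

print(brittle_reference_labels(sample))
```

Only the label is inspected; the URL target is left untouched, which matches the suggestion to rename the label while keeping the same link.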
…index.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Updated the description of Configurable Scheduler Profiles to emphasize increased node utilization and clarified the functionality of the scheduling framework. Added information about the accessibility of scheduler configuration starting from Kubernetes version 1.33.
Enhanced the explanation of Kubernetes scheduler operations and the benefits of Configurable Scheduler Profiles on AKS. Clarified the impact of scheduling strategies on resource utilization and operational complexity.
    apiVersion: aks.azure.com/v1alpha1
    kind: SchedulerConfiguration
    metadata:
      name: upstream
This example uses metadata.name: upstream, and the CPU example later uses the same name. If readers apply both, they will conflict/overwrite. Consider giving each example a unique metadata.name to avoid copy/paste issues.
Suggested change:
-   name: upstream
+   name: gpu-node-binpacking-scheduler
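The copy/paste conflict raised here can be spotted mechanically before publishing. Below is a rough sketch, assuming each example manifest is available as a string; it uses naive regular expressions rather than a YAML parser, and the helper name `duplicate_resource_names` is hypothetical. Whether a given name is acceptable on a real cluster is not something this check can decide.

```python
import re
from collections import Counter

# Naive patterns: a top-level "kind:" and the first "name:" directly under "metadata:".
KIND_RE = re.compile(r"^kind:[ \t]*(\S+)", re.MULTILINE)
NAME_RE = re.compile(r"^metadata:[ \t]*\n[ \t]+name:[ \t]*(\S+)", re.MULTILINE)


def duplicate_resource_names(manifests):
    """Return sorted (kind, name) pairs that appear in more than one manifest."""
    seen = Counter()
    for text in manifests:
        kind = KIND_RE.search(text)
        name = NAME_RE.search(text)
        if kind and name:
            seen[(kind.group(1), name.group(1))] += 1
    return sorted(key for key, count in seen.items() if count > 1)


gpu_example = """\
apiVersion: aks.azure.com/v1alpha1
kind: SchedulerConfiguration
metadata:
  name: upstream
"""

cpu_example = """\
apiVersion: aks.azure.com/v1alpha1
kind: SchedulerConfiguration
metadata:
  name: upstream
"""

# Both examples share a kind and name, so the pair is reported.
print(duplicate_resource_names([gpu_example, cpu_example]))

# After renaming the GPU example as suggested, nothing is reported.
renamed_gpu = gpu_example.replace("name: upstream", "name: gpu-node-binpacking-scheduler")
print(duplicate_resource_names([renamed_gpu, cpu_example]))
```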
Added section on increasing node utilization and operator control with configurable scheduler profiles.
…config-scheduler-profiles.png
    - name: NodeResourcesBalancedAllocation
    pluginConfig:
    - name: NodeResourcesFit
      args:
The GPU example’s NodeResourcesFit args omits the typed wrapper (apiVersion/kind: NodeResourcesFitArgs) that you include in the CPU example. For documentation copy/paste reliability, keep both examples consistent with the kube-scheduler config schema (add apiVersion/kind in the first example as well) to avoid readers hitting validation/parsing errors depending on tooling.
Suggested change:
-   args:
+   args:
+     apiVersion: kubescheduler.config.k8s.io/v1
+     kind: NodeResourcesFitArgs
    score: 0
    ```
    ### FAQ
    1. How does this interact with autoscalers such as Node Auto Provisioning (NAP), Cluster Autoscaler (CAS), and Vertical Pod Autoscaler (VPA)?
FAQ item #1 is a question without an answer. Either add an explicit answer (even if it’s a short guidance + link) or remove the question to avoid confusing readers.
Suggested change:
- 1. How does this interact with autoscalers such as Node Auto Provisioning (NAP), Cluster Autoscaler (CAS), and Vertical Pod Autoscaler (VPA)?
+ 1. How does this interact with autoscalers such as Node Auto Provisioning (NAP), Cluster Autoscaler (CAS), and Vertical Pod Autoscaler (VPA)? Configurable Scheduler Profiles control how pods are placed on existing nodes, while autoscalers still decide when to add, remove, or resize nodes based on resource demand. You should validate new profiles with your current NAP, CAS, and VPA settings in a test cluster to ensure they work together as expected before rolling them out to production.