[linux-nvidia-6.17]: Replace CPPC Autonomous Series with Version That Has Been Accepted Upstream#346

Draft
jamieNguyenNVIDIA wants to merge 18 commits into NVIDIA:24.04_linux-nvidia-6.17-next from jamieNguyenNVIDIA:jamien/replace-cppc

@jamieNguyenNVIDIA
Collaborator

@jamieNguyenNVIDIA jamieNguyenNVIDIA commented Mar 17, 2026

Replace 8 NVIDIA SAUCE patches for ACPI CPPC / cppc_cpufreq with their upstream equivalents from linux-next. This brings the CPPC autonomous selection and performance control support in line with the accepted upstream implementation, reducing our out-of-tree maintenance burden.

The branch reverts the existing SAUCE patches in dependency order, then applies 10 upstream cherry-picks/backports (8 from the accepted Part 1 series, 1 prerequisite EPP constant rename, and 1 follow-up auto_sel_mode boot parameter).

LP: https://bugs.launchpad.net/bugs/2131705

@jamieNguyenNVIDIA
Collaborator Author

test_cppc_cpufreq.sh

Attaching test script used to verify this PR.

Results from Spark:

cppc_cpufreq test suite
========================================
Tue Mar 17 11:10:44 AM PDT 2026
6.17.0-1012-nvidia-64k


Test 1: Basic driver load and sysfs layout
  PASS: scaling_driver is cppc_cpufreq (got 'cppc_cpufreq')
  PASS: governor is set (got 'performance', not '')
  PASS: auto_select sysfs exists
  PASS: auto_act_window sysfs exists
  PASS: energy_performance_preference_val sysfs exists
  PASS: perf_limited sysfs exists
  PASS: min_perf sysfs removed
  PASS: max_perf sysfs removed

Test 2: auto_sel_mode boot parameter
  auto_sel_mode=Y
  PASS: boot param is read-only (0444) (got '444')
./test_cppc_cpufreq.sh: line 80: /sys/module/cppc_cpufreq/parameters/auto_sel_mode: Permission denied
  PASS: boot param rejects writes
  PASS: auto_select enabled when boot param=Y (got '1')
  PASS: EPP set to performance (0) by boot param (got '0')
  PASS: auto_select=1 on all CPUs

Test 3: Runtime auto_select toggle via sysfs
  PASS: enable auto_select (got '1')
  PASS: disable auto_select (got '0')
  PASS: governor changes accepted after auto_select disable (perf=2808000, other=338000)

Test 4: MIN_PERF/MAX_PERF via scaling_min/max_freq
  cpuinfo range: 338000-2808000 kHz
  PASS: scaling_max_freq accepts midpoint (got '1573000')
  PASS: scaling_min_freq accepts min (got '338000')
  PASS: frequency clamped by scaling_max_freq (1573000 <= ~1573000)

Test 5: Energy Performance Preference
  PASS: write EPP=0 (performance)
  PASS: read back EPP=0 (got '0')
  PASS: write EPP=255 (energy-efficiency)
  PASS: read back EPP=255 (got '255')
  PASS: write EPP=128 (balanced)
  PASS: read back EPP=128 (got '128')
  PASS: reject EPP=256 (out of range)

Test 6: perf_limited register
  PASS: perf_limited readable (value=0)
  PASS: clear bit 0 (desired excursion)
  PASS: clear bit 1 (minimum excursion)
  PASS: clear both bits
  PASS: zero is valid no-op
  PASS: reject 0x4 (invalid bit)
  PASS: reject 0xff (invalid bits)

Test 7: CPU hotplug
  PASS: cpu1 offlined
  PASS: cpu1 re-onlined
  PASS: auto_select readable after hotplug (value=1)
  PASS: driver restored after hotplug (got 'cppc_cpufreq')

Test 8: auto_act_window sysfs
  PASS: auto_act_window readable (value=1)
  PASS: write auto_act_window=0

Test 9: Diagnostics in dmesg
  PASS: no DESIRED_PERF warning (firmware is compliant)
  PASS: no kernel errors related to CPPC (got '0')

Test 10: Stress / regression
  Rapid governor switching (performance <-> schedutil, 100 iterations)...
  PASS: 100 governor switches completed
  Rapid auto_select toggling (100 iterations)...
  PASS: 100 auto_select toggles completed
  Rapid scaling_max_freq changes (50 iterations)...
  PASS: 50 scaling_max_freq changes completed
  PASS: no kernel warnings after stress (got '0')

========================================
Results: 45 tests: 45 passed, 0 failed, 0 skipped
ALL TESTS PASSED

Results from GH:

cppc_cpufreq test suite
========================================
Tue Mar 17 18:22:52 UTC 2026
6.17.0-1012-nvidia-64k


Test 1: Basic driver load and sysfs layout
  PASS: scaling_driver is cppc_cpufreq (got 'cppc_cpufreq')
  PASS: governor is set (got 'performance', not '')
  PASS: auto_select sysfs exists
  PASS: auto_act_window sysfs exists
  PASS: energy_performance_preference_val sysfs exists
  PASS: perf_limited sysfs exists
  PASS: min_perf sysfs removed
  PASS: max_perf sysfs removed

Test 2: auto_sel_mode boot parameter
  auto_sel_mode=N
  PASS: boot param is read-only (0444) (got '444')
./test_cppc_cpufreq.sh: line 80: /sys/module/cppc_cpufreq/parameters/auto_sel_mode: Permission denied
  PASS: boot param rejects writes
  SKIP: auto_sel_mode not enabled (boot without param to test this)

Test 3: Runtime auto_select toggle via sysfs
  SKIP: auto_select not supported

Test 4: MIN_PERF/MAX_PERF via scaling_min/max_freq
  cpuinfo range: 81000-3384000 kHz
  PASS: scaling_max_freq accepts midpoint (got '1732500')
  PASS: scaling_min_freq accepts min (got '81000')
  PASS: frequency clamped by scaling_max_freq (1732500 <= ~1732500)

Test 5: Energy Performance Preference
  SKIP: EPP not supported on this platform

Test 6: perf_limited register
  PASS: perf_limited readable (value=0)
  PASS: clear bit 0 (desired excursion)
  PASS: clear bit 1 (minimum excursion)
  PASS: clear both bits
  PASS: zero is valid no-op
  PASS: reject 0x4 (invalid bit)
  PASS: reject 0xff (invalid bits)

Test 7: CPU hotplug
  PASS: cpu1 offlined
  PASS: cpu1 re-onlined
  PASS: auto_select readable after hotplug (value=<unsupported>)
  PASS: driver restored after hotplug (got 'cppc_cpufreq')

Test 8: auto_act_window sysfs
  SKIP: auto_act_window not supported on this platform

Test 9: Diagnostics in dmesg
  PASS: no DESIRED_PERF warning (firmware is compliant)
  PASS: no kernel errors related to CPPC (got '0')

Test 10: Stress / regression
  Rapid governor switching (performance <-> schedutil, 100 iterations)...
  PASS: 100 governor switches completed
  SKIP: auto_select not supported for toggle stress
  Rapid scaling_max_freq changes (50 iterations)...
  PASS: 50 scaling_max_freq changes completed
  PASS: no kernel warnings after stress (got '0')

========================================
Results: 34 tests: 29 passed, 0 failed, 5 skipped
ALL TESTS PASSED
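
The summary lines in both runs can be cross-checked against the individual result lines. The following is a minimal sketch, assuming the log format shown above; the `tally` helper is hypothetical and is not part of test_cppc_cpufreq.sh.

```python
import re
from collections import Counter


def tally(log_text: str) -> Counter:
    """Count PASS/FAIL/SKIP result lines in a test log (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        # Result lines are indented and start with the status followed by a colon.
        match = re.match(r"\s*(PASS|FAIL|SKIP):", line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Excerpt copied from the second run above (Test 2).
sample = """\
Test 2: auto_sel_mode boot parameter
  auto_sel_mode=N
  PASS: boot param is read-only (0444) (got '444')
  PASS: boot param rejects writes
  SKIP: auto_sel_mode not enabled (boot without param to test this)
"""

counts = tally(sample)
print(f"passed={counts['PASS']} failed={counts['FAIL']} skipped={counts['SKIP']}")
```

Running the same function over a full log should reproduce the totals reported on the "Results:" line if every result line follows this format.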

@arighi
Collaborator

arighi commented Mar 17, 2026

The commits look good, but technically they're still NVIDIA: SAUCE since they come from linux-next (for now).

So, if we follow the "Canonical stable kernel team" style, we should revert the old ones and re-apply the new ones, still as NVIDIA: SAUCE. If we follow the "Canonical devel kernel team" workflow, we should just drop the old patches (no revert, just a rebase + remove) and apply only the new ones. But the latter isn't really compatible with a PR...

Personally I think I like the rebase+drop approach more because, moving forward with kernel versions, old patches may have conflicts, so we may end up spending time fixing the conflicts to essentially revert the patch later and re-apply a new one.

@clsotog
Collaborator

clsotog commented Mar 18, 2026

The last commit needs the SAUCE tag. It's not upstream yet?

@jamieNguyenNVIDIA
Collaborator Author

> The last commit needs the SAUCE tag. It's not upstream yet?

Correct. Sumit only sent it yesterday, and it has not received feedback. I sent this PR early, and plan to adjust these commits as things develop upstream.

@nvmochs
Collaborator

nvmochs commented Mar 23, 2026

I think we'd also want to include this other patch that Sumit just posted (https://lore.kernel.org/all/20260318095005.2437960-1-sumitg@nvidia.com/), which is a follow-up to "ACPI: CPPC: Add cppc_get_perf() API to read performance controls".

jamieNguyenNVIDIA and others added 18 commits March 23, 2026 12:02
…ter support"

This reverts commit b0527bd.

It is to be replaced by the following upstream series:

    https://lore.kernel.org/lkml/48b52f98-119e-4693-806b-78d47f7a43bb@nvidia.com/

Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
…ling auto_select"

This reverts commit f106662.

It is to be replaced by the following upstream series:

    https://lore.kernel.org/lkml/48b52f98-119e-4693-806b-78d47f7a43bb@nvidia.com/

Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
…perf_limited"

This reverts commit c5a62d1.

It is to be replaced by the following upstream series:

    https://lore.kernel.org/lkml/48b52f98-119e-4693-806b-78d47f7a43bb@nvidia.com/

Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
…erf_limited register"

This reverts commit 5a35a52.

It is to be replaced by the following upstream series:

    https://lore.kernel.org/lkml/48b52f98-119e-4693-806b-78d47f7a43bb@nvidia.com/

Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
…in/max_perf"

This reverts commit e0f2e26.

It is to be replaced by the following upstream series:

    https://lore.kernel.org/lkml/48b52f98-119e-4693-806b-78d47f7a43bb@nvidia.com/

Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
…d epp"

This reverts commit a3b460e.

It is to be replaced by the following upstream series:

    https://lore.kernel.org/lkml/48b52f98-119e-4693-806b-78d47f7a43bb@nvidia.com/

Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
…rformance controls"

This reverts commit 10ff86b.

It is to be replaced by the following upstream series:

    https://lore.kernel.org/lkml/48b52f98-119e-4693-806b-78d47f7a43bb@nvidia.com/

Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
…how/store"

This reverts commit 6a55754.

It is to be replaced by the following upstream series:

    https://lore.kernel.org/lkml/48b52f98-119e-4693-806b-78d47f7a43bb@nvidia.com/

Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
 - Remove redundant energy_perf field from 'struct cppc_perf_caps' as
   the same is available in 'struct cppc_perf_ctrls' which is used.

 - Move the 'auto_sel' field from 'struct cppc_perf_caps' to
   'struct cppc_perf_ctrls' as it represents a control register.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Link: https://patch.msgid.link/20260120145623.2959636-3-sumitg@nvidia.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 7cb6f10)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Add cppc_get_perf() function to read values of performance control
registers including desired_perf, min_perf, max_perf, energy_perf,
and auto_sel.

This provides a read interface to complement the existing
cppc_set_perf() write interface for performance control registers.

Note that auto_sel is read by cppc_get_perf() but not written by
cppc_set_perf() to avoid unintended mode changes during performance
updates. It can be updated with existing dedicated cppc_set_auto_sel()
API.

Use cppc_get_perf() in cppc_cpufreq_get_cpu_data() to initialize
perf_ctrls with current hardware register values during cpufreq
policy initialization.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
(cherry picked from commit 658fa7b1c47a857af484c5c5dff8d0164b7c7bfb)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Add a warning during CPPC processor probe if the Desired Performance
register is not supported when it should be.

As per section 8.4.6.1.2.3 of the ACPI 6.6 specification,
"The Desired Performance Register is optional only when OSPM indicates
support for CPPC2 in the platform-wide _OSC capabilities and the
Autonomous Selection Enable field is encoded as an Integer with a
value of 1."

In other words:
- In CPPC v1, DESIRED_PERF is mandatory
- In CPPC v2, it becomes optional only when AUTO_SEL_ENABLE is supported

This helps detect firmware configuration issues early during boot.

Link: https://lore.kernel.org/lkml/9fa21599-004a-4af8-acc2-190fd0404e35@nvidia.com/
Suggested-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
(cherry picked from commit b3e45fb2db9d8a733e94b315f1272e2c4468ed4b)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
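
The rule summarized in the commit message above can be written as a small predicate. This is a sketch of the rule as paraphrased there, not the kernel implementation; the function and parameter names are hypothetical.

```python
def desired_perf_is_optional(cppc_v2: bool, auto_sel_supported: bool) -> bool:
    """DESIRED_PERF may be omitted only under CPPC v2 with AUTO_SEL_ENABLE supported."""
    return cppc_v2 and auto_sel_supported


def should_warn(desired_perf_supported: bool, cppc_v2: bool,
                auto_sel_supported: bool) -> bool:
    """Return True when firmware omits DESIRED_PERF although it is mandatory."""
    if desired_perf_supported:
        return False
    return not desired_perf_is_optional(cppc_v2, auto_sel_supported)


# CPPC v1 firmware without DESIRED_PERF: mandatory register missing, so warn.
print(should_warn(desired_perf_supported=False, cppc_v2=False, auto_sel_supported=False))
# CPPC v2 firmware with AUTO_SEL_ENABLE and no DESIRED_PERF: allowed, no warning.
print(should_warn(desired_perf_supported=False, cppc_v2=True, auto_sel_supported=True))
```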
Extend cppc_set_epp_perf() to write both auto_sel and energy_perf
registers when they are in FFH or SystemMemory address space.

This keeps the behavior consistent with PCC case where both registers
are already updated together, but was missing for FFH/SystemMemory.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
(cherry picked from commit 38428a680026c52a1fc64212325d161974c3e4cf)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Update the cached perf_ctrls values in cppc_cpudata when auto_sel
or energy_perf are written through sysfs. This ensures the cached
values stay in sync with what was written to the hardware registers.

Without this fix, the cached values become stale after sysfs writes,
which can lead to incorrect values being used when other operations
read from the cache (e.g., when updating MIN_PERF/MAX_PERF).

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
(backported from commit 24ad4c6c136bdaa4c92c5c5948856752ce3e9f76)
[jamien: adapted for tree without CPPC_CPUFREQ_ATTR_RW_U64 macro]
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Update MIN_PERF and MAX_PERF registers from policy->min and policy->max
in the .target() and .fast_switch() callbacks. This allows controlling
performance bounds via standard scaling_min_freq and scaling_max_freq
sysfs interfaces.

Similar to intel_cpufreq which updates HWP min/max limits in .target(),
cppc_cpufreq now programs MIN_PERF/MAX_PERF along with DESIRED_PERF.
Since MIN_PERF/MAX_PERF can be updated even when auto_sel is disabled,
they are updated unconditionally.

Also program MIN_PERF/MAX_PERF in store_auto_select() when enabling
autonomous selection so the platform uses correct bounds immediately.

Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
(cherry picked from commit ea3db45ae476889a1ba0ab3617e6afdeeefbda3d)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
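
For reference, the midpoint values reported by Test 4 in the results above are consistent with simple integer arithmetic on the cpuinfo range. The sketch below is an assumption about how the test derives its midpoint (the script itself is not shown here), and the `clamp` helper only illustrates the general idea of bounding a request by policy->min and policy->max; it is not the driver's frequency-to-performance conversion.

```python
def midpoint_khz(min_khz: int, max_khz: int) -> int:
    """Midpoint of a frequency range in kHz (assumed test-script behavior)."""
    return (min_khz + max_khz) // 2


def clamp(value: int, lower: int, upper: int) -> int:
    """Bound a requested value to the [lower, upper] range."""
    return max(lower, min(value, upper))


# cpuinfo ranges copied from the two test runs above.
print(midpoint_khz(338000, 2808000))  # 1573000, as reported on Spark
print(midpoint_khz(81000, 3384000))   # 1732500, as reported on GH

# A request for the maximum frequency is bounded by the new scaling_max_freq.
print(clamp(2808000, 338000, midpoint_khz(338000, 2808000)))  # 1573000
```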
Add sysfs interface to read/write the Performance Limited register.

The Performance Limited register indicates to the OS that an
unpredictable event (like thermal throttling) has limited processor
performance. It contains two sticky bits set by the platform:
  - Bit 0 (Desired_Excursion): Set when delivered performance is
    constrained below desired performance. Not used when Autonomous
    Selection is enabled.
  - Bit 1 (Minimum_Excursion): Set when delivered performance is
    constrained below minimum performance.

These bits remain set until OSPM explicitly clears them. The write
operation accepts a bitmask of bits to clear:
  - Write 0x1 to clear bit 0
  - Write 0x2 to clear bit 1
  - Write 0x3 to clear both bits

This enables users to detect if platform throttling impacted a workload.
Users clear the register before execution, run the workload, then check
afterward - if set, hardware throttling occurred during that time window.

The interface is exposed as:
  /sys/devices/system/cpu/cpuX/cpufreq/perf_limited

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
(backported from commit 13c45a26635fa51a68911aa57e6778bdad18b103)
[jamien: adapted for tree without CPPC_CPUFREQ_ATTR_RW_U64 macro - used explicit
show/store functions instead]
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
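
The write semantics described in the commit message above (a bitmask of sticky bits to clear, with only bits 0 and 1 defined) can be modeled as follows. This is an illustrative model only, not the driver code; the constant and function names are hypothetical, and rejecting undefined bits mirrors the "reject 0x4" and "reject 0xff" checks in the test results.

```python
DESIRED_EXCURSION = 0x1  # bit 0
MINIMUM_EXCURSION = 0x2  # bit 1
VALID_MASK = DESIRED_EXCURSION | MINIMUM_EXCURSION


def clear_perf_limited(current: int, clear_mask: int) -> int:
    """Return the register value after clearing the bits named in clear_mask."""
    if clear_mask & ~VALID_MASK:
        # Only bits 0 and 1 are defined; anything else is rejected.
        raise ValueError(f"invalid bits in mask: {clear_mask:#x}")
    return current & ~clear_mask


# Both sticky bits set by the platform, then cleared one at a time.
value = DESIRED_EXCURSION | MINIMUM_EXCURSION
value = clear_perf_limited(value, 0x1)  # only bit 1 remains set
value = clear_perf_limited(value, 0x2)  # register is now clear
print(value)
```

In the workflow the commit message describes, a user would clear the register, run the workload, and read the register again; a non-zero value afterward indicates throttling during that window.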
Add ABI documentation for the Performance Limited Register sysfs
interface in the cppc_cpufreq driver.

Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
(cherry picked from commit 856250ba2e810e772dc95b3234ebf0d6393a51d9)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC
autonomous performance selection at system startup. When autonomous mode
is enabled, the hardware automatically adjusts CPU performance based on
workload demands using Energy Performance Preference (EPP) hints.

This parameter allows autonomous mode to be configured on all CPUs
without requiring runtime sysfs manipulation, provided the 'auto_sel'
register is present.

When auto_sel_mode=1:
- All CPUs are configured for autonomous operation during module init
- EPP is set to performance preference (0x0) by default
- Min/max performance bounds use defaults
- CPU frequency scaling is handled by hardware instead of OS governor

For Documentation/:
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
(backported from https://lore.kernel.org/lkml/20260317151053.2361475-1-sumitg@nvidia.com/)
[jamien: adapted for the upstream struct layout where auto_sel moved from
cppc_perf_caps to cppc_perf_ctrls, and inlined the init-path logic previously
handled by removed SAUCE helpers]
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
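
After booting with the parameter described above, the resulting state can be inspected from user space. The sketch below assumes the attribute locations implied elsewhere in this PR (the module parameter path from the test output, and per-CPU attributes alongside perf_limited under cpuX/cpufreq/); the helper names are hypothetical, and an unreadable or unsupported attribute is reported as None.

```python
from pathlib import Path
from typing import Optional


def read_attr(path: Path) -> Optional[str]:
    """Read a sysfs attribute, returning None if it is missing or unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def auto_sel_state(cpu: int, sysfs_root: str = "/sys") -> dict:
    """Collect the autonomous-selection state for one CPU (assumed paths)."""
    root = Path(sysfs_root)
    cpufreq = root / "devices/system/cpu" / f"cpu{cpu}" / "cpufreq"
    return {
        "auto_sel_mode": read_attr(root / "module/cppc_cpufreq/parameters/auto_sel_mode"),
        "auto_select": read_attr(cpufreq / "auto_select"),
        "epp": read_attr(cpufreq / "energy_performance_preference_val"),
    }


if __name__ == "__main__":
    print(auto_sel_state(0))
```

On a system like the first test run above, this would be expected to show auto_sel_mode as Y, auto_select as 1, and an EPP of 0; on a platform without autonomous selection support the per-CPU entries may come back as None.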
…ently

Callers of cpc_read() ignore its return value, which can lead
to using uninitialized or stale values when the read fails.

Fix this by consistently checking cpc_read() return values in
cppc_get_perf_caps(), cppc_get_perf_ctrs(), and cppc_get_perf().

Link: https://lore.kernel.org/lkml/48bdf87e-39f1-402f-a7dc-1a0e1e7a819d@nvidia.com/
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
(backported from https://lore.kernel.org/all/20260318095005.2437960-1-sumitg@nvidia.com/)
[jamien: adapted for tree without reference_perf handling in
cppc_get_perf_caps(), and with additional ref_perf read in
cppc_get_perf_ctrs()]
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
@jamieNguyenNVIDIA
Collaborator Author

> The commits look good, but technically they're still NVIDIA: SAUCE since they come from linux-next (for now).
>
> So, if we follow the "Canonical stable kernel team" style, we should revert the old ones and re-apply the new ones, still as NVIDIA: SAUCE. If we follow the "Canonical devel kernel team" workflow, we should just drop the old patches (no revert, just a rebase + remove) and apply only the new ones. But the latter isn't really compatible with a PR...
>
> Personally I think I like the rebase+drop approach more because, moving forward with kernel versions, old patches may have conflicts, so we may end up spending time fixing the conflicts to essentially revert the patch later and re-apply a new one.

Thanks. It looks like the linux-next patches have made it into mainline so I've updated these references accordingly.

I'll ask Canonical about doing the rebase+drop approach for this series.

@jamieNguyenNVIDIA
Collaborator Author

> I think we'd also want to include this other patch that Sumit also just posted (https://lore.kernel.org/all/20260318095005.2437960-1-sumitg@nvidia.com/), which was a follow-up to ACPI: CPPC: Add cppc_get_perf() API to read performance controls

Thanks. I've backported this patch in the latest version.

@clsotog clsotog self-requested a review March 24, 2026 21:03
Collaborator

@clsotog clsotog left a comment

The last changes look OK.
Acked-by: Carol L Soto <csoto@nvidia.com>

@jamieNguyenNVIDIA jamieNguyenNVIDIA marked this pull request as draft March 24, 2026 21:16
@jamieNguyenNVIDIA
Collaborator Author

Converted to draft as the final commits are still being reviewed by the community:

  1. 4092db4
  2. 4e03dbd

4 participants