[linux-nvidia-6.17] Backport MPAM fixes and support for CPU-less NUMA nodes by fyu1 · Pull Request #348 · NVIDIA/NV-Kernels

fyu1 · 2026-03-20T18:07:13Z

This PR replaces #328

This branch fixes several MPAM issues, including:

Performance issue due to a small MBW_MIN on Grace: https://nvbugspro.nvidia.com/bug/5928376
Performance issue due to a CMAX of 0 on Vera: https://nvbugspro.nvidia.com/bug/5717435
Stress online/offline issue on Vera: https://nvbugspro.nvidia.com/bug/5919525
Cleanup of the NUMA node MBA/MBM code to avoid future issues.

There are 49 patches in total:

Patches 1-10 revert ARM's extra patches: the NUMA node, event filter, and memory hotplug patches. These patches are buggy and cause most of the issues above.
Patches 11 and 12 revert the old, buggy T241-MPAM-4 Grace erratum workaround and apply an updated one.
Patches 13-42 are from upstream resctrl, mainly to align the monitoring types for the later NUMA patches.
Patches 43-49 mainly add support for CPU-less NUMA nodes and fix IOMMU, MSC teardown, and MBWU type issues.

This is the patch list:
0001-Revert-NVIDIA-SAUCE-untested-arm_mpam-resctrl-Allow-.patch
0002-Revert-NVIDIA-SAUCE-arm_mpam-resctrl-Add-NUMA-node-n.patch
0003-Revert-NVIDIA-SAUCE-untested-arm_mpam-resctrl-Split-.patch
0004-Revert-NVIDIA-SAUCE-arm_mpam-resctrl-Change-domain_h.patch
0005-Revert-NVIDIA-SAUCE-arm_mpam-resctrl-Pick-whether-MB.patch
0006-Revert-NVIDIA-SAUCE-Fix-unused-variable-warning.patch
0007-Revert-NVIDIA-SAUCE-fs-resctrl-Add-mount-option-for-.patch
0008-Revert-NVIDIA-SAUCE-fs-resctrl-Take-memory-hotplug-l.patch
0009-Revert-NVIDIA-SAUCE-mm-memory_hotplug-Add-lockdep-as.patch
0010-Revert-NVIDIA-SAUCE-untested-arm_mpam-resctrl-Allow-.patch
0011-Revert-NVIDIA-SAUCE-arm_mpam-Add-workaround-for-T241.patch
0012-NVIDIA-SAUCE-arm_mpam-Add-workaround-for-T241-MPAM-4.patch
0013-x86-fs-resctrl-Improve-domain-type-checking.patch
0014-x86-resctrl-Move-L3-initialization-into-new-helper-f.patch
0015-x86-resctrl-Refactor-domain_remove_cpu_mon-ready-for.patch
0016-x86-resctrl-Clean-up-domain_remove_cpu_ctrl.patch
0017-x86-fs-resctrl-Refactor-domain-create-remove-using-s.patch
0018-fs-resctrl-Split-L3-dependent-parts-out-of-mon_eve.patch
0019-x86-fs-resctrl-Use-struct-rdt_domain_hdr-when-readin.patch
0020-x86-fs-resctrl-Rename-struct-rdt_mon_domain-and-rdt.patch
0021-x86-fs-resctrl-Rename-some-L3-specific-functions.patch
0022-fs-resctrl-Make-event-details-accessible-to-function.patch
0023-x86-fs-resctrl-Handle-events-that-can-be-read-from-a.patch
0024-x86-fs-resctrl-Support-binary-fixed-point-event-coun.patch
0025-x86-fs-resctrl-Add-an-architectural-hook-called-for-.patch
0026-x86-fs-resctrl-Add-and-initialize-a-resource-for-pac.patch
0027-fs-resctrl-Emphasize-that-L3-monitoring-resource-is-.patch
0028-x86-resctrl-Discover-hardware-telemetry-events.patch
0029-x86-fs-resctrl-Fill-in-details-of-events-for-perform.patch
0030-x86-fs-resctrl-Add-architectural-event-pointer.patch
0031-x86-resctrl-Find-and-enable-usable-telemetry-events.patch
0032-x86-resctrl-Read-telemetry-events.patch
0033-fs-resctrl-Refactor-mkdir_mondata_subdir.patch
0034-fs-resctrl-Refactor-rmdir_mondata_subdir_allrdtgrp.patch
0035-x86-fs-resctrl-Handle-domain-creation-deletion-for-R.patch
0036-x86-resctrl-Add-energy-perf-choices-to-rdt-boot-opti.patch
0037-x86-resctrl-Handle-number-of-RMIDs-supported-by-RDT.patch
0038-fs-resctrl-Move-allocation-free-of-closid_num_dirty_.patch
0039-x86-fs-resctrl-Compute-number-of-RMIDs-as-minimum-ac.patch
0040-fs-resctrl-Move-RMID-initialization-to-first-mount.patch
0041-x86-resctrl-Enable-RDT_RESOURCE_PERF_PKG.patch
0042-x86-fs-resctrl-Update-documentation-for-telemetry-ev.patch
0043-NVIDIA-VR-SAUCE-arm_mpam-Fix-compilation-errors.patch
0044-NVIDIA-SAUCE-arm_mpam-Avoid-MSC-teardown-for-the-SW-.patch
0045-NVIDIA-VR-SAUCE-arm_mpam-Handle-CPU-less-numa-nodes.patch
0046-NVIDIA-VR-SAUCE-arm_mpam-Include-all-associated-MSC-.patch
0047-NVIDIA-SAUCE-resctrl-mpam-reset-RIS-by-applying-expl.patch
0048-NVIDIA-SAUCE-iommu-arm-smmu-v3-Fix-MPAM-for-indentit.patch
0049-NVIDIA-VR-SAUCE-arm_mpam-Resolve-MBWU-type-before-fe.patch

Test results are at http://10.112.214.86/vera/tests/, including:

init registers test
iommu assignment test
online/offline test
Spec2017 performance test
CXL test

The GPU MPAM test is not covered because, as of now, there is no SBIOS support for the feature yet.

LP: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.17/+bug/2146389

nirmoy · 2026-03-23T12:38:16Z

These patches don't match their upstream commits:
8f62caa: x86/resctrl: Add energy/perf choices to rdt boot option
d0f8995: fs/resctrl: Move RMID initialization to first mount
For example:

 git range-diff 8f62caa8be62~1..8f62caa8be62 842e7f97d71a~1..842e7f97d71a
      ## Documentation/admin-guide/kernel-parameters.txt ##
    -@@
    +@@ Documentation/admin-guide/kernel-parameters.txt: Kernel parameters
    +   rdt=            [HW,X86,RDT]
                        Turn on/off individual RDT features. List is:
                        cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
    -                   mba, smba, bmec, abmc.
    +-                  mba, smba, bmec, abmc, sdciae.
     +                  mba, smba, bmec, abmc, sdciae, energy[:guid],
     +                  perf[:guid].
                        E.g. to turn on cmt and turn off mba use:

Please make sure to add a comment if an upstream cherry-pick needed conflict fixes.
Otherwise the series looks fine. Tested with an older SBIOS:

sudo /tmp/mpam-ok
[12:37:29] ============================================
[12:37:29]   MPAM Feature Validation (mpam-ok)
[12:37:29] ============================================
[12:37:29] Kernel:  6.17.9+
[12:37:29] Arch:    aarch64
[12:37:29] Date:    Mon Mar 23 12:37:29 PM UTC 2026
[12:37:29]
[12:37:29] Building memory workload ...
[12:37:29] Workload: /tmp/mpam_wl_e285ki (512 MB, 10s)
[12:37:29] INFO: STREAM binary not provided (-s); STREAM-based BW checks will be skipped
[12:37:29] INFO: MBA throttle test will rely on MBM counters only
[12:37:29] INFO: Install a STREAM benchmark binary and pass -s /path/to/stream for BW test
[12:37:29]
[12:37:29] --- Test: MPAM kernel support ---
[12:37:29] PASS  MPAM enabled: MPAM enabled with 47 PARTIDs and 2 PMGs
[12:37:29] --- Test: resctrl filesystem ---
[12:37:29] PASS  resctrl mounted successfully
[12:37:29] --- Test: NUMA topology ---
[12:37:29] PASS  NUMA: 2 node(s), 0 CPU-less, 0 CXL
[12:37:29] --- Test: resctrl resource info ---
[12:37:29] PASS  Resource: L3
[12:37:29] PASS  Resource: L3_MAX
[12:37:30] PASS  Resource: L3_MON
[12:37:30] PASS  Resource: MB_MON
[12:37:30] --- Test: schemata entries ---
[12:37:30] PASS  L3 allocation: 2 domain(s)
[12:37:30] FAIL  MBA allocation missing (monitoring exists with 2 domain(s) -- NUMA-based MSC support likely incomplete)
[12:37:30] --- Test: resctrl partition ---
[12:37:30] PASS  Created partition 'mpam_ok_18427', assigned PID 18427
[12:37:30] --- Test: monitor directories ---
[12:37:30] PASS  MB monitor directories: 2
[12:37:30] PASS  L3 monitor directories: 2
[12:37:30] --- Test: monitoring counters ---
[12:37:30] PASS  MBM readable: domain 0 (Unassigned bytes)
[12:37:30] PASS  MBM readable: domain 1 (Unassigned bytes)
[12:37:30] PASS  L3 occupancy readable (0 bytes)
[12:37:30] --- Test: MPAM schemata defaults (regression check, bugs 5717435/5928376) ---
[12:37:30] PASS  L3_MAX defaults safe: L3_MAX:1=100;2=100
[12:37:30] --- Test: MBA schemata write/readback ---
[12:37:30] SKIP  MBA schemata (no allocation domains)
[12:37:30] --- Test: MBM traffic detection ---
[12:37:30] --- Cleanup ---

fyu1 · 2026-03-24T02:18:56Z

Thank you very much for your comments! All comments have been addressed. Please review the new patches (in the same branch).

clsotog · 2026-03-24T02:23:48Z

I have not been able to finish the review, but I have a question about this commit:
d62e19f NVIDIA: SAUCE: arm_mpam: Add workaround for T241-MPAM-4
At 6.14 we took this patch from the morse tree, but now it says
backported from 02f5cf363057ceddd099313d1e43636fdcf3d47c dev/dev-main-nvidia-pset-linux-6.19.6
How will Canonical see this backport when doing the review?

fyu1 · 2026-03-24T02:56:30Z

> I have not been able to finish the review but I have a question with this commit d62e19f NVIDIA: SAUCE: arm_mpam: Add workaround for T241-MPAM-4 At 6.14 we took this patch from morse tree but now it says backported from 02f5cf363057ceddd099313d1e43636fdcf3d47c dev/dev-main-nvidia-pset-linux-6.19.6 but how canonical will see this backport when doing the review?

Matt told me that the line [backported from ... pset_branch] is only for internal info. External people cannot see pset.

nvmochs · 2026-03-24T03:20:15Z

> > I have not been able to finish the review but I have a question with this commit d62e19f NVIDIA: SAUCE: arm_mpam: Add workaround for T241-MPAM-4 At 6.14 we took this patch from morse tree but now it says backported from 02f5cf363057ceddd099313d1e43636fdcf3d47c dev/dev-main-nvidia-pset-linux-6.19.6 but how canonical will see this backport when doing the review?
>
> Matt told me that the line [backported from ... pset_branch] is only for internal info. External people cannot see pset.

We just need a way to identify the patch provenance. If the patch is already in a public location, we should pick from there.

@fyu1 I see this patch is on LKML (https://lore.kernel.org/all/20260313144617.3420416-38-ben.horgan@arm.com/) but differs a bit. Is the reason you didn't pick the LKML version due to the base set of MPAM patches we carry in linux-nvidia-6.17 being based on an older revision of the series?

nvmochs · 2026-03-24T03:26:51Z

@fyu1

611616a NVIDIA: VR: SAUCE: arm_mpam: Fix compilation errors

Nit: The change for resctrl_arch_rmid_read() is doing more than what is described in the commit message (changing number of parameters and parameter data types). Is that intended? (it looks like it’s trying to match the prototype but want to double check)

Similarly, mpam_resctrl_monitor_init() has more than just a name change.

fyu1 · 2026-03-24T06:35:48Z

> > > I have not been able to finish the review but I have a question with this commit d62e19f NVIDIA: SAUCE: arm_mpam: Add workaround for T241-MPAM-4 At 6.14 we took this patch from morse tree but now it says backported from 02f5cf363057ceddd099313d1e43636fdcf3d47c dev/dev-main-nvidia-pset-linux-6.19.6 but how canonical will see this backport when doing the review?
> >
> > Matt told me that the line [backported from ... pset_branch] is only for internal info. External people cannot see pset.
>
> We just need a way to identify the patch provenance. If the patch is already in a public location, we should pick from there.
>
> @fyu1 I see this patch is on LKML (https://lore.kernel.org/all/20260313144617.3420416-38-ben.horgan@arm.com/) but differs a bit. Is the reason you didn't pick the LKML version due to the base set of MPAM patches we carry in linux-nvidia-6.17 being based on an older revision of the series?

Hi, Matt,

They are the same patch with minor changes. This line needs to change to fit 6.17:

  .iidr       = IIDR_PROD(0x241) | IIDR_VAR(0) | IIDR_REV(0) | IIDR_IMP(0x36b),

Now I backported the T241-MPAM-4 workaround patch from Ben's branch: https://gitlab.arm.com/linux-arm/linux-bh/-/commit/de0a00982d0aefb3d94828e908179aca02feaa85

Please check if the backported patch is good.

BTW, this workaround is only for Grace. Vera doesn't have the MBW_MIN feature and doesn't need this workaround to function.

fyu1 · 2026-03-24T06:57:51Z

> @fyu1
>
> 611616a NVIDIA: VR: SAUCE: arm_mpam: Fix compilation errors
>
> Nit: The change for resctrl_arch_rmid_read() is doing more than what is described in the commit message (changing number of parameters and parameter data types). Is that intended? (it looks like it’s trying to match the prototype but want to double check)
>
> Similarly, mpam_resctrl_monitor_init() has more than just a name change.

Fixed. Added the detailed changes to the commit message.

clsotog · 2026-03-24T15:00:50Z

> > I have not been able to finish the review but I have a question with this commit d62e19f NVIDIA: SAUCE: arm_mpam: Add workaround for T241-MPAM-4 At 6.14 we took this patch from morse tree but now it says backported from 02f5cf363057ceddd099313d1e43636fdcf3d47c dev/dev-main-nvidia-pset-linux-6.19.6 but how canonical will see this backport when doing the review?
>
> Matt told me that the line [backported from ... pset_branch] is only for internal info. External people cannot see pset.

Some of the last patches also reference the pset 6.19 kernel, so what should we do with those?

jamieNguyenNVIDIA · 2026-03-24T16:04:13Z

The following commit has two bodies and two sign-offs:
NVIDIA: VR: SAUCE: arm_mpam: Fix compilation errors to adapt to resctrl L3 domain and arch API updates

I believe you intended to remove this part:

Fix the following compilation errors:

1. Commit https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/1e9e1305357e9cc033922b8c51217adb27f6d6cb ("x86,fs/resctrl: Rename struct rdt_mon_domain and
   rdt_hw_mon_domain") renames struct rdt_mon_domain to rdt_l3_mon_domain.
   Change the names in MPAM.
2. Implement empty resctrl arch API resctrl_arch_pre_mount(void) to make
   compilation succeed.

Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/a42549e64ce0aa7f72ec6fb47a8abd5ac6b428b8 ("NVIDIA: SAUCE: arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation")
Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/ae2a29c5ebb8d3ab1e83319465237f1713083dec ("NVIDIA: SAUCE: arm_mpam: resctrl: Add support for csu counters")
Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/1cbc0f2c3d5df7425f78060239f0c88925af95cb ("NVIDIA: SAUCE: arm_mpam: resctrl: Add resctrl_arch_config_cntr() for ABMC use")
Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/dd44394e2b41aff18a54379e4946ccbdc1b4b45e ("NVIDIA: SAUCE: arm_mpam: resctrl: Add resctrl_arch_rmid_read() and resctrl_arch_reset_rmid()")
Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/842967000721b1495ee4c24d0bcc8333228a8bc3 ("NVIDIA: SAUCE: arm_mpam: resctrl: Add resctrl_arch_cntr_read() & resctrl_arch_reset_cntr()")

Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

fyu1 · 2026-03-24T16:47:55Z

> The following commit has two bodies and two sign-offs: NVIDIA: VR: SAUCE: arm_mpam: Fix compilation errors to adapt to resctrl L3 domain and arch API updates
>
> I believe you intended to remove this part:
>
> Fix the following compilation errors:
>
> 1. Commit https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/1e9e1305357e9cc033922b8c51217adb27f6d6cb ("x86,fs/resctrl: Rename struct rdt_mon_domain and
>    rdt_hw_mon_domain") renames struct rdt_mon_domain to rdt_l3_mon_domain.
>    Change the names in MPAM.
> 2. Implement empty resctrl arch API resctrl_arch_pre_mount(void) to make
>    compilation succeed.
>
> Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/a42549e64ce0aa7f72ec6fb47a8abd5ac6b428b8 ("NVIDIA: SAUCE: arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation")
> Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/ae2a29c5ebb8d3ab1e83319465237f1713083dec ("NVIDIA: SAUCE: arm_mpam: resctrl: Add support for csu counters")
> Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/1cbc0f2c3d5df7425f78060239f0c88925af95cb ("NVIDIA: SAUCE: arm_mpam: resctrl: Add resctrl_arch_config_cntr() for ABMC use")
> Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/dd44394e2b41aff18a54379e4946ccbdc1b4b45e ("NVIDIA: SAUCE: arm_mpam: resctrl: Add resctrl_arch_rmid_read() and resctrl_arch_reset_rmid()")
> Fixes: https://github.com/fyu1/NV-Kernels.fenghuay.baseos/commit/842967000721b1495ee4c24d0bcc8333228a8bc3 ("NVIDIA: SAUCE: arm_mpam: resctrl: Add resctrl_arch_cntr_read() & resctrl_arch_reset_cntr()")
>
> Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

Hi, Jamie, Fixed in the updated branch: 68595a9

fyu1 · 2026-03-24T16:55:23Z

> > > I have not been able to finish the review but I have a question with this commit d62e19f NVIDIA: SAUCE: arm_mpam: Add workaround for T241-MPAM-4 At 6.14 we took this patch from morse tree but now it says backported from 02f5cf363057ceddd099313d1e43636fdcf3d47c dev/dev-main-nvidia-pset-linux-6.19.6 but how canonical will see this backport when doing the review?
> >
> > Matt told me that the line [backported from ... pset_branch] is only for internal info. External people cannot see pset.
>
> Some of the last patches also reference the pset 6.19 kernel so what we should do with those?

Hi, Matt, since Carol is concerned about the pset branch names, do you want to keep them in the change logs?

nvmochs · 2026-03-24T18:18:45Z

> > > I have not been able to finish the review but I have a question with this commit d62e19f NVIDIA: SAUCE: arm_mpam: Add workaround for T241-MPAM-4 At 6.14 we took this patch from morse tree but now it says backported from 02f5cf363057ceddd099313d1e43636fdcf3d47c dev/dev-main-nvidia-pset-linux-6.19.6 but how canonical will see this backport when doing the review?
> >
> > Matt told me that the line [backported from ... pset_branch] is only for internal info. External people cannot see pset.
>
> Some of the last patches also reference the pset 6.19 kernel so what we should do with those?

Let's keep the pset branch / SHA references to maintain provenance for our own tracking.

…ming domains

The feature to sum event data across multiple domains supports systems with Sub-NUMA Cluster (SNC) mode enabled. The top-level monitoring files in each "mon_L3_XX" directory provide the sum of data across all SNC nodes sharing an L3 cache instance while the "mon_sub_L3_YY" sub-directories provide the event data of the individual nodes.

SNC is only associated with the L3 resource and domains and as a result the flow handling the sum of event data implicitly assumes it is working with the L3 resource and domains.

Reading of telemetry events does not require to sum event data so this feature can remain dedicated to SNC and keep the implicit assumption of working with the L3 resource and domains.

Add a WARN to where the implicit assumption of working with the L3 resource is made and add comments on how the structure controlling the event sum feature is used.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(cherry picked from commit db64994)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

Each CPU collects data for telemetry events that it sends to the nearest telemetry event aggregator either when the value of MSR_IA32_PQR_ASSOC.RMID changes, or when a two millisecond timer expires.

There is a feature type ("energy" or "perf"), GUID, and MMIO region associated with each aggregator. This combination links to an XML description of the set of telemetry events tracked by the aggregator. XML files are published by Intel in a GitHub repository¹.

The telemetry event aggregators maintain per-RMID per-event counts of the total seen for all the CPUs. There may be multiple telemetry event aggregators per package.

There are separate sets of aggregators for each feature type. Aggregators in a set may have different GUIDs. All aggregators with the same feature type and GUID are symmetric keeping counts for the same set of events for the CPUs that provide data to them.

The XML file for each aggregator provides the following information:
0) Feature type of the events ("perf" or "energy")
1) Which telemetry events are tracked by the aggregator.
2) The order in which the event counters appear for each RMID.
3) The value type of each event counter (integer or fixed-point).
4) The number of RMIDs supported.
5) Which additional aggregator status registers are included.
6) The total size of the MMIO region for an aggregator.

Introduce struct event_group that condenses the relevant information from an XML file. Hereafter an "event group" refers to a group of events of a particular feature type (event_group::pfname set to "energy" or "perf") with a particular GUID.

Use event_group::pfname to determine the feature id needed to obtain the aggregator details. It will later be used in console messages and with the rdt= boot parameter.

The INTEL_PMT_TELEMETRY driver enumerates support for telemetry events. This driver provides intel_pmt_get_regions_by_feature() to list all available telemetry event aggregators of a given feature type. The list includes the "guid", the base address in MMIO space for the region where the event counters are exposed, and the package id where the all the CPUs that report to this aggregator are located.

Call INTEL_PMT_TELEMETRY's intel_pmt_get_regions_by_feature() for each event group to obtain a private copy of that event group's aggregator data. Duplicate the aggregator data between event groups that have the same feature type but different GUID. Further processing on this private copy will be unique to the event group.

¹https://github.com/intel/Intel-PMT

[ bp: Zap text explaining the code, s/guid/GUID/g ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(cherry picked from commit 1fb2daa)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

…GUIDs

The telemetry event aggregators of the Intel Clearwater Forest CPU support two RMID-based feature types: "energy" with GUID 0x26696143¹, and "perf" with GUID 0x26557651².

The event counter offsets in an aggregator's MMIO space are arranged in groups for each RMID.

E.g., the "energy" counters for GUID 0x26696143 are arranged like this:

  MMIO offset:0x0000 Counter for RMID 0 PMT_EVENT_ENERGY
  MMIO offset:0x0008 Counter for RMID 0 PMT_EVENT_ACTIVITY
  MMIO offset:0x0010 Counter for RMID 1 PMT_EVENT_ENERGY
  MMIO offset:0x0018 Counter for RMID 1 PMT_EVENT_ACTIVITY
  ...
  MMIO offset:0x23F0 Counter for RMID 575 PMT_EVENT_ENERGY
  MMIO offset:0x23F8 Counter for RMID 575 PMT_EVENT_ACTIVITY

After all counters there are three status registers that provide indications of how many times an aggregator was unable to process event counts, the time stamp for the most recent loss of data, and the time stamp of the most recent successful update.

  MMIO offset:0x2400 AGG_DATA_LOSS_COUNT
  MMIO offset:0x2408 AGG_DATA_LOSS_TIMESTAMP
  MMIO offset:0x2410 LAST_UPDATE_TIMESTAMP

Define event_group structures for both of these aggregator types and define the events tracked by the aggregators in the file system code.

PMT_EVENT_ENERGY and PMT_EVENT_ACTIVITY are produced in fixed point format. File system code must output as floating point values.

¹https://github.com/intel/Intel-PMT/blob/main/xml/CWF/OOBMSM/RMID-ENERGY/cwf_aggregator.xml
²https://github.com/intel/Intel-PMT/blob/main/xml/CWF/OOBMSM/RMID-PERF/cwf_aggregator.xml

[ bp: Massage commit message. ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(cherry picked from commit 8f6b6ad)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
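The per-RMID grouping in the layout above implies a simple offset calculation. The following is a minimal userspace sketch, not the kernel code: the helper names `counter_offset()` and `status_offset()` are hypothetical, and it assumes every counter is a 64-bit register and that events are indexed in the order they appear for each RMID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Byte offset of one event counter inside an aggregator's MMIO region.
 * Hypothetical helper: counters are assumed to be 64 bits wide and laid
 * out as [RMID 0: event 0, event 1, ...][RMID 1: event 0, ...] and so on.
 */
static size_t counter_offset(unsigned int rmid, unsigned int event_idx,
			     unsigned int num_events)
{
	assert(event_idx < num_events);
	return ((size_t)rmid * num_events + event_idx) * sizeof(uint64_t);
}

/*
 * Byte offset of the first status register, which follows the counters
 * of the last RMID (hypothetical helper, same assumptions as above).
 */
static size_t status_offset(unsigned int num_rmids, unsigned int num_events)
{
	return (size_t)num_rmids * num_events * sizeof(uint64_t);
}
```

With two events per RMID and 576 RMIDs (0 through 575), `counter_offset(575, 1, 2)` gives 0x23F8 and `status_offset(576, 2)` gives 0x2400, matching the offsets listed in the commit message.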

The resctrl file system layer passes the domain, RMID, and event id to the architecture to fetch an event counter.

Fetching a telemetry event counter requires additional information that is private to the architecture, for example, the offset into MMIO space from where the counter should be read.

Add mon_evt::arch_priv that architecture can use for any private data related to the event. The resctrl filesystem initializes mon_evt::arch_priv when the architecture enables the event and passes it back to architecture when needing to fetch an event counter.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
(backported from commit 8ccb1f8)
[fenghuay: fix minor conflicts in __check_limbo()]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

Every event group has a private copy of the data of all telemetry event aggregators (aka "telemetry regions") tracking its feature type. Included may be regions that have the same feature type but tracking different GUID from the event group's.

Traverse the event group's telemetry region data and mark all regions that are not usable by the event group as unusable by clearing those regions' MMIO addresses. A region is considered unusable if:
1) GUID does not match the GUID of the event group.
2) Package ID is invalid.
3) The enumerated size of the MMIO region does not match the expected value from the XML description file.

Hereafter any telemetry region with an MMIO address is considered valid for the event group it is associated with.

Enable all the event group's events as long as there is at least one usable region from where data for its events can be read. Enabling of an event can fail if the same event has already been enabled as part of another event group. It should never happen that the same event is described by different GUID supported by the same system so just WARN (via resctrl_enable_mon_event()) and skip the event.

Note that it is architecturally possible that some telemetry events are only supported by a subset of the packages in the system. It is not expected that systems will ever do this. If they do the user will see event files in resctrl that always return "Unavailable".

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(cherry picked from commit 7e6df96)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
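The three unusability rules above can be sketched as a filter over an event group's private region list. This is an illustration only: `struct region_sketch`, `struct group_sketch`, and both functions are hypothetical stand-ins for the kernel structures, and "invalid package ID" is assumed here to mean "outside the range of known packages".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one telemetry region (aggregator). */
struct region_sketch {
	uint32_t guid;
	int package_id;
	size_t mmio_size;
	void *mmio_addr;	/* NULL marks the region as unusable */
};

/* Hypothetical stand-in for the parts of an event group used here. */
struct group_sketch {
	uint32_t guid;
	size_t expected_mmio_size;	/* from the XML description */
};

static bool region_usable(const struct group_sketch *g,
			  const struct region_sketch *r, int num_packages)
{
	if (r->guid != g->guid)
		return false;	/* rule 1: GUID mismatch */
	if (r->package_id < 0 || r->package_id >= num_packages)
		return false;	/* rule 2: assumed meaning of "invalid" */
	if (r->mmio_size != g->expected_mmio_size)
		return false;	/* rule 3: unexpected region size */
	return true;
}

/*
 * Clear the MMIO address of every unusable region and return how many
 * usable regions remain. The caller would enable the group's events
 * only if the result is non-zero.
 */
static size_t mark_unusable_regions(const struct group_sketch *g,
				    struct region_sketch *regions, size_t n,
				    int num_packages)
{
	size_t usable = 0;

	assert(g != NULL && (regions != NULL || n == 0));
	for (size_t i = 0; i < n; i++) {
		if (regions[i].mmio_addr &&
		    region_usable(g, &regions[i], num_packages))
			usable++;
		else
			regions[i].mmio_addr = NULL;
	}
	return usable;
}
```

Clearing the address in place means later code needs only one check (a non-NULL address) to decide whether a region belongs to the group.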

Introduce intel_aet_read_event() to read telemetry events for resource RDT_RESOURCE_PERF_PKG. There may be multiple aggregators tracking each package, so scan all of them and add up all counters. Aggregators may return an invalid data indication if they have received no records for a given RMID. The user will see "Unavailable" if none of the aggregators on a package provide valid counts.

Resctrl now uses readq() so depends on X86_64. Update Kconfig.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(cherry picked from commit 51541f6)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
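The "scan all aggregators and add up the counters" behaviour described above could look roughly like the sketch below. It is not intel_aet_read_event(): the function name is hypothetical, the raw values stand in for what readq() would return, and the encoding of the "invalid data indication" as a clear top bit is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: bit 63 set means the counter holds valid data. */
#define SKETCH_DATA_VALID	(1ULL << 63)

/*
 * Sum one event's counter across all aggregators of a package.
 * raw[i] is the 64-bit value read from aggregator i for the RMID in
 * question. Returns false when no aggregator had valid data, which is
 * the case the user would see reported as "Unavailable".
 */
static bool sum_aggregator_counts(const uint64_t *raw, size_t n,
				  uint64_t *total)
{
	bool any_valid = false;
	uint64_t sum = 0;

	assert(total != NULL && (raw != NULL || n == 0));
	for (size_t i = 0; i < n; i++) {
		if (!(raw[i] & SKETCH_DATA_VALID))
			continue;	/* no records received for this RMID */
		sum += raw[i] & ~SKETCH_DATA_VALID;
		any_valid = true;
	}
	if (any_valid)
		*total = sum;
	return any_valid;
}
```

Note that a single valid aggregator is enough to produce a result; invalid ones are skipped rather than treated as errors.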

Population of a monitor group's mon_data directory is unreasonably complicated because of the support for Sub-NUMA Cluster (SNC) mode.

Split out the SNC code into a helper function to make it easier to add support for a new telemetry resource.

Move all the duplicated code to make and set owner of domain directories into the mon_add_all_files() helper and rename to _mkdir_mondata_subdir().

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(cherry picked from commit 0ec1db4)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

Clearing a monitor group's mon_data directory is complicated because of the support for Sub-NUMA Cluster (SNC) mode.

Refactor the SNC case into a helper function to make it easier to add support for a new telemetry resource.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(cherry picked from commit 93d9fd8)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

…_PKG

The L3 resource has several requirements for domains. There are per-domain structures that hold the 64-bit values of counters, and elements to keep track of the overflow and limbo threads.

None of these are needed for the PERF_PKG resource. The hardware counters are wide enough that they do not wrap around for decades.

Define a new rdt_perf_pkg_mon_domain structure which just consists of the standard rdt_domain_hdr to keep track of domain id and CPU mask.

Update resctrl_online_mon_domain() for RDT_RESOURCE_PERF_PKG. The only action needed for this resource is to create and populate domain directories if a domain is added while resctrl is mounted.

Similarly resctrl_offline_mon_domain() only needs to remove domain directories.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(cherry picked from commit f4e0cd8)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

Legacy resctrl features are enumerated by X86_FEATURE_* flags. These may be overridden by quirks to disable features in the case of errata. Users can use kernel command line options to either disable a feature, or to force enable a feature that was disabled by a quirk.

A different approach is needed for hardware features that do not have an X86_FEATURE_* flag.

Update parsing of the "rdt=" boot parameter to call the telemetry driver directly to handle new "perf" and "energy" options that controls activation of telemetry monitoring of the named type. By itself a "perf" or "energy" option controls the forced enabling or disabling (with ! prefix) of all event groups of the named type. A ":guid" suffix allows for fine grained control per event group.

[ bp: s/intel_aet_option/intel_handle_aet_option/g ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
(backported from commit 842e7f9)
[fenghuay: fix a minor conflict in kernel-parameters.txt doc]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
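To make the option syntax concrete, here is a small userspace sketch of parsing a single "rdt=" token of the form `[!]energy[:guid]` or `[!]perf[:guid]`. It is not the kernel's parser: `struct aet_option_sketch` and `parse_aet_option()` are hypothetical names, and treating the GUID as a hexadecimal number is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing one "rdt=" token. */
struct aet_option_sketch {
	bool is_energy;		/* false means "perf" */
	bool force_off;		/* token had a '!' prefix */
	bool has_guid;		/* token had a ":guid" suffix */
	uint32_t guid;
};

/* Returns false for tokens that are not telemetry options. */
static bool parse_aet_option(const char *tok, struct aet_option_sketch *out)
{
	const char *colon;
	size_t len;

	assert(tok != NULL && out != NULL);
	memset(out, 0, sizeof(*out));

	if (*tok == '!') {
		out->force_off = true;
		tok++;
	}

	colon = strchr(tok, ':');
	len = colon ? (size_t)(colon - tok) : strlen(tok);

	if (len == 6 && !strncmp(tok, "energy", 6))
		out->is_energy = true;
	else if (len == 4 && !strncmp(tok, "perf", 4))
		out->is_energy = false;
	else
		return false;

	if (colon) {
		char *end;
		/* Assumption: the GUID is written in hexadecimal. */
		unsigned long v = strtoul(colon + 1, &end, 16);

		if (end == colon + 1 || *end != '\0' || v > UINT32_MAX)
			return false;
		out->guid = (uint32_t)v;
		out->has_guid = true;
	}
	return true;
}
```

Under these assumptions, `!perf` would disable every "perf" event group, while `energy:26696143` would target only the "energy" group with that GUID.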

There are now three meanings for "number of RMIDs":

1) The number for legacy features enumerated by CPUID leaf 0xF. This is the maximum number of distinct values that can be loaded into MSR_IA32_PQR_ASSOC. Note that systems with Sub-NUMA Cluster mode enabled will force scaling down the CPUID enumerated value by the number of SNC nodes per L3-cache.

2) The number of registers in MMIO space for each event. This is enumerated in the XML files and is the value initialized into event_group::num_rmid.

3) The number of "hardware counters" (this isn't a strictly accurate description of how things work, but serves as a useful analogy that does describe the limitations) feeding to those MMIO registers. This is enumerated in telemetry_region::num_rmids returned by intel_pmt_get_regions_by_feature().

Event groups with insufficient "hardware counters" to track all RMIDs are difficult for users to use, since the system may reassign "hardware counters" at any time. This means that users cannot reliably collect two consecutive event counts to compute the rate at which events are occurring.

Disable such event groups by default. The user may override this with a command line "rdt=" option. In this case limit an under-resourced event group's number of possible monitor resource groups to the lowest number of "hardware counters".

Scan all enabled event groups and assign the RDT_RESOURCE_PERF_PKG resource "num_rmid" value to the smallest of these values as this value will be used later to compare against the number of RMIDs supported by other resources to determine how many monitoring resource groups are supported.

N.B. Change type of resctrl_mon::num_rmid to u32 to match its usage and the type of event_group::num_rmid so that min(r->num_rmid, e->num_rmid) won't complain about mixing signed and unsigned types.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(cherry picked from commit 67640e3)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
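The policy in this commit message can be sketched in plain C as follows. This is an illustration, not the upstream code: struct event_group_sketch, its fields, and pkg_num_rmid() are hypothetical, and hw_counters is assumed to already hold the lowest telemetry_region::num_rmids value seen for the group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an event group; not the kernel structure. */
struct event_group_sketch {
	unsigned int num_rmid;		/* MMIO registers per event (meaning 2) */
	unsigned int hw_counters;	/* lowest "hardware counter" count (meaning 3) */
	bool force_on;			/* user forced the group on via "rdt=" */
	bool enabled;			/* result of the policy below */
};

/*
 * Disable under-resourced groups unless forced on, clamp forced groups to
 * their "hardware counter" count, and return the smallest num_rmid across
 * enabled groups (0 if no group is enabled).
 */
static unsigned int pkg_num_rmid(struct event_group_sketch *groups, size_t n)
{
	unsigned int min_rmid = 0;
	bool found = false;
	size_t i;

	for (i = 0; i < n; i++) {
		struct event_group_sketch *g = &groups[i];

		g->enabled = true;
		if (g->hw_counters < g->num_rmid) {
			if (!g->force_on) {
				g->enabled = false;
				continue;
			}
			g->num_rmid = g->hw_counters;
		}

		if (!found || g->num_rmid < min_rmid)
			min_rmid = g->num_rmid;
		found = true;
	}

	return min_rmid;
}
```

Using unsigned types on both sides mirrors the N.B. above: comparing two values of the same unsigned type avoids the signed/unsigned mismatch that the kernel's min() macro rejects.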

closid_num_dirty_rmid[] and rmid_ptrs[] are allocated together during resctrl initialization and freed together during resctrl exit. Telemetry events are enumerated on resctrl mount so only at resctrl mount will the number of RMID supported by all monitoring resources and needed as size for rmid_ptrs[] be known. Separate closid_num_dirty_rmid[] and rmid_ptrs[] allocation and free in preparation for rmid_ptrs[] to be allocated on resctrl mount. Keep the rdtgroup_mutex protection around the allocation and free of closid_num_dirty_rmid[] as ARM needs this to guarantee memory ordering. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com (cherry picked from commit ee7f6af) Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

resctrl assumes that only the L3 resource supports monitor events, so it simply takes the rdt_resource::num_rmid from RDT_RESOURCE_L3 as the system's number of RMIDs. The addition of telemetry events in a different resource breaks that assumption. Compute the number of available RMIDs as the minimum value across all mon_capable resources (analogous to how the number of CLOSIDs is computed across alloc_capable resources). Note that mount time enumeration of the telemetry resource means that this number can be reduced. If this happens, then some memory will be wasted as the allocations for rdt_l3_mon_domain::mbm_states[] and rdt_l3_mon_domain::rmid_busy_llc created during resctrl initialization will be larger than needed. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com (cherry picked from commit 0ecc988) Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

L3 monitor features are enumerated during resctrl initialization and rmid_ptrs[] that tracks all RMIDs and depends on the number of supported RMIDs is allocated during this time.

Telemetry monitor features are enumerated during first resctrl mount and may support a different number of RMIDs compared to L3 monitor features.

Delay allocation and initialization of rmid_ptrs[] until first mount. Since the number of RMIDs cannot change on later mounts, keep the same set of rmid_ptrs[] until resctrl_exit(). This is required because the limbo handler keeps running after resctrl is unmounted and needs to access rmid_ptrs[] as it keeps tracking busy RMIDs after unmount.

Rename routines to match what they now do:
  dom_data_init() -> setup_rmid_lru_list()
  dom_data_exit() -> free_rmid_lru_list()

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com
(backported from commit d089164)
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
[fenghuay: fix minor conflicts in setup_rmid_lru_list() and dom_data_exit()]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

Since telemetry events are enumerated on resctrl mount the RDT_RESOURCE_PERF_PKG resource is not considered "monitoring capable" during early resctrl initialization. This means that the domain list for RDT_RESOURCE_PERF_PKG is not built when the CPU hotplug notifiers are registered and run for the first time right after resctrl initialization. Mark the RDT_RESOURCE_PERF_PKG as "monitoring capable" upon successful telemetry event enumeration to ensure future CPU hotplug events include this resource and initialize its domain list for CPUs that are already online. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com (cherry picked from commit 4bbfc90) Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

Update resctrl filesystem documentation with the details about the resctrl files that support telemetry events. [ bp: Drop the debugfs hunk of the documentation until a better debugging solution is found. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/20251217172121.12030-1-tony.luck@intel.com (cherry picked from commit a8848c4) Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

…rl L3 domain and arch API updates

Upstream resctrl renamed the L3 monitor domain type and extended the arch hooks. Update the MPAM driver to match:

1. Use struct rdt_l3_mon_domain in MPAM's resctrl integration.
2. Pass struct rdt_domain_hdr * into resctrl_online_mon_domain() / resctrl_offline_mon_domain().
3. Match the new resctrl_arch_rmid_read() prototype (header pointer + arch_priv).
4. Update resctrl_arch_cntr_read(), resctrl_arch_reset_rmid(), resctrl_arch_reset_cntr(), and resctrl_arch_config_cntr() to take struct rdt_l3_mon_domain *.
5. Call the new resctrl_enable_mon_event() signature when wiring monitor events and set mon_capable from its return value.
6. Add a no-op resctrl_arch_pre_mount() so MPAM builds with the generic resctrl mount path.

Fixes: a42549e ("NVIDIA: SAUCE: arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation")
Fixes: ae2a29c ("NVIDIA: SAUCE: arm_mpam: resctrl: Add support for csu counters")
Fixes: 1cbc0f2 ("NVIDIA: SAUCE: arm_mpam: resctrl: Add resctrl_arch_config_cntr() for ABMC use")
Fixes: dd44394 ("NVIDIA: SAUCE: arm_mpam: resctrl: Add resctrl_arch_rmid_read() and resctrl_arch_reset_rmid()")
Fixes: 8429670 ("NVIDIA: SAUCE: arm_mpam: resctrl: Add resctrl_arch_cntr_read() & resctrl_arch_reset_cntr()")
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

…rors

There is no need to destroy the MSC instance on user/admin programming errors, since they do not cause any functional issues.

Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
(cherry picked from 316e5833ccb2ef66f50290e48c45b70bf286c8fd dev/dev-main-nvidia-pset-linux-6.19.6)
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

In a NUMA system, each node may include CPUs, memory, MPAM MSC instances, or any combination thereof. Some high-end servers may have NUMA nodes that include MPAM MSC but no CPUs. In such cases, associate all possible CPUs for those MSCs. Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> (cherry picked from f902b5abf39fe10a50b7062dc9ae9d2cfc723248 dev/dev-main-nvidia-pset-linux-6.19.6) Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
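The fallback described above can be pictured with a small sketch. It models a CPU mask as one bit per CPU in a 64-bit word; cpumask_sketch_t and msc_affinity() are hypothetical names and this is not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per CPU; a stand-in for the kernel's struct cpumask. */
typedef uint64_t cpumask_sketch_t;

/*
 * Pick the affinity for an MSC that lives on a given NUMA node.
 * A CPU-less node contributes no CPUs, so fall back to every possible CPU.
 */
static cpumask_sketch_t msc_affinity(cpumask_sketch_t node_cpus,
				     cpumask_sketch_t possible)
{
	return node_cpus ? (node_cpus & possible) : possible;
}
```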

…ring domain setup

The current MPAM driver only considers the first component associated with an online/offline CPU during domain creation and teardown. This is insufficient, as CPU-initiated traffic may traverse multiple MSCs before reaching the target, and each MSC must be programmed consistently for proper resource partitioning.

Update the MPAM driver to include all components associated with a given CPU during domain setup/teardown to expose expected schemata to userspace for effective resource control.

Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
(backported from 4309ce9856f87170670c9db40546d9f2fc9dbb86 dev/dev-main-nvidia-pset-linux-6.19.6)
[fenghuay: In addition to the core change, this backport includes the following adaptations to bridge the gap between the 24.04 (6.17) MPAM driver and the 6.19.6 base the original was written against:
- Add for_each_mpam_resctrl_control() and for_each_mpam_resctrl_mon() iteration macros (from pset c15c066 and 4f42221)
- Add MPAM_MAX_EVENT constant to bound the monitor event array
- Add traffic_matches_l3() to validate that a memory-class MSC's traffic matches L3 egress topology (from pset ebc0760)
  Remove redundant if (class->type != MPAM_CLASS_MEMORY)
- Replace exposed_alloc_capable/exposed_mon_capable static bools with dynamic resctrl_arch_alloc_capable()/resctrl_arch_mon_capable() that iterate over resources
- Change mpam_resctrl_offline_cpu() return type from int to void
- Change mpam_resctrl_monitor_init() return type from void to int and propagate errors
- Change num_rmid from mpam_pmg_max + 1 to resctrl_arch_system_num_rmid_idx()
- Use guard(mutex) for domain_list_lock
- Use INIT_LIST_HEAD_RCU for domain lists
- Fix the MBA-not-found issue on GMEM by checking traffic_matches_l3() in mpam_resctrl_pick_mba() only on classes that don't have a NUMA node]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

…onfig

Reset an RIS by building a default mpam_config and applying it via mpam_reprogram_ris_partid(), like any other config.

- mpam_init_reset_cfg(): set features and default values only for controls supported by the RIS (cpor_part, mbw_part, mbw_max, mbw_prop, cmax_cmax, cmax_cmin). Use full masks for CPBM/MBW_PBM and MPAMCFG_* defaults for MBW_MAX, CMAX, CMIN.
- mpam_reprogram_ris_partid(): apply cfg for all supported controls (no separate reset path).

Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
(backported from c076b208842db87ed50b1c63cff302975a9c8f67 dev/dev-main-nvidia-pset-linux-6.19.6)
[fenghuay: Fix porting conflicts and compilation errors. Remove this sentence from the commit message to avoid confusion, because the MBW_PROP feature is not supported on Vera/Grace: "Include mpam_feat_mbw_prop when supported so MBW_PROP is written to 0 on reset."]
Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>

There is no struct arm_smmu_domain context for domains configured with identity mappings. Use the device to obtain the necessary information to program PARTID and PMGID. Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> (backported from e5020b38475ef58c5bb3d1a92028d4e0dd7aff4d dev/dev-main-nvidia-pset-linux-6.19.6) [fenghuay: Koba Ko fixes a typo in iommu_group_get_qos_params(): s/!ops->set_group_qos_params/!ops->get_group_qos_params/] Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
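The typo noted in the bracketed comment is an instance of guarding the wrong callback. A minimal sketch of the corrected pattern follows; struct qos_ops_sketch and the helper functions are hypothetical, the callback signatures are simplified, and the -EOPNOTSUPP return value is an assumption rather than what the kernel function returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ops table for illustration only. */
struct qos_ops_sketch {
	int (*get_group_qos_params)(uint16_t *partid, uint8_t *pmg);
	int (*set_group_qos_params)(uint16_t partid, uint8_t pmg);
};

/* Example callbacks used to exercise the getter below. */
static int sketch_set(uint16_t partid, uint8_t pmg)
{
	(void)partid;
	(void)pmg;
	return 0;
}

static int sketch_get(uint16_t *partid, uint8_t *pmg)
{
	*partid = 5;
	*pmg = 1;
	return 0;
}

static int group_get_qos_params(const struct qos_ops_sketch *ops,
				uint16_t *partid, uint8_t *pmg)
{
	/* Guard the callback that is about to be called, not its sibling. */
	if (!ops || !ops->get_group_qos_params)
		return -EOPNOTSUPP;

	return ops->get_group_qos_params(partid, pmg);
}
```

With the guard on set_group_qos_params instead, a driver that implements only the setter would pass the check and then call through a NULL get pointer.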

…n mpam_msmon_read Resolve mpam_feat_msmon_mbwu to the concrete counter type (31/44/63) before mpam_has_feature() and before filling the mon_read arg. This avoids -EOPNOTSUPP when only a specific MBWU feature is set, and ensures _msmon_read() gets the resolved type in arg.type. Fixes: 5b91005 ("NVIDIA: SAUCE: arm_mpam: Use long MBWU counters if supported") Signed-off-by: Fenghua Yu <fenghuay@nvidia.com>
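A rough sketch of the resolution step described above: map a generic MBWU request to one concrete counter width before checking support. The enum values and resolve_mbwu() are hypothetical, and preferring the widest supported counter (63, then 44, then 31) is an assumption, not something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical feature bits; not the driver's mpam_feat_* values. */
enum mon_feat_sketch {
	FEAT_MBWU    = 1u << 0,	/* generic "some MBWU counter" request */
	FEAT_MBWU_31 = 1u << 1,
	FEAT_MBWU_44 = 1u << 2,
	FEAT_MBWU_63 = 1u << 3,
};

/*
 * Resolve the requested type against the supported feature mask.
 * Returns 0 and stores the concrete type, or -EOPNOTSUPP.
 */
static int resolve_mbwu(unsigned int supported, unsigned int requested,
			unsigned int *resolved)
{
	/* Assumption: prefer the widest counter that is available. */
	static const unsigned int order[] = {
		FEAT_MBWU_63, FEAT_MBWU_44, FEAT_MBWU_31,
	};
	size_t i;

	if (requested != FEAT_MBWU) {
		/* Already concrete: only check that it is supported. */
		if (!(supported & requested))
			return -EOPNOTSUPP;
		*resolved = requested;
		return 0;
	}

	for (i = 0; i < sizeof(order) / sizeof(order[0]); i++) {
		if (supported & order[i]) {
			*resolved = order[i];
			return 0;
		}
	}

	return -EOPNOTSUPP;
}
```

Checking support only after this step is what avoids a spurious -EOPNOTSUPP when the hardware advertises just one specific width and not the generic bit.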

fyu1 · 2026-03-25T22:24:12Z

I fixed a blocking GMEM test failure in the patch "NVIDIA: VR: SAUCE: arm_mpam: Include all associated MSC components during domain setup" and updated its commit message. Here is the fix patch:

```diff
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index f7c2bf8aba99..0accede8cc09 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -1162,7 +1162,9 @@ static void mpam_resctrl_pick_mba(void)
 		continue;
 	}
 
-	if (!traffic_matches_l3(class)) {
+	/* Check memory at egress from L3 for MSC with L3 */
+	if (!cpumask_equal(&class->affinity, cpu_possible_mask) &&
+	    !traffic_matches_l3(class)) {
 		pr_debug("class %u traffic doesn't match L3 egress\n",
 			 class->level);
 		continue;
```

With this fix, I no longer see the MBA/MBM issue in the GMEM test with an engineer-built SBIOS that enables GPU MPAM.

If this PR is good for you, please merge it to 6.17 BaseOS.

Thank you very much for your help!

nvmochs · 2026-03-25T22:51:27Z

Re-reviewed and confirmed this was the only change. No issues with the change.

Acked-by: Matthew R. Ochs <mochs@nvidia.com>

jamieNguyenNVIDIA · 2026-03-25T23:11:40Z

Acked-by: Jamie Nguyen <jamien@nvidia.com>

clsotog · 2026-03-24T14:52:28Z

drivers/resctrl/mpam_resctrl.c

+		return false;
+	}
+
+	cpu = cpumask_any_and(&class->affinity, cpu_online_mask);


Should we add a check for cpu >= nr_cpu_ids, as in topology_matches_l3()?

Adding another sanity check wouldn't hurt, but it isn't required: the statements that follow already catch an invalid cpu:
err = find_l3_equivalent_bitmask(cpu, tmp_cpumask);
if (err) {

Great, thanks for looking.
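For reference, the explicit check discussed in this thread would look roughly like the sketch below. It is a userspace model, not driver code: CPU masks are a single 64-bit word, NR_CPU_IDS, mask_any_and(), and pick_online_cpu() are hypothetical, and the only kernel behavior relied on is that cpumask_any_and() returns a value >= nr_cpu_ids when the intersection is empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPU_IDS 64u	/* stand-in for the kernel's nr_cpu_ids */

/* Return any CPU set in both masks, or NR_CPU_IDS if there is none. */
static unsigned int mask_any_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++) {
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	}

	return NR_CPU_IDS;
}

/* Pick an online CPU from the affinity mask; false if none is online. */
static bool pick_online_cpu(uint64_t affinity, uint64_t online,
			    unsigned int *cpu)
{
	*cpu = mask_any_and(affinity, online);

	/* The explicit sanity check raised in the review. */
	return *cpu < NR_CPU_IDS;
}
```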

nvmochs · 2026-03-25T23:21:40Z

PR sent to Canonical.

fyu1 requested review from KobaKoNvidia, jamieNguyenNVIDIA, nirmoy and nvmochs March 20, 2026 18:07

fyu1 assigned jamieNguyenNVIDIA, nvmochs and nirmoy Mar 20, 2026

fyu1 mentioned this pull request Mar 20, 2026

MPAM: Pull Request: CPU-less feature, numa id as domain id, performance fix #328

Closed

fyu1 requested a review from clsotog March 20, 2026 18:16

KobaKoNvidia reviewed Mar 23, 2026

drivers/resctrl/mpam_devices.c

drivers/resctrl/mpam_internal.h (outdated)

drivers/resctrl/mpam_devices.c

jamieNguyenNVIDIA requested changes Mar 23, 2026

drivers/resctrl/mpam_devices.c

drivers/resctrl/mpam_resctrl.c

drivers/resctrl/mpam_devices.c

fyu1 force-pushed the 24.04_linux-nvidia-6.17-next.mpam.extras.fixes2 branch 2 times, most recently from 96cc0e5 to 31c434f Compare March 24, 2026 02:03

fyu1 force-pushed the 24.04_linux-nvidia-6.17-next.mpam.extras.fixes2 branch from 31c434f to 7718a84 Compare March 24, 2026 02:53

fyu1 force-pushed the 24.04_linux-nvidia-6.17-next.mpam.extras.fixes2 branch from 7718a84 to 0be0368 Compare March 24, 2026 06:56

fyu1 force-pushed the 24.04_linux-nvidia-6.17-next.mpam.extras.fixes2 branch from 0be0368 to 9dbadcd Compare March 24, 2026 15:06

fyu1 force-pushed the 24.04_linux-nvidia-6.17-next.mpam.extras.fixes2 branch from 9dbadcd to 1091c7f Compare March 24, 2026 16:45

aegl and others added 23 commits March 25, 2026 22:18

fyu1 force-pushed the 24.04_linux-nvidia-6.17-next.mpam.extras.fixes2 branch from cc8ab11 to ecd11fd Compare March 25, 2026 22:19

nvmochs changed the title ~~Please merge MPAM fixes branch: 24.04 linux nvidia 6.17 next.mpam.extras.fixes2~~ [linux-nvidia-6.17] Backport MPAM fixes and support for CPU-less NUMA nodes Mar 25, 2026

clsotog reviewed Mar 25, 2026


8 participants
