Add Kata containers docs for 26.3.0 release #365
a-mccarthy wants to merge 3 commits into NVIDIA:main from
Conversation
Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
Documentation preview
> * Transparent deployment of unmodified containers.
>
> ****************************
> Limitations and Restrictions
@manuelh-dev are these limitations still correct?
Left comments on several of these.
> #. Specify at least the following options when you install the Operator.
>    If you want to run Kata Containers by default on all worker nodes, also specify ``--set sandboxWorkloads.defaultWorkload=vm-passthrough``.
>
> .. code-block:: console
The upstream doc calls out enabling NFD in the install command (and also disabling it in the kata-deploy install). Is that needed? Can you elaborate on why users should include those?
@jojimt - can you help here? see https://github.com/kata-containers/kata-containers/pull/12651/changes on what we currently suggest in the Kata docs
> *********************
> Run a Sample Workload
Are there any updates needed for the sample app?
Left relevant comments further inline.
Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
> * The ``kata-qemu-nvidia-gpu`` runtime class is used with Kata Containers.
>
> * The ``kata-qemu-nvidia-gpu-snp`` runtime class is used with Confidential Containers and is installed by default even though it is not used with this configuration.
@manuelh-dev When you install kata-deploy, there are more NVIDIA runtimes listed than just these 2, and your doc calls out kata-qemu-nvidia-gpu-tdx. Should we include anything about this runtime? What does it do?
I think, for the Kata doc here, let's just emphasize that it deploys the kata-qemu-nvidia-gpu runtime class and not talk much about the other runtime classes?
We should add that if the nvidia-cc-manager pod is running after you deploy, you may need to change the mode to sandbox workloads only (in this case you have deployed the solution on CC-capable hardware). We can then reference the sibling CoCo doc, which already explains how to mode-switch!?
Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
> NVIDIA's approach to the Confidential Containers architecture delivers on the key promise of Confidential Computing: confidentiality, integrity, and verifiability.
> Integrating open source and NVIDIA software components with the Confidential Computing capabilities of NVIDIA GPUs, the Reference Architecture for Confidential Containers is designed to be the secure and trusted deployment model for AI workloads.
>
> .. image:: graphics/CoCo-Reference-Architecture.png
Minor concern: Do we ever really reference the two illustrations? At least in "Software Components for Confidential Containers" we could refer back to the illustration.
The illustrations are somewhat 'dangling' and not well-explained. Maybe this can be fixed by moving them to where we actually reference these (maybe I was not reading well enough)?
> http://www.apache.org/licenses/LICENSE-2.0
>
> Unless required by applicable law or agreed to in writing, software
It may be good to rebase against 3709f11 and resolve potential conflicts.
> Following is the platform and feature support scope for Early Access (EA) of Confidential Containers open Reference Architecture published by NVIDIA.
>
> .. flat-table:: Supported Platforms
Is this not duplication of above?
> | - NVIDIA Confidential Computing Manager for Kubernetes
> | - NVIDIA Kata Manager for Kubernetes
> - v25.10.0 and higher
> * - CoCo release (EA)
Is this intentional? I think for our latest stack we need a Kata 3.28 release.
I don't know what 'v0.18.0' is here, and I am not sure if we have the exact trustee/guest components in these versions. We are not using a concrete CoCo release; we are using a Kata release, and this Kata release pulls in CoCo components as dependencies.
> Limitations and Restrictions for CoCo EA
> ----------------------------------------
>
> * Only the AMD platform using SEV-SNP is supported for Confidential Containers Early Access.
No longer the case with latest bits
> ----------------------------------------
>
> * Only the AMD platform using SEV-SNP is supported for Confidential Containers Early Access.
> * GPUs are available to containers as a single GPU in passthrough mode only. Multi-GPU passthrough and vGPU are not supported.
See https://github.com/manuelh-dev/kata-containers/blob/mahuber/doc-update-nvidia-gpu-op/docs/use-cases/NVIDIA-GPU-passthrough-and-Kata-QEMU.md:
The currently supported modes for enabling GPU workloads in the TEE scenario are: (1) single‑GPU passthrough (one physical GPU per pod) and (2) multi‑GPU passthrough on NVSwitch (NVLink) based HGX systems (for example, HGX Hopper (SXM) and HGX Blackwell / HGX B200).
> * Only the AMD platform using SEV-SNP is supported for Confidential Containers Early Access.
> * GPUs are available to containers as a single GPU in passthrough mode only. Multi-GPU passthrough and vGPU are not supported.
> * Support is limited to initial installation and configuration only. Upgrade and configuration of existing clusters to configure confidential computing is not supported.
No longer the case. We have a new component, the kata lifecycle manager: https://github.com/kata-containers/lifecycle-manager. This component is currently at tag v0.1.2 but will still need to be incremented.
> * Only the AMD platform using SEV-SNP is supported for Confidential Containers Early Access.
> * GPUs are available to containers as a single GPU in passthrough mode only. Multi-GPU passthrough and vGPU are not supported.
> * Support is limited to initial installation and configuration only. Upgrade and configuration of existing clusters to configure confidential computing is not supported.
> * Support for confidential computing environments is limited to the implementation described on this page.
I don't really know what this means. Does this mean we only support Intel and AMD?
> * Support is limited to initial installation and configuration only. Upgrade and configuration of existing clusters to configure confidential computing is not supported.
> * Support for confidential computing environments is limited to the implementation described on this page.
> * NVIDIA supports the GPU Operator and confidential computing with the containerd runtime only.
> * NFD doesn't label all Confidential Container capable nodes as such automatically. In some cases, users must manually label nodes to deploy the NVIDIA Confidential Computing Manager for Kubernetes operand onto these nodes as described in the deployment guide.
This is no longer the case
> * Run ``sudo update-grub`` after making the change to configure the bootloader. Reboot the host after configuring the bootloader.
>
> * You have a Kubernetes cluster and you have cluster administrator privileges.
> * For this cluster, you are using containerd 2.1 and Kubernetes version v1.34. These versions have been validated with the kata-containers project and are recommended. You use a ``runtimeRequestTimeout`` of more than 5 minutes in your `kubelet configuration <https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/>`_ (the current method to pull container images within the confidential container may exceed the two minute default timeout in case of using large container images).
This is not rendered properly. Need to add an empty line
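Agreed on the rendering fix. While we are here, a hedged sketch of the kubelet configuration change this prerequisite asks for might help readers; `10m` is an example value I picked to be above the 5-minute guidance, not a validated recommendation:

```shell
# Sketch only: write a KubeletConfiguration fragment with a runtimeRequestTimeout
# above the 5-minute guidance quoted in the prerequisite. "10m" is an example
# value, not a tested recommendation.
cat > /tmp/kubelet-config-fragment.yaml <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
runtimeRequestTimeout: "10m"
EOF
# Merge this into your real kubelet config file and restart the kubelet.
grep runtimeRequestTimeout /tmp/kubelet-config-fragment.yaml
```

`runtimeRequestTimeout` is a standard KubeletConfiguration field; only the chosen duration is an assumption here.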
> * Run ``sudo update-grub`` after making the change to configure the bootloader. Reboot the host after configuring the bootloader.
>
> * You have a Kubernetes cluster and you have cluster administrator privileges.
> * For this cluster, you are using containerd 2.1 and Kubernetes version v1.34. These versions have been validated with the kata-containers project and are recommended. You use a ``runtimeRequestTimeout`` of more than 5 minutes in your `kubelet configuration <https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/>`_ (the current method to pull container images within the confidential container may exceed the two minute default timeout in case of using large container images).
Let's check the containerd and Kubernetes versions against what we author in overview.rst. There may be discrepancies.
> This step ensures that you can continue to run traditional container workloads with GPU or vGPU workloads on some nodes in your cluster. Alternatively, you can set a default sandbox workload parameter to vm-passthrough to run confidential containers on all worker nodes when you install the GPU Operator.
>
> 2. Install the latest Kata Containers helm chart (minimum version: 3.24.0).
> This step installs all required components from the Kata Containers project including the Kata Containers runtime binary, runtime configuration, UVM kernel and initrd that NVIDIA uses for confidential containers and native Kata containers.
>
> 3. Install the latest version of the NVIDIA GPU Operator (minimum version: v25.10.0).
> $ kubectl label node <node-name> nvidia.com/gpu.workload.config=vm-passthrough
>
> 2. Use the 3.24.0 Kata Containers version and chart in environment variables::
> --create-namespace \
> -f "https://raw.githubusercontent.com/kata-containers/kata-containers/refs/tags/${VERSION}/tools/packaging/kata-deploy/helm-chart/kata-deploy/try-kata-nvidia-gpu.values.yaml" \
> --set nfd.enabled=false \
> --set shims.qemu-nvidia-gpu-tdx.enabled=false \
See https://github.com/kata-containers/kata-containers/pull/12651/changes, this is updated. Let's use the command from there
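For reviewers who want to try the quoted step locally before the updated command lands, here is a hedged reconstruction of the full helm invocation the fragment above belongs to; the chart OCI reference, the namespace, and the 3.24.0 version are assumptions taken from other snippets in this review and may change with the linked PR:

```shell
# Hedged reconstruction of the install command the quoted fragment belongs to.
# CHART, namespace, and VERSION are assumptions taken from other snippets in
# this review thread, not confirmed values.
VERSION="3.24.0"
CHART="oci://ghcr.io/kata-containers/kata-deploy-charts/kata-deploy"
VALUES="https://raw.githubusercontent.com/kata-containers/kata-containers/refs/tags/${VERSION}/tools/packaging/kata-deploy/helm-chart/kata-deploy/try-kata-nvidia-gpu.values.yaml"

# Print the command rather than running it, since it needs a live cluster.
cat <<EOF
helm install kata-deploy "${CHART}" \\
  --version "${VERSION}" \\
  --namespace kube-system \\
  --create-namespace \\
  -f "${VALUES}" \\
  --set nfd.enabled=false \\
  --set shims.qemu-nvidia-gpu-tdx.enabled=false
EOF
```

Once we settle the NFD question above, the two `--set` flags are the part that would change.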
> *Example Output*::
>
> Pulled: ghcr.io/kata-containers/kata-deploy-charts/kata-deploy:3.24.0
Need to change that as well.
> 5. Verify that the kata-qemu-nvidia-gpu and kata-qemu-nvidia-gpu-snp runtime classes are available::
>
> $ kubectl get runtimeclass
-tdx should now also be present
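If we do list -tdx too, a quick hedged sketch of checking all three classes individually (runtime class names are the ones mentioned in this thread; the commands are only printed because they need a live cluster):

```shell
# Sketch: print per-class verification commands for the three NVIDIA runtime
# classes discussed in this thread. Printed, not executed, since a cluster is
# required.
for rc in kata-qemu-nvidia-gpu kata-qemu-nvidia-gpu-snp kata-qemu-nvidia-gpu-tdx; do
  echo "kubectl get runtimeclass ${rc}"
done
```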
> REVISION: 1
> TEST SUITE: None
>
> Note that, for heterogeneous clusters with different GPU types, you can omit
This note is being changed in: https://github.com/kata-containers/kata-containers/pull/12651/changes
> nvidia-sandbox-validator-6xnzc 1/1 Running 1 30s
> nvidia-vfio-manager-h229x 1/1 Running 0 62s
>
> 4. If the nvidia-cc-manager is *not* running, you need to label your CC-capable node(s) by hand. The node labelling capabilities in the early access version are not complete. To label your node(s), run::
This can be removed. Node capability detection works now
> Kernel driver in use: vfio-pci
> Kernel modules: nvidiafb, nouveau
>
> b. Confirm that the kata-deploy functionality installed the kata-qemu-nvidia-gpu-snp and kata-qemu-nvidia-gpu runtime class files::
-tdx as well. Maybe we should say "installed the relevant runtime classes".
Note the double "::" - not quite consistent with all other bullet points.
> $ ls -l /opt/kata/share/defaults/kata-containers/ | grep nvidia
>
> *Example Output*::
I suggest refreshing this. It looks odd in terms of file sizes and dates.
> c. Confirm that the kata-deploy functionality installed the UVM components::
>
> $ ls -l /opt/kata/share/kata-containers/ | grep nvidia
Let's refresh this as well
> 1. Create a file, such as the following cuda-vectoradd-kata.yaml sample, specifying the kata-qemu-nvidia-gpu-snp runtime class:
>
> .. code-block:: yaml
Please take a look at: https://github.com/kata-containers/kata-containers/pull/12651/changes - the flow has slightly changed with 'echo'
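Until the 'echo' flow from that PR is adopted, here is a hedged sketch of what a complete cuda-vectoradd-kata.yaml could look like, assembled from the fragments quoted in this review; the overall spec layout and the placement of `runtimeClassName` are my reconstruction, not the authored doc:

```shell
# Hedged reconstruction of the sample manifest from fragments quoted in this
# review thread; the field layout is my assumption, not the authored document.
cat > /tmp/cuda-vectoradd-kata.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd-kata
  annotations:
    io.katacontainers.config.hypervisor.default_memory: "16384"
spec:
  runtimeClassName: kata-qemu-nvidia-gpu-snp
  restartPolicy: OnFailure
  containers:
    - name: cuda-vectoradd
      image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
      resources:
        limits:
          "nvidia.com/pgpu": 1
EOF
# Requires a cluster: kubectl apply -f /tmp/cuda-vectoradd-kata.yaml
grep runtimeClassName /tmp/cuda-vectoradd-kata.yaml
```

The image tag is the one currently in the doc; per the comment below it should be updated in sync with the CoCo page.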
> Managing the Confidential Computing Mode
> ========================================
>
> You can set the default confidential computing mode of the NVIDIA GPUs by setting the ``ccManager.defaultMode=<on|off|devtools>`` option. The default value is off. You can set this option when you install NVIDIA GPU Operator or afterward by modifying the cluster-policy instance of the ClusterPolicy object.
I have now learned that there is a 'pcie' mode too, to support multi-GPU passthrough. We should highlight that scenario too. https://github.com/kata-containers/kata-containers/pull/12651/changes is lacking a proper description of that at the moment as well.
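To make the mode discussion concrete, a hedged sketch of how a user would flip the mode after install; the ClusterPolicy instance name `cluster-policy` comes from the quoted paragraph, while the merge-patch shape is my assumption and is not verified against the CRD:

```shell
# Sketch: build a merge patch for ccManager.defaultMode. The instance name
# "cluster-policy" comes from the quoted paragraph; the patch shape is an
# assumption, not verified against the ClusterPolicy CRD.
MODE="on"   # documented values: on | off | devtools (plus possibly "pcie")
PATCH="{\"spec\":{\"ccManager\":{\"defaultMode\":\"${MODE}\"}}}"

# Requires a cluster, so the command is printed rather than executed:
echo "kubectl patch clusterpolicy cluster-policy --type merge -p '${PATCH}'"
```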
> Now, the guest can be used with attestation. For more information on how to provision Trustee with resources and policies, refer to the upstream documentation.
>
> During attestation, the GPU will be set to ready. As such, when running a workload that does attestation, it is not necessary to set the nvrc.smi.srs=1 kernel parameter.
For this and the next line, it is best to use `nvrc.smi.srs=1` and `RUST_LOG=debug`.
> @@ -94,7 +94,6 @@
> * NVIDIA Confidential Computing Manager (cc-manager) for Kubernetes - to set the confidential computing (CC) mode on the NVIDIA GPUs.
> * NVIDIA Sandbox Device Plugin - to discover NVIDIA GPUs along with their capabilities, to advertise these to Kubernetes, and to allocate GPUs during pod deployment.
> * NVIDIA VFIO Manager - to bind discovered NVIDIA GPUs to the vfio-pci driver for VFIO passthrough.
See https://github.com/kata-containers/kata-containers/pull/12651/changes: "nvidia-vfio-manager: Binding discovered NVIDIA GPUs and nvswitches to the vfio-pci driver for VFIO passthrough."
> @@ -94,7 +94,6 @@
> * NVIDIA Confidential Computing Manager (cc-manager) for Kubernetes - to set the confidential computing (CC) mode on the NVIDIA GPUs.
> * NVIDIA Sandbox Device Plugin - to discover NVIDIA GPUs along with their capabilities, to advertise these to Kubernetes, and to allocate GPUs during pod deployment.
https://github.com/kata-containers/kata-containers/pull/12651/changes has an updated version on the description of the sandbox device plugin
> The page describes deploying Confidential Containers with the NVIDIA GPU Operator.
> The implementation relies on the Kata Containers project to provide the lightweight utility Virtual Machines (UVMs) that feel and perform like containers but provide strong workload isolation.
>
> Refer to the `Confidential Containers overview <https://docs.nvidia.com/datacenter/cloud-native/confidential-containers/latest/overview.html>`_ for details on the reference architecture and supported platforms.
General remark: let's make sure we mention SNP and TDX in parity. I think I caught all prior occurrences where we only talked about SNP, but good to double check in general
> http://www.apache.org/licenses/LICENSE-2.0
>
> Unless required by applicable law or agreed to in writing, software
> distributed under the License is distributed on an "AS IS" BASIS,
General comment: https://github.com/kata-containers/kata-containers/pull/12651/changes has a paragraph/section called "feature set" - to discuss whether we want that here as well, or in the other deployment instructions?
> * Run ``sudo update-grub`` after making the change to configure the bootloader. Reboot the host after configuring the bootloader.
>
> * You have a Kubernetes cluster and you have cluster administrator privileges.
Should we note that the NVIDIA shim comes with a 20-minute timeout, and that clusters used for NVIDIA testing in CI also use a 20-minute timeout to pull very large container images?
> @@ -0,0 +1,507 @@
> .. license-header
>    SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Treat our operands as proper nouns and use title case.

> #################################
> GPU Operator with Kata Containers
I like the start of the CoCo page better: "Deploy Confidential Containers with NVIDIA GPU Operator".
In our case: "Deploy Kata Containers".
Let's try to align(?)
> Hardware virtualization and a separate kernel provide improved workload isolation
> in comparison with traditional containers.
>
> The NVIDIA GPU Operator works with the Kata container runtime.
We should discuss internally how we want to position this. At least for the CoCo docs we wanted to make the GPU Operator a bit less prevalent.
> Limitations and Restrictions
> ****************************
>
> * GPUs are available to containers as a single GPU in passthrough mode only.
This will need to change; let's follow up offline about it. For CoCo, multi-GPU passthrough is partially supported. Let's talk.
> * GPUs are available to containers as a single GPU in passthrough mode only.
>   Multi-GPU passthrough and vGPU are not supported.
>
> * Support is limited to initial installation and configuration only.
See my comment in the CoCo doc about the kata lifecycle manager component :)
> * Support is limited to initial installation and configuration only.
>   Upgrade and configuration of existing clusters for Kata Containers is not supported.
>
> * Support for Kata Containers is limited to the implementation described on this page.
Question on the support for Red Hat OpenShift - do we need/have that on the CoCo deploy page as well?
> * Support for Kata Containers is limited to the implementation described on this page.
>   The Operator does not support Red Hat OpenShift sandbox containers.
>
> * Uninstalling the GPU Operator does not remove the files
Since we use kata-deploy to deploy the Kata bits, we can remove this. It also does not appear in the CoCo sibling doc.
> * ``Node Feature Discovery`` -- to detect CPU, kernel, and host features and label worker nodes.
> * ``NVIDIA GPU Feature Discovery`` -- to detect NVIDIA GPUs and label worker nodes.
> - * ``NVIDIA Sandbox Device Plugin`` -- to discover and advertise the passthrough GPUs to kubelet.
> * ``NVIDIA Confidential Computing Manager for Kubernetes`` -- to set the confidential computing (CC) mode on the NVIDIA GPUs.
A big IF here: this appears only if you deploy on CC-capable hardware. We should clarify that here as well.
> * You have a Kubernetes cluster and you have cluster administrator privileges.
>
> * It is recommended that you configure your Kubelet with a higher ``runtimeRequestTimeout`` timeout value than the two minute default timeout.
For vanilla Kata we don't use guest pull, so we don't need this.
> ===================
>
> Install the kata-deploy Helm chart.
> Minimum required version is 3.24.0.
>
> .. code-block:: console
>
> $ helm install kata-deploy "${CHART}" \
I think we'll want the same as in https://github.com/kata-containers/kata-containers/pull/12651/changes
> .. code-block:: console
>
> $ kubectl get runtimeclass
I'd just do `kubectl get runtimeclass kata-qemu-nvidia-gpu` and only list that single runtime class.
> TEST SUITE: None
>
> For heterogeneous clusters with different GPU types, you can specify an empty `P_GPU_ALIAS` environment variable for the sandbox device plugin, ``--set 'sandboxDevicePlugin.env[0].name=P_GPU_ALIAS'`` and ``--set 'sandboxDevicePlugin.env[0].value=""'``.
We will want a version like in https://github.com/kata-containers/kata-containers/pull/12651/changes. To confirm whether we want exactly the same note for CoCo and Kata. To discuss.
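For context while we align the two docs, a hedged sketch of how those two `--set` flags would be passed; the release name and chart reference are placeholders, only the flags themselves come from the quoted note:

```shell
# Sketch of the heterogeneous-cluster override from the quoted note. The
# release name and chart reference are placeholders; only the two --set
# flags are taken from the document.
FLAGS="--set sandboxDevicePlugin.env[0].name=P_GPU_ALIAS --set sandboxDevicePlugin.env[0].value="

# Requires a cluster, so the command is printed rather than executed:
echo "helm upgrade gpu-operator <chart-ref> ${FLAGS}"
```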
> gpu-operator-node-feature-discovery-gc-775976dc9d-742cw 1/1 Running 0 23m
> gpu-operator-node-feature-discovery-master-6c86bc9c69-v2vvf 1/1 Running 0 23m
> gpu-operator-node-feature-discovery-worker-jhr6m 1/1 Running 0 23m
> nvidia-cc-manager-4d5xl 1/1 Running 0 19m
This needs an update. We may want cc-manager to be out of scope. Also, there is no cuda-validator to my knowledge.
> nvidia-sandbox-validator-w9bdg 1/1 Running 0 19m
> nvidia-vfio-manager-5phzl 1/1 Running 0 19m
>
> #. Verify that the ``kata-qemu-nvidia-gpu`` and ``kata-qemu-nvidia-gpu-snp`` runtime classes are available:
Again, probably better to not talk about the -snp runtime class, let's just explicitly get the kata-qemu-nvidia-gpu runtime class
> * Specify a passthrough GPU resource.
>
> #. Determine the passthrough GPU resource names:
Default alias is pgpu, so not needed
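Even with the default pgpu alias, it might still help readers to show how to see which resource name a node actually advertises; a hedged sketch (requires a cluster, so only the command is printed; the jsonpath expression is standard kubectl):

```shell
# Sketch: list the extended resources a node advertises, to find the
# passthrough GPU resource name (e.g. nvidia.com/pgpu). Requires a live
# cluster, so the command is printed rather than executed here.
CMD="kubectl get node <node-name> -o jsonpath='{.status.allocatable}'"
echo "${CMD}"
```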
> metadata:
>   name: cuda-vectoradd-kata
>   annotations:
>     cdi.k8s.io/gpu: "nvidia.com/pgpu=0"
>     io.katacontainers.config.hypervisor.default_memory: "16384"
> restartPolicy: OnFailure
> containers:
>   - name: cuda-vectoradd
>     image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
let's update the image, same as for coco
> image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
> resources:
>   limits:
>     "nvidia.com/GA102GL_A10": 1
`nvidia.com/pgpu` - it's pretty much the same manifest as on the CoCo page, just a different runtime class, and no annotations are required.
> ************************
> About the Pod Annotation