Agent Diagnostic
No agent available — the NemoClaw gateway failed to start, preventing agent setup.
No skills could be loaded. This is the issue being reported.
Description
The OpenShell gateway fails to start on WSL2 with Docker Desktop.
Expected: the gateway starts successfully and the openshell namespace is created.
Actual: startup times out waiting for the namespace 'openshell' to exist.
Reproduction Steps
- Install NemoClaw on Windows 11 with WSL2 (Ubuntu) and Docker Desktop
- Run: curl -fsSL https://www.nvidia.com/nemoclaw.sh | sudo bash
- The gateway fails during onboarding
- Tried: openshell gateway destroy --name nemoclaw && openshell gateway start --name nemoclaw
- The same error occurs every time
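For manual diagnosis, the namespace wait that the installer times out on can be reproduced by hand. This is a minimal sketch, assuming `kubectl` can reach the embedded cluster; the `KUBECTL` override (e.g. pointing it at `docker exec <gateway-container> kubectl`) is an assumption for illustration, not part of the NemoClaw or OpenShell CLI:

```shell
# Poll for a namespace the way the gateway's readiness check appears to,
# so the failure can be observed outside the installer.
# KUBECTL is a hypothetical override; it defaults to plain kubectl and is
# left unquoted so it may expand to e.g. "docker exec <container> kubectl".
wait_for_ns() {
  ns="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if ${KUBECTL:-kubectl} get namespace "$ns" >/dev/null 2>&1; then
      echo "namespace $ns exists"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "timed out waiting for namespace '$ns'" >&2
  return 1
}
```

Usage: `KUBECTL="docker exec <gateway-container> kubectl" wait_for_ns openshell` (container name hypothetical). If the namespace never appears while the API server otherwise responds, the failure is upstream of the namespace-creation step.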
Environment
OS: Windows 11, WSL2 Ubuntu 24
Docker: 29.2.1
OpenShell: 0.0.14
NemoClaw: 0.1.0
Node.js: v22.22.1
RAM: ~5.6GB available to Docker
Logs
Error: × K8s namespace not ready
╰─▶ timed out waiting for namespace 'openshell' to exist: Error from server (NotFound): namespaces "openshell" not found
container logs:
I0323 19:14:54.082629 95 iptables.go:212] Changing default FORWARD chain policy to ACCEPT
I0323 19:14:54.108257 95 iptables.go:358] bootstrap done
time="2026-03-23T19:14:54Z" level=info msg="Wrote flannel subnet file to /run/flannel/subnet.env"
time="2026-03-23T19:14:54Z" level=info msg="Running flannel backend."
I0323 19:14:54.124237 95 vxlan_network.go:68] watching for new subnet leases
I0323 19:14:54.124279 95 vxlan_network.go:115] starting vxlan device watcher
I0323 19:14:54.124864 95 iptables.go:358] bootstrap done
time="2026-03-23T19:14:55Z" level=info msg="Starting network policy controller version v2.6.3-k3s1, built on 2026-03-04T22:29:48Z, go1.25.7"
I0323 19:14:55.153686 95 network_policy_controller.go:164] Starting network policy controller
I0323 19:14:55.238143 95 network_policy_controller.go:179] Starting network policy controller full sync goroutine
time="2026-03-23T19:14:55Z" level=info msg="Started tunnel to 172.18.0.2:6443"
time="2026-03-23T19:14:55Z" level=info msg="Stopped tunnel to 127.0.0.1:6443"
time="2026-03-23T19:14:55Z" level=info msg="Connecting to proxy" url="wss://172.18.0.2:6443/v1-k3s/connect"
time="2026-03-23T19:14:55Z" level=info msg="Proxy done" err="context canceled" url="wss://127.0.0.1:6443/v1-k3s/connect"
time="2026-03-23T19:14:55Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
time="2026-03-23T19:14:55Z" level=info msg="Handling backend connection request [fd891fc6b0cd]"
time="2026-03-23T19:14:55Z" level=info msg="Connected to proxy" url="wss://172.18.0.2:6443/v1-k3s/connect"
time="2026-03-23T19:14:55Z" level=info msg="Remotedialer connected to proxy" url="wss://172.18.0.2:6443/v1-k3s/connect"
E0323 19:15:21.848355 95 resource_quota_controller.go:460] "Error during resource discovery" err="unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: stale GroupVersion discovery: metrics.k8s.io/v1beta1"
I0323 19:15:21.891818 95 garbagecollector.go:792] "failed to discover some groups" groups="map[\"metrics.k8s.io/v1beta1\":\"stale GroupVersion discovery: metrics.k8s.io/v1beta1\"]"
E0323 19:15:43.168368 95 handler_proxy.go:143] error resolving kube-system/metrics-server: no endpoints available for service "metrics-server"
I0323 19:15:51.128839 95 pod_startup_latency_tracker.go:108] "Observed pod startup duration" pod="agent-sandbox-system/agent-sandbox-controller-0" podStartSLOduration=26.392285921 podStartE2EDuration="1m2.128822219s" podCreationTimestamp="2026-03-23 19:14:49 +0000 UTC" firstStartedPulling="2026-03-23 19:15:08.916140434 +0000 UTC m=+31.547774233" lastFinishedPulling="2026-03-23 19:15:50.0588984 +0000 UTC m=+67.284310531" observedRunningTime="2026-03-23 19:15:51.128746023 +0000 UTC m=+68.354158154" watchObservedRunningTime="2026-03-23 19:15:51.128822219 +0000 UTC m=+68.354234349"
E0323 19:15:54.584887 95 resource_quota_controller.go:460] "Error during resource discovery" err="unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: stale GroupVersion discovery: metrics.k8s.io/v1beta1"
I0323 19:15:54.632588 95 garbagecollector.go:792] "failed to discover some groups" groups="map[\"metrics.k8s.io/v1beta1\":\"stale GroupVersion discovery: metrics.k8s.io/v1beta1\"]"
W0323 19:15:56.050813 95 handler_proxy.go:99] no RequestInfo found in the context
E0323 19:15:56.050927 95 controller.go:113] "Unhandled Error" err="loading OpenAPI spec for \"v1beta1.metrics.k8s.io\" failed with: Error, could not get list of group versions for APIService"
I0323 19:15:56.050944 95 controller.go:126] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
W0323 19:15:56.051992 95 handler_proxy.go:99] no RequestInfo found in the context
E0323 19:15:56.052121 95 controller.go:102] "Unhandled Error" err=<
loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to download v1beta1.metrics.k8s.io: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
>
I0323 19:15:56.052134 95 controller.go:109] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
I0323 19:16:25.655634 95 pod_startup_latency_tracker.go:108] "Observed pod startup duration" pod="kube-system/coredns-7566b5ff58-tx8h7" podStartSLOduration=28.907583805 podStartE2EDuration="1m36.655419772s" podCreationTimestamp="2026-03-23 19:14:49 +0000 UTC" firstStartedPulling="2026-03-23 19:15:08.863062269 +0000 UTC m=+31.494696068" lastFinishedPulling="2026-03-23 19:16:24.473339812 +0000 UTC m=+99.242532035" observedRunningTime="2026-03-23 19:16:25.655324087 +0000 UTC m=+100.424516310" watchObservedRunningTime="2026-03-23 19:16:25.655419772 +0000 UTC m=+100.424611995"
E0323 19:16:27.045192 95 resource_quota_controller.go:460] "Error during resource discovery" err="unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: stale GroupVersion discovery: metrics.k8s.io/v1beta1"
I0323 19:16:27.096992 95 garbagecollector.go:792] "failed to discover some groups" groups="map[\"metrics.k8s.io/v1beta1\":\"stale GroupVersion discovery: metrics.k8s.io/v1beta1\"]"
I0323 19:16:34.678303 95 pod_startup_latency_tracker.go:108] "Observed pod startup duration" pod="kube-system/local-path-provisioner-6bc6568469-hk6tl" podStartSLOduration=28.146023392 podStartE2EDuration="1m45.678286787s" podCreationTimestamp="2026-03-23 19:14:49 +0000 UTC" firstStartedPulling="2026-03-23 19:15:08.899546198 +0000 UTC m=+31.531180006" lastFinishedPulling="2026-03-23 19:16:34.294251179 +0000 UTC m=+109.063443401" observedRunningTime="2026-03-23 19:16:34.678064102 +0000 UTC m=+109.447256334" watchObservedRunningTime="2026-03-23 19:16:34.678286787 +0000 UTC m=+109.447479010"
E0323 19:16:48.356722 95 handler_proxy.go:143] error resolving kube-system/metrics-server: no endpoints available for service "metrics-server"
E0323 19:16:59.564625 95 resource_quota_controller.go:460] "Error during resource discovery" err="unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: stale GroupVersion discovery: metrics.k8s.io/v1beta1"
I0323 19:16:59.622480 95 garbagecollector.go:792] "failed to discover some groups" groups="map[\"metrics.k8s.io/v1beta1\":\"stale GroupVersion discovery: metrics.k8s.io/v1beta1\"]"
Agent-First Checklist
- I pointed my agent at the repo and had it investigate this issue
- I loaded relevant skills (e.g., debug-openshell-cluster, debug-inference, openshell-cli)
- My agent could not resolve this — the diagnostic above explains why
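For triage, the recurring failure signatures in the log above can be counted mechanically. This is a small sketch; the log file name is whatever the container logs were captured to, and nothing here is a NemoClaw or OpenShell command:

```shell
# Count how often each known failure signature appears in a captured
# gateway log. The caller supplies the log file path; "gateway.log" in the
# usage below is a hypothetical name for the container logs shown above.
triage_log() {
  for sig in 'stale GroupVersion discovery' \
             'no endpoints available for service "metrics-server"' \
             'timed out waiting for namespace'; do
    printf '%s: %s\n' "$sig" "$(grep -c "$sig" "$1")"
  done
}
```

Usage: `triage_log gateway.log`. In the log above, the repeating signatures are the stale metrics.k8s.io/v1beta1 discovery errors and the missing metrics-server endpoints, which suggests the metrics-server pod never became ready before the namespace wait expired.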