chore: make smoketests faster via test.concurrent, no sharding, 400% workers#773

Open
jason-rl wants to merge 3 commits into main from jason/faster_smoketests

Conversation

@jason-rl
Contributor

@jason-rl jason-rl commented Apr 4, 2026

User description

⚠️ PR Title Must Follow Conventional Commits

Format: feat[optional scope]: <description>

Examples: feat: add new SDK method · feat(storage): support file uploads · feat!: breaking API change


Description

Motivation

Changes

Testing

  • Unit tests added
  • Integration tests added
  • Smoke Tests added/updated
  • Tested locally

Breaking Changes

Checklist

  • PR title follows Conventional Commits format (feat: or feat(scope):)
  • Documentation updated (if needed)
  • Breaking changes documented (if applicable)

CodeAnt-AI Description

Speed up smoke tests and run them in parallel

What Changed

  • Smoke tests now run in parallel, which shortens the time needed to complete the suite
  • Smoke test runs use a higher worker count to finish sooner
  • The smoke test workflow no longer splits tests across shards or collects shard logs; it runs as one parallel job instead
  • Coverage smoke-test runs now fail normally when tests fail, and the extra failure step was removed
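The configuration side of this change can be sketched as follows (a hedged reconstruction from the jest.config.ts fragments quoted in the review comments below; `runBuiltPackageTest` and its environment variable are assumptions):

```typescript
import type { Config } from 'jest';

// Flags read from the environment; RUN_SMOKETESTS matches the scripts in
// package.json, while RUN_BUILT_PACKAGE_TEST is an assumed name.
const runSmoketests = !!process.env['RUN_SMOKETESTS'];
const runBuiltPackageTest = !!process.env['RUN_BUILT_PACKAGE_TEST'];

const config: Config = {
  // Label smoke runs so they stand out in Jest's output.
  ...((runSmoketests || runBuiltPackageTest) && { displayName: 'Smoke Tests' }),
  // High fixed worker count: smoke tests are I/O-bound against a live
  // backend, so far more workers than cores can still pay off.
  ...(runSmoketests && { maxWorkers: 50 }),
};

export default config;
```

Within each file, `test.concurrent` additionally lets cases in the same suite overlap, which is where most of the review comments below focus.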

Impact

✅ Shorter smoke test runs
✅ Faster CI feedback
✅ Fewer delayed test failures

💡 Usage Guide

Checking Your Pull Request

Every time you open a pull request, our system automatically reviews it, checking for security issues, infrastructure misconfigurations, and common code problems to make sure your changes are solid and won't cause trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

@github-actions

github-actions bot commented Apr 4, 2026

✅ Object Smoke Tests & Coverage Report

Test Results

✅ All smoke tests passed

Coverage Results

Metric      Coverage  Required  Status
Functions   100%      100%
Lines       90.03%    -         ℹ️
Branches    68.95%    -         ℹ️
Statements  88.98%    -         ℹ️

Coverage Requirement: 100% function coverage (all public methods must be called in smoke tests)

✅ All tests passed and all object methods are covered!

View detailed coverage report
File Functions Lines Branches
src/sdk.ts ✅ 100% 86.11% 72.58%
src/sdk/agent.ts ✅ 100% 100% 100%
src/sdk/axon.ts ✅ 100% 93.75% 100%
src/sdk/blueprint.ts ✅ 100% 100% 80%
src/sdk/devbox.ts ✅ 100% 91.96% 94.28%
src/sdk/execution-result.ts ✅ 100% 92.68% 70.83%
src/sdk/execution.ts ✅ 100% 95.65% 87.5%
src/sdk/gateway-config.ts ✅ 100% 100% 100%
src/sdk/mcp-config.ts ✅ 100% 100% 100%
src/sdk/network-policy.ts ✅ 100% 100% 100%
src/sdk/scenario-builder.ts ✅ 100% 98.46% 80.7%
src/sdk/scenario-run.ts ✅ 100% 92.1% 50%
src/sdk/scenario.ts ✅ 100% 100% 100%
src/sdk/scorer.ts ✅ 100% 100% 100%
src/sdk/secret.ts ✅ 100% 100% 100%
src/sdk/snapshot.ts ✅ 100% 100% 100%
src/sdk/storage-object.ts ✅ 100% 80% 48.93%

📋 View workflow run

@jason-rl jason-rl changed the title from "Make smoketests faster via test.concurrent" to "Make smoketests faster via test.concurrent, no sharding, 400% workers" Apr 4, 2026
@jason-rl jason-rl changed the title from "Make smoketests faster via test.concurrent, no sharding, 400% workers" to "chore: make smoketests faster via test.concurrent, no sharding, 400% workers" Apr 4, 2026
@jason-rl jason-rl requested review from alb-rl and dines-rl April 4, 2026 02:08
@jason-rl jason-rl marked this pull request as ready for review April 4, 2026 02:09
@codeant-ai
Contributor

codeant-ai bot commented Apr 4, 2026

CodeAnt AI is reviewing your PR.



@codeant-ai codeant-ai bot added the size:L This PR changes 100-499 lines, ignoring generated files label Apr 4, 2026
});

test(
test.concurrent(
Contributor


Suggestion: This reconnect test explicitly requires an idle SSE period to trigger a 408/reconnect cycle, but running it with test.concurrent allows other tests in the same suite to publish events to the same axon during that idle window, breaking the reconnect precondition and causing flaky timeouts or false negatives. Run this test non-concurrently to preserve its timing assumptions. [race condition]

Severity Level: Major ⚠️
- ❌ SSE reconnect smoke test becomes flaky or times out.
- ⚠️ Axon SSE reconnect behavior coverage becomes unreliable.
Suggested change
test.concurrent(
test(
Steps of Reproduction ✅
1. Run the smoketests with RUN_SMOKETESTS enabled so the top-level describe in
`tests/smoketests/object-oriented/axon.test.ts:8` is active instead of skipped.

2. In the `describe('axon lifecycle')` block (`axon.test.ts:9`), the shared `axon`
instance is created once in `beforeAll` (`axon.test.ts:13-18`) via `sdk.axon.create()`,
and all tests in this describe share this same axon and axonId.

3. Jest schedules all `test.concurrent` tests in this describe to run concurrently,
including:

   - the reconnect test at `axon.test.ts:79-154` ("SSE subscribe reconnects after idle
   (408)..."),

   - the single-publish test at `axon.test.ts:48-59` ("publish event to axon"),

   - the multi-publish test at `axon.test.ts:61-77` ("publish multiple events and verify
   sequence increases"),

   - and the SSE stream test at `axon.test.ts:156-178` ("subscribe to SSE stream and
   receive events"),

   all of which call `axon.publish` on the same `axon`.

4. When the reconnect test runs, it first publishes events 1–5 (`axon.test.ts:82-90`),
opens an SSE stream (`subscribeSse` at `axon.test.ts:92`), then intentionally waits ~70
seconds without server traffic (`setTimeout` at `axon.test.ts:100-104` with comment "Hold
the line open with no server traffic so the backend can return 408"). During this intended
idle window, any concurrently running test (e.g., `test.concurrent('publish event to
axon', ...)` at `axon.test.ts:48-54` or `test.concurrent('subscribe to SSE stream and
receive events', ...)` at `axon.test.ts:156-163`) may publish additional events to the
same axon, preventing the stream from going idle, so the backend never returns 408. This
causes the reconnect test to either exceed `MEDIUM_TIMEOUT` (`axon.test.ts:153-154`) or
hit the guard `if (sequences.length > 150) throw` (`axon.test.ts:141-142`), yielding flaky
timeouts/false negatives that disappear if the reconnect test is run alone or
non-concurrently.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/axon.test.ts
**Line:** 79:79
**Comment:**
	*Race Condition: This reconnect test explicitly requires an idle SSE period to trigger a 408/reconnect cycle, but running it with `test.concurrent` allows other tests in the same suite to publish events to the same axon during that idle window, breaking the reconnect precondition and causing flaky timeouts or false negatives. Run this test non-concurrently to preserve its timing assumptions.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

],
// Add display name for smoke tests to make it clearer in output
...((runSmoketests || runBuiltPackageTest) && { displayName: 'Smoke Tests' }),
...(runSmoketests && { maxWorkers: 50 }),
Contributor


Suggestion: Using a fixed maxWorkers: 50 for smoke tests can oversubscribe low-core CI/local machines and trigger worker spawn/resource exhaustion failures (for example spawn EAGAIN) or severe process thrashing. Use a CPU-relative value so worker count scales with available cores. [possible bug]

Severity Level: Major ⚠️
- ⚠️ Local `yarn test:smoke` can oversubscribe low-core machines.
- ⚠️ Heavy smoke suites may thrash or intermittently fail locally.
- ⚠️ Developer smoke-test feedback loop slowed or destabilized.
Suggested change
...(runSmoketests && { maxWorkers: 50 }),
...(runSmoketests && { maxWorkers: '50%' }),
Steps of Reproduction ✅
1. On a low-core developer machine (e.g., 2 CPUs), run `yarn test:smoke` from the
repository root, which executes the script defined in `package.json:21` as `"test:smoke":
"RUN_SMOKETESTS=1 jest --verbose tests/smoketests"`.

2. The Jest CLI starts without an explicit `--maxWorkers` flag, so it loads the default
configuration from `jest.config.ts:6-9`, where `runSmoketests` is set based on
`process.env['RUN_SMOKETESTS']` at `jest.config.ts:3`.

3. Because `RUN_SMOKETESTS=1` is set by the script, `runSmoketests` is `true` and the
spread at `jest.config.ts:38` applies, forcing `maxWorkers: 50` into the Jest
configuration for this run.

4. Jest now attempts to run the smoke test suite in `tests/smoketests/**` (triggered by
the script at `package.json:21`) with up to 50 workers on a machine with far fewer cores,
causing significant oversubscription; this can manifest as heavy CPU thrashing or, on
constrained environments with low process limits, worker-spawn failures such as `spawn
EAGAIN`, degrading or breaking local smoke-test runs.
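The suggested `'50%'` value works because Jest resolves percentage strings against the machine's core count. A minimal sketch of the idea (an illustration of the arithmetic, not Jest's exact rounding behavior):

```typescript
import * as os from 'os';

// Resolve a percentage-style maxWorkers setting against the core count,
// so worker count scales with the machine instead of being fixed at 50.
function resolveMaxWorkers(setting: string, cores: number = os.cpus().length): number {
  const pct = parseInt(setting, 10); // '50%' -> 50, '400%' -> 400
  return Math.max(1, Math.floor((cores * pct) / 100));
}

console.log(resolveMaxWorkers('50%', 2));  // 1 worker on a 2-core laptop
console.log(resolveMaxWorkers('50%', 32)); // 16 workers on a 32-core CI box
```

A percentage above 100% (as in the PR title's "400% workers") deliberately oversubscribes cores, which can be reasonable for I/O-bound smoke tests but still scales down gracefully on small machines.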
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** jest.config.ts
**Line:** 38:38
**Comment:**
	*Possible Bug: Using a fixed `maxWorkers: 50` for smoke tests can oversubscribe low-core CI/local machines and trigger worker spawn/resource exhaustion failures (for example `spawn EAGAIN`) or severe process thrashing. Use a CPU-relative value so worker count scales with available cores.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

}

test(
test.concurrent(
Contributor


Suggestion: This test temporarily deletes GITHUB_TOKEN and ANTHROPIC_API_KEY from process.env, so running it with test.concurrent introduces a race with live example tests that read the same global env vars. That can make unrelated live examples fail nondeterministically depending on execution order. Run this test serially to avoid shared-state interference. [race condition]

Severity Level: Major ⚠️
- ⚠️ `mcp-github-tools` smoketest can flake under live-example runs.
- ⚠️ `devbox-mounts` smoketest can intermittently fail due to missing env.
- ⚠️ `yarn test:examples` CI job becomes flaky and unreliable.
Suggested change
test.concurrent(
test(
Steps of Reproduction ✅
1. Configure environment to run live examples by setting `RUN_EXAMPLE_LIVE_TESTS=1` plus
valid `RUNLOOP_API_KEY`, `GITHUB_TOKEN`, and `ANTHROPIC_API_KEY`. These variables are read
in `tests/smoketests/examples/examples.test.ts:5-10` and the required env for each example
(including `GITHUB_TOKEN` and `ANTHROPIC_API_KEY` for `mcp-github-tools`) is defined in
`examples/registry.ts:22-63`.

2. Run `yarn test:examples`, which invokes `RUN_SMOKETESTS=1 jest --verbose
tests/smoketests/examples/examples.test.ts` as defined in `package.json:20-23`. Jest
executes the `describe('smoketest: examples', ...)` block in
`tests/smoketests/examples/examples.test.ts:4-25`, registering one concurrent test per
`exampleRegistry` entry via `const itLive = runLiveExample ? test.concurrent :
test.concurrent.skip;` at line 10 and `itLive(...` at lines 12-24.

3. Because the deterministic skip test is also declared with `test.concurrent(` at
`tests/smoketests/examples/examples.test.ts:27`, Jest runs it concurrently with the live
example tests. Inside this test, `process.env['GITHUB_TOKEN']` and
`process.env['ANTHROPIC_API_KEY']` are saved (lines 35-37) and then deleted globally at
lines 38-39 before calling `mcpExampleEntry.run({ skipIfMissingCredentials: true })` at
line 42.

4. While those environment variables are temporarily deleted, other concurrent tests in
the same file invoke `entry.run()` for live examples such as `devbox-mounts` and
`mcp-github-tools` (registered in `examples/registry.ts:22-63`). Their implementations
read `process.env['GITHUB_TOKEN']` and `process.env['ANTHROPIC_API_KEY']` at runtime in
`examples/mcp-github-tools.ts:18-32,118-135` and `examples/devbox-mounts.ts:81-87`. If
these reads occur during the deletion window created by the concurrent skip-mode test, the
examples throw errors like "Set GITHUB_TOKEN to a GitHub PAT with repo scope." or "Set
ANTHROPIC_API_KEY to run the Claude Code mount example.", causing nondeterministic
failures in the corresponding smoketests despite credentials being correctly configured.
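The save/delete/restore pattern at the heart of this race can be sketched as follows (a hypothetical helper, not code from the repo): while the body runs, any concurrent reader of `process.env` sees the variables as missing, which is exactly the window described above.

```typescript
// Temporarily remove env vars around a body, restoring them afterwards.
// Safe only when nothing else reads the same vars concurrently - hence
// the suggestion to run such a test serially.
function withEnvRemoved<T>(keys: string[], body: () => T): T {
  const saved: Record<string, string | undefined> = {};
  for (const key of keys) {
    saved[key] = process.env[key];
    delete process.env[key];
  }
  try {
    return body();
  } finally {
    // Restore the original values so later tests see them again.
    for (const key of keys) {
      if (saved[key] !== undefined) process.env[key] = saved[key]!;
    }
  }
}

process.env['GITHUB_TOKEN'] = 'tok';
const seen = withEnvRemoved(['GITHUB_TOKEN'], () => process.env['GITHUB_TOKEN']);
console.log(seen);                        // undefined inside the body
console.log(process.env['GITHUB_TOKEN']); // 'tok' restored afterwards
```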
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/examples/examples.test.ts
**Line:** 27:27
**Comment:**
	*Race Condition: This test temporarily deletes `GITHUB_TOKEN` and `ANTHROPIC_API_KEY` from `process.env`, so running it with `test.concurrent` introduces a race with live example tests that read the same global env vars. That can make unrelated live examples fail nondeterministically depending on execution order. Run this test serially to avoid shared-state interference.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

});

test('concurrent requests', async () => {
test.concurrent('concurrent requests', async () => {
Contributor


Suggestion: This load-oriented test already issues parallel requests internally, and making the test itself concurrent adds another layer of shared-suite parallelism against the same gateway/devbox fixture. That can trigger transient throttling/contention and make the expected all-200 assertion nondeterministic. Keep this test serial to isolate its own concurrency signal. [race condition]

Severity Level: Major ⚠️
- ❌ Concurrent-requests smoketest may intermittently fail under load.
- ⚠️ Harder to trust gateway concurrency regression coverage.
Suggested change
test.concurrent('concurrent requests', async () => {
test('concurrent requests', async () => {
Steps of Reproduction ✅
1. Enable the smoketest suite by setting `RUN_SMOKETESTS=1` and required gateway
credentials, then execute the Jest tests including
`tests/smoketests/object-oriented/gateway-config.test.ts`.

2. In `describe('comprehensive gateway proxying tests', ...)` at lines 282–714, a shared
`devbox`, `gatewayConfig`, `networkPolicy`, `gatewayUrl`, and `gatewayToken` are created
once in `beforeAll` (lines 333–379) and reused by all tests in this block; cleanup happens
in `afterAll` (lines 381–398).

3. Most tests in this describe are declared with `test.concurrent(...)` and exercise the
same shared `devbox` and gateway via `curlRequest` or `devbox!.cmd.exec` (see, for
example, 'GET request - list devboxes' at 400–407, 'large response handling' at 478–485,
'multipart form data' at 531–556, 'large request body' at 583–600).

4. The `test.concurrent('concurrent requests', async () => { ... })` at lines 621–639
builds a Bash script (lines 622–629) that itself issues three background `curl` requests
in parallel to `${gatewayUrl}/v1/devboxes?limit=1` using the shared `gatewayToken`, then
asserts that every captured HTTP status code is `200` at lines 635–637. Because this test
is both internally concurrent (three curls at once) and marked `test.concurrent` (running
alongside other gateway tests that hit the same `devbox` and gateway), the combined
concurrency can trigger transient throttling or contention on the gateway/devbox, causing
some responses to be non-200 even though the product behavior is acceptable under normal
load. This makes the "all 200" assertion flaky. Running this particular test
non-`concurrent` isolates its intended concurrency behavior to the inner Bash loop,
reducing nondeterministic failures due to shared-suite contention.
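The key point is that this test already fans out internally. A sketch of that shape (names are stand-ins; the real test shells out to three background curls against the gateway):

```typescript
// Start n identical requests before awaiting any of them, so they run
// in parallel - the test's own concurrency layer, independent of Jest.
async function fanOut(n: number, request: () => Promise<number>): Promise<number[]> {
  return Promise.all(Array.from({ length: n }, () => request()));
}

// Stand-in for a curl against `${gatewayUrl}/v1/devboxes?limit=1`.
const okRequest = async (): Promise<number> => 200;

fanOut(3, okRequest).then((codes) => {
  // The real assertion: every captured HTTP status is 200.
  console.log(codes.every((code) => code === 200)); // true
});
```

Keeping the surrounding test serial means the only parallelism exercised is this inner fan-out, so a non-200 status points at the gateway rather than at suite-level contention.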
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/gateway-config.test.ts
**Line:** 621:621
**Comment:**
	*Race Condition: This load-oriented test already issues parallel requests internally, and making the test itself `concurrent` adds another layer of shared-suite parallelism against the same gateway/devbox fixture. That can trigger transient throttling/contention and make the expected all-200 assertion nondeterministic. Keep this test serial to isolate its own concurrency signal.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

});

test('request timeout handling', async () => {
test.concurrent('request timeout handling', async () => {
Contributor


Suggestion: This test now runs concurrently against the same shared devbox used by many other tests in the suite, but it asserts a strict wall-clock upper bound (elapsed < 30000). Under concurrent execution, command scheduling/queueing latency can inflate elapsed time and cause false failures even when the endpoint is healthy. Run this timing-sensitive test serially. [race condition]

Severity Level: Major ⚠️
- ❌ Timeout smoketest may intermittently fail despite healthy gateway.
- ⚠️ Smoketest timing assertions become unreliable under suite load.
Suggested change
test.concurrent('request timeout handling', async () => {
test('request timeout handling', async () => {
Steps of Reproduction ✅
1. Enable the smoketest suite by setting `RUN_SMOKETESTS=1` and required gateway
credentials (`RUNLOOP_API_KEY`, optional `RUNLOOP_BASE_URL`), then run the Jest tests that
include `tests/smoketests/object-oriented/gateway-config.test.ts`.

2. In this file, the `describe('comprehensive gateway proxying tests', ...)` block at
lines 282–714 sets up a single shared `devbox`, `gatewayConfig`, `networkPolicy`,
`gatewayUrl`, and `gatewayToken` in `beforeAll` (lines 333–379) and tears them down in
`afterAll` (lines 381–398).

3. Inside that same `describe`, many tests are declared with `test.concurrent(...)` (e.g.,
'GET request - list devboxes' at lines 400–407, 'large response handling' at 478–485,
'multipart form data' at 531–556, 'large request body' at 583–600). Jest will run these
tests concurrently against the shared `devbox` and gateway.

4. The `test.concurrent('request timeout handling', async () => { ... })` at lines 656–671
uses the shared `curlRequest` helper (lines 293–331) to POST to
`/v1/devboxes/${devbox!.id}/execute_sync` with a `sleep 2` command, measuring wall-clock
time via `const startTime = Date.now()` and `const elapsed = Date.now() - startTime`, then
asserting `expect(elapsed).toBeGreaterThan(2000)` and
`expect(elapsed).toBeLessThan(30000)`. Under concurrent execution with other
`test.concurrent` cases that also call `devbox!.cmd.exec` or `curlRequest`,
scheduling/queueing delays on the shared `devbox` or gateway can legitimately push
`elapsed` over 30000 ms while still returning `httpCode === 200`, causing this test to
fail even though the endpoint behavior is correct. Running this particular test serially
(non-`concurrent`) removes the extra suite-level contention and reduces this flakiness
risk.
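The timing pattern under discussion can be sketched like this (names assumed, not the repo's code): elapsed wall-clock time is measured around an awaited call and then bounded on both sides, and the strict upper bound is what suite contention can break.

```typescript
// Measure wall-clock time around an awaited operation, as the
// request-timeout test does around its curl to execute_sync.
async function measure(op: () => Promise<void>): Promise<number> {
  const startTime = Date.now();
  await op();
  return Date.now() - startTime;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

measure(() => sleep(50)).then((elapsed) => {
  console.log(`elapsed: ${elapsed}ms`); // roughly 50ms when uncontended
  // The fragile part is asserting elapsed < some fixed ceiling: queueing
  // delays on a shared devbox can push elapsed past any fixed bound even
  // though the endpoint itself responded correctly.
});
```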
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/gateway-config.test.ts
**Line:** 656:656
**Comment:**
	*Race Condition: This test now runs concurrently against the same shared `devbox` used by many other tests in the suite, but it asserts a strict wall-clock upper bound (`elapsed < 30000`). Under concurrent execution, command scheduling/queueing latency can inflate elapsed time and cause false failures even when the endpoint is healthy. Run this timing-sensitive test serially.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.


describe('network policy egress configurations', () => {
test('create policy with allow_all=true', async () => {
test.concurrent('create policy with allow_all=true', async () => {
Contributor


Suggestion: This test performs a create/read/delete flow against the live API; running it as test.concurrent increases write fanout and can trigger transient 409/429 failures under the new --maxWorkers=400% smoke run. Keep this mutation test serial to avoid race-driven flakiness. [race condition]

Severity Level: Major ⚠️
- ⚠️ Smoketest `create policy with allow_all=true` may fail nondeterministically.
- ⚠️ `yarn test:smoke` stability reduced by concurrent live writes.
Suggested change
test.concurrent('create policy with allow_all=true', async () => {
test('create policy with allow_all=true', async () => {
Steps of Reproduction ✅
1. Run the smoketest suite with a real backend by executing `yarn test:smoke`, which
invokes `RUN_SMOKETESTS=1 jest --verbose tests/smoketests` as defined in
`package.json:19-23`.

2. Jest loads `tests/smoketests/object-oriented/network-policy.test.ts`, which constructs
a shared `RunloopSDK` client using live credentials from `process.env['RUNLOOP_BASE_URL']`
and `process.env['RUNLOOP_API_KEY']` (`utils.ts:17-23`, referenced at
`network-policy.test.ts:1-4`).

3. Inside `describe('network policy egress configurations', ...)`
(`network-policy.test.ts:138-226`), Jest schedules multiple `test.concurrent` cases
together, including `'create policy with allow_all=true'` at `network-policy.test.ts:139`,
`'create policy with allow_devbox_to_devbox=true'` at `:153`, `'create policy with
multiple allowed hostnames'` at `:171`, `'create policy with empty allowed_hostnames
(DENY_ALL)'` at `:191`, and `'create policy with wildcard hostnames'` at `:207`.

4. When these tests run concurrently, each issues a live `sdk.networkPolicy.create(...)`
followed by `policy.getInfo()` and `cleanUpPolicy(policy)` (which calls `policy.delete()`;
`utils.ts:36-44`), while other smoketests also create network policies in parallel (e.g.,
`blueprint.test.ts:316-320` and `:356-360`, `gateway-config.test.ts:339-343` and
`:742-746`). This amplifies simultaneous create/delete traffic to the same backend. Any
transient backend write failures (e.g., throttling or intermittent 5xx) surface as
uncaught errors in this test (no local retry/handling around `create`/`getInfo`), causing
nondeterministic failures of `test.concurrent('create policy with allow_all=true', ...)`.
Making this test non-concurrent (`test(...)`) would reduce per-file write fanout and lower
the flakiness risk in the smoketest run.
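The create/read/delete flow these policy tests exercise can be sketched serially as below. The `networkPolicy.create`/`getInfo`/`delete` names come from the review; the in-memory stub is purely illustrative so the sketch is runnable without a live backend:

```typescript
// Minimal stand-in for the live API; the real tests call
// sdk.networkPolicy.create / policy.getInfo / policy.delete.
interface PolicyInfo { id: string; egress: { allow_all: boolean } }

class FakePolicy {
  constructor(private store: Map<string, PolicyInfo>, public id: string) {}
  async getInfo(): Promise<PolicyInfo> { return this.store.get(this.id)!; }
  async delete(): Promise<void> { this.store.delete(this.id); }
}

const store = new Map<string, PolicyInfo>();
let nextId = 0;

const sdkStub = {
  networkPolicy: {
    async create(opts: { allow_all: boolean }): Promise<FakePolicy> {
      const id = `np_${nextId++}`;
      store.set(id, { id, egress: { allow_all: opts.allow_all } });
      return new FakePolicy(store, id);
    },
  },
};

// Serial create -> read -> delete: each mutation completes before the
// next starts, so no overlapping writes contend on the backend.
async function run(): Promise<boolean> {
  const policy = await sdkStub.networkPolicy.create({ allow_all: true });
  const info = await policy.getInfo();
  await policy.delete();
  return info.egress.allow_all && store.size === 0;
}

run().then((ok) => console.log(ok)); // true: flow completed and cleaned up
```

Run serially, each test's write burst stays isolated; under `test.concurrent` with high worker counts, many such flows mutate backend state at once, which is the contention the comment flags.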
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/network-policy.test.ts
**Line:** 139:139
**Comment:**
	*Race Condition: This test performs a create/read/delete flow against the live API; running it as `test.concurrent` increases write fanout and can trigger transient 409/429 failures under the new `--maxWorkers=400%` smoke run. Keep this mutation test serial to avoid race-driven flakiness.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

});

test('create policy with allow_devbox_to_devbox=true', async () => {
test.concurrent('create policy with allow_devbox_to_devbox=true', async () => {
Contributor


Suggestion: This test creates and deletes a network policy, so making it concurrent adds unnecessary contention on backend policy writes and can fail intermittently in high-parallel CI. Run it serially to keep behavior deterministic. [race condition]

Severity Level: Major ⚠️
- ⚠️ `allow_devbox_to_devbox` smoketest may intermittently fail under load.
- ⚠️ Overall `yarn test:smoke` reliability impacted by concurrent mutations.
Suggested change
test.concurrent('create policy with allow_devbox_to_devbox=true', async () => {
test('create policy with allow_devbox_to_devbox=true', async () => {
Steps of Reproduction ✅
1. Execute the smoketests against a live Runloop deployment using `yarn test:smoke`, which
runs `RUN_SMOKETESTS=1 jest --verbose tests/smoketests` (`package.json:19-23`).

2. Jest initializes the shared `sdk` client in
`tests/smoketests/object-oriented/network-policy.test.ts` via `makeClientSDK()`
(`network-policy.test.ts:1-4`), which binds to `RUNLOOP_BASE_URL` and `RUNLOOP_API_KEY`
and configures `maxRetries: 1` and `timeout: 120_000` (`utils.ts:17-23`), meaning
individual calls have minimal retry logic.

3. In the `describe('network policy egress configurations', ...)` block
(`network-policy.test.ts:138-226`), Jest runs several `test.concurrent` cases together,
including `'create policy with allow_devbox_to_devbox=true'` at `:153`. That test body
creates a policy (`sdk.networkPolicy.create({...})` at `:156-161`), reads it via
`policy.getInfo()` (`:162`), asserts on egress fields (`:163-165`), and finally calls
`cleanUpPolicy(policy)` to delete it against the live API (`utils.ts:36-44`).

4. Because this test runs concurrently with sibling `test.concurrent` cases in the same
file (lines `139`, `171`, `191`, `207`) and with other smoketests that also perform
`sdk.networkPolicy.create` (e.g., `blueprint.test.ts:316-320`, `:356-360`,
`gateway-config.test.ts:339-343`, `:742-746`), the suite produces bursts of simultaneous
create/delete operations. Any transient backend error during this test's create or getInfo
call will surface as an unhandled rejection, failing this test in a non-deterministic way.
Changing this to a serial `test('create policy with allow_devbox_to_devbox=true', ...)`
reduces concurrent mutation pressure from this file and mitigates such flakiness.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/network-policy.test.ts
**Line:** 153:153
**Comment:**
	*Race Condition: This test creates and deletes a network policy, so making it concurrent adds unnecessary contention on backend policy writes and can fail intermittently in high-parallel CI. Run it serially to keep behavior deterministic.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

});

test('create policy with multiple allowed hostnames', async () => {
test.concurrent('create policy with multiple allowed hostnames', async () => {
Contributor


Suggestion: This case issues a create plus multi-assert retrieval and cleanup; running it concurrently with other policy writers increases backend contention and can produce flaky CI outcomes. Use a normal test here. [race condition]

Severity Level: Major ⚠️
- ⚠️ Multi-hostname egress smoketest can flake under concurrent load.
- ⚠️ Flaky network policy tests degrade confidence in smoketest suite.
Suggested change
test.concurrent('create policy with multiple allowed hostnames', async () => {
test('create policy with multiple allowed hostnames', async () => {
Steps of Reproduction ✅
1. Run `yarn test:smoke` to execute all smoketests (`package.json:19-23`), which uses Jest
to run the entire `tests/smoketests` tree against the live API configured in
`tests/smoketests/utils.ts:17-23`.

2. Jest loads `tests/smoketests/object-oriented/network-policy.test.ts`, where `const sdk
= makeClientSDK();` (`:4`) creates a shared `RunloopSDK` instance that is reused across
describes, including `describe('network policy egress configurations', ...)` (`:138-226`).

3. The test `'create policy with multiple allowed hostnames'` at `:171` is marked
`test.concurrent`, so Jest schedules it in parallel with other concurrent tests in the
same describe (`:139`, `:153`, `:191`, `:207`). Its body (`:172-188`) constructs an array
of hostnames, calls `sdk.networkPolicy.create({...})` to create a live policy
(`:175-179`), reads the policy via `policy.getInfo()` (`:180`), asserts on the hostnames
(`:181-185`), and finally calls `cleanUpPolicy(policy)` which attempts a delete
(`utils.ts:36-44`).

4. At the same time, other smoketest files—such as `object-oriented/blueprint.test.ts`
(network policy creates at `:316-320` and `:356-360`) and
`object-oriented/gateway-config.test.ts` (network policy creates at `:339-343` and
`:742-746`)—also run under Jest's worker-level parallelism and issue
`sdk.networkPolicy.create` and deletes. This concentrates multiple live
create/getInfo/delete cycles into the same time window. Any transient backend issue on
those write paths will cause this test's promises to reject and thus intermittently fail
`test.concurrent('create policy with multiple allowed hostnames', ...)`. Executing it as a
normal `test(...)` lessens concurrent external-state mutations and improves determinism.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/network-policy.test.ts
**Line:** 171:171
**Comment:**
	*Race Condition: This case issues a create plus multi-assert retrieval and cleanup; running it concurrently with other policy writers increases backend contention and can produce flaky CI outcomes. Use a normal `test` here.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

});

test('create policy with empty allowed_hostnames (DENY_ALL)', async () => {
test.concurrent('create policy with empty allowed_hostnames (DENY_ALL)', async () => {
Contributor


Suggestion: This test still mutates server state (create/delete) even though it validates DENY_ALL behavior; running it concurrently increases the chance of transient throttling/timeouts in smoke CI. Keep it serial. [race condition]

Severity Level: Major ⚠️
- ⚠️ DENY_ALL network policy smoketest can exhibit flaky behavior.
- ⚠️ CI smoketest job may fail sporadically due to concurrency.
Suggested change
test.concurrent('create policy with empty allowed_hostnames (DENY_ALL)', async () => {
test('create policy with empty allowed_hostnames (DENY_ALL)', async () => {
Steps of Reproduction ✅
1. Invoke the smoketest suite using `yarn test:smoke` (`package.json:19-23`), which
configures `RUN_SMOKETESTS=1` and runs Jest across `tests/smoketests` with worker-level
parallelism.

2. In `tests/smoketests/object-oriented/network-policy.test.ts`, the `sdk` client is
created once at module load (`:1-4`) via `makeClientSDK()` (`utils.ts:17-23`), and used
throughout the `describe('network policy egress configurations', ...)` block (`:138-226`)
to call `sdk.networkPolicy.create` against the live API.

3. The test `'create policy with empty allowed_hostnames (DENY_ALL)'` at `:191` is
declared with `test.concurrent`, so Jest runs it in parallel with other concurrent tests
in this describe (`:139`, `:153`, `:171`, `:207`). Its implementation (`:192-204`) creates
a DENY_ALL policy with `allowed_hostnames: []` (`:194-198`), then calls `policy.getInfo()`
(`:199`) and asserts that the returned egress configuration matches expectations
(`:200-201`), finally invoking `cleanUpPolicy(policy)` which attempts a live delete
(`utils.ts:36-44`).

4. While this test executes, other smoketests concurrently perform network policy
create/delete cycles (e.g., `object-oriented/blueprint.test.ts:316-320` and `:356-360`,
`object-oriented/gateway-config.test.ts:339-343` and `:742-746`, plus sibling
`test.concurrent` cases in this file). The aggregate parallelism increases susceptibility
to transient backend failures on creation or retrieval; because this test does not catch
errors around `create` or `getInfo`, such failures will cause inconsistent,
hard-to-reproduce failures in the `'create policy with empty allowed_hostnames
(DENY_ALL)'` smoketest. Making this a serial `test(...)` reduces one live mutation stream
from the concurrent set and helps stabilize the DENY_ALL behavior verification.
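Independent of the serial-vs-concurrent question, the cleanup can be made robust so a failed assertion never leaks a live policy into sibling tests. A sketch under assumed shapes — the `create`/`getInfo`/`delete` names mirror the smoketests, but the types and signatures here are placeholders, not the real SDK:

```typescript
// Minimal structural types for illustration; the real SDK types differ.
interface PolicyInfo { allowed_hostnames: string[] }
interface Policy {
  getInfo(): Promise<PolicyInfo>;
  delete(): Promise<void>;
}

async function verifyDenyAll(create: () => Promise<Policy>): Promise<void> {
  const policy = await create();
  try {
    const info = await policy.getInfo();
    if (info.allowed_hostnames.length !== 0) {
      throw new Error("expected DENY_ALL policy to have no allowed hostnames");
    }
  } finally {
    // Runs even when the assertion above throws, so a failed run does
    // not leave a stray policy behind for concurrent siblings to hit.
    await policy.delete();
  }
}
```

The `try/finally` shape matters more under concurrency: without it, one flaky assertion leaves extra live state that raises contention for every other in-flight test.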
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/network-policy.test.ts
**Line:** 191:191
**Comment:**
	Race Condition: This test still mutates server state (create/delete) even though it validates DENY_ALL behavior; running it concurrently increases the chance of transient throttling/timeouts in smoke CI. Keep it serial.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it, and please make it concise.

});

test('create policy with wildcard hostnames', async () => {
test.concurrent('create policy with wildcard hostnames', async () => {

Suggestion: Wildcard-policy create/get/delete is another live-write flow; running this concurrently with other policy tests can cause intermittent lock/rate-limit failures. Switching back to serial execution makes this smoke check stable. [race condition]

Severity Level: Major ⚠️
- ⚠️ Wildcard-hostname egress smoketest may intermittently fail in CI.
- ⚠️ Increased concurrent policy mutations reduce determinism of smoketests.
Suggested change
test.concurrent('create policy with wildcard hostnames', async () => {
test('create policy with wildcard hostnames', async () => {
Steps of Reproduction ✅
1. Run `yarn test:smoke` from the repository root (`package.json:19-23`), which sets
`RUN_SMOKETESTS=1` and executes Jest across all smoketest files, including
`tests/smoketests/object-oriented/network-policy.test.ts`.

2. `network-policy.test.ts` initializes a shared `sdk` instance (`:1-4`) that talks to the
live API using `RunloopSDK` configured in `tests/smoketests/utils.ts:17-23`. Within
`describe('network policy egress configurations', ...)` (`:138-226`), all egress-related
tests rely on this shared client to perform real network policy operations.

3. The `'create policy with wildcard hostnames'` case at `:207` is defined as
`test.concurrent`, so Jest executes it alongside the other concurrent tests in this
describe (`:139`, `:153`, `:171`, `:191`). The body (`:208-224`) creates a network policy
with wildcard hostnames (`:211-215`), calls `policy.getInfo()` (`:217`), asserts on the
returned hostnames (`:218-221`), and then calls `cleanUpPolicy(policy)` which tries to
delete the policy from the backend (`utils.ts:36-44`).

4. In the same smoketest run, other files such as `object-oriented/blueprint.test.ts`
(`:316-320`, `:356-360`) and `object-oriented/gateway-config.test.ts` (`:339-343`,
`:742-746`) also create and delete network policies, and many of their tests are
themselves `test.concurrent` (e.g., `blueprint.test.ts:19-23`). This results in bursts of
concurrent create/getInfo/delete operations involving wildcard and non-wildcard policies.
Any transient backend contention or rate limiting on these operations will cause uncaught
promise rejections in this test, leading to intermittent failures of `'create policy with
wildcard hostnames'`. Converting it to a plain `test(...)` decreases per-file concurrency
and helps make smoketest behavior more predictable.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/network-policy.test.ts
**Line:** 207:207
**Comment:**
	Race Condition: Wildcard-policy create/get/delete is another live-write flow; running this concurrently with other policy tests can cause intermittent lock/rate-limit failures. Switching back to serial execution makes this smoke check stable.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it, and please make it concise.

});

test('execAsync with callbacks - real-time streaming', async () => {
test.concurrent('execAsync with callbacks - real-time streaming', async () => {

Suggestion: This test has a strict 1-second assertion that streaming data must arrive before completion. Running it with test.concurrent on a shared devbox introduces scheduling/resource contention from sibling tests, which can delay first log delivery and cause flaky failures even when streaming works correctly. Make this test non-concurrent. [race condition]

Severity Level: Major ⚠️
- ⚠️ Streaming callbacks smoketest intermittently flakes under concurrent execution.
- ⚠️ `yarn test:objects` CI job may fail sporadically.
Suggested change
test.concurrent('execAsync with callbacks - real-time streaming', async () => {
test('execAsync with callbacks - real-time streaming', async () => {
Steps of Reproduction ✅
1. Run `yarn test:objects` from `package.json:23`, which invokes Jest with
`jest.config.objects.js` to execute `**/tests/smoketests/object-oriented/**/*.test.ts`
including `tests/smoketests/object-oriented/devbox.test.ts` (config lines 19–20).

2. In `devbox.test.ts:561-575`, Jest enters `describe('command execution with streaming
callbacks', ...)`; `beforeAll` at lines 564–569 creates a single `Devbox` instance reused
by all `test.concurrent` tests in this describe block.

3. The test `execAsync with callbacks - real-time streaming` at `devbox.test.ts:673-704`
calls `devbox.cmd.execAsync('echo "immediate" && sleep 2 && echo "delayed"', { stdout: ...
})` (lines 678–685), then waits exactly 1000 ms via `setTimeout` at line 688 before
asserting `receivedBeforeCompletion` is `true` at line 691 to prove logs arrived before
command completion.

4. Because this test is marked `test.concurrent` at line 673, Jest schedules it
simultaneously with other heavy streaming tests on the same devbox in the same describe
(e.g. `exec with long output - verify all lines received` at 750–774 and `concurrent
execAsync - multiple executions streaming simultaneously` at 776–840); under realistic
load this shared devbox and Jest worker concurrency (see `jest.config.objects.js:37-38`)
can delay the first `stdout` callback for "immediate" beyond 1 second even though
streaming works correctly, causing intermittent failures of the assertion at line 691.
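A complementary fix, whether or not the test stays concurrent, is to replace the fixed 1000 ms `setTimeout` with a poll that waits for the first streamed line, so scheduling jitter only slows the test instead of failing it. A sketch; the helper name, timeout, and interval values are assumptions:

```typescript
// Poll until `condition` holds instead of sleeping a fixed duration.
// Timeout and interval defaults here are illustrative, not prescriptive.
async function waitFor(
  condition: () => boolean,
  timeoutMs = 10_000,
  intervalMs = 100,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!condition()) {
    if (Date.now() > deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The test body could then `await waitFor(() => stdoutLines.length > 0)` and record whether the command had already completed when the first line arrived, preserving the "logs arrive before completion" property without racing a hard 1-second budget.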
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/devbox.test.ts
**Line:** 673:673
**Comment:**
	Race Condition: This test has a strict 1-second assertion that streaming data must arrive before completion. Running it with `test.concurrent` on a shared devbox introduces scheduling/resource contention from sibling tests, which can delay first log delivery and cause flaky failures even when streaming works correctly. Make this test non-concurrent.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it, and please make it concise.

});

test('concurrent execAsync - multiple executions streaming simultaneously', async () => {
test.concurrent('concurrent execAsync - multiple executions streaming simultaneously', async () => {

Suggestion: This test assumes both async executions produce logs within a fixed 1-second wait, but test.concurrent makes it run alongside many other command tests on the same devbox, so startup/log emission can be delayed and intermittently fail the task*Count > 0 checks. Run this test sequentially to avoid cross-test timing races. [race condition]

Severity Level: Major ⚠️
- ⚠️ Concurrent execAsync streaming smoketest unstable sharing one devbox.
- ⚠️ CI `yarn test:objects` reliability reduced by timing races.
Suggested change
test.concurrent('concurrent execAsync - multiple executions streaming simultaneously', async () => {
test('concurrent execAsync - multiple executions streaming simultaneously', async () => {
Steps of Reproduction ✅
1. Run `yarn test:objects` from `package.json:23`, which uses `jest.config.objects.js` to
run all object-oriented smoketests under `tests/smoketests/object-oriented` (config
`testMatch` at line 19 includes `devbox.test.ts`).

2. Jest executes `describe('command execution with streaming callbacks', ...)` in
`tests/smoketests/object-oriented/devbox.test.ts:561-840`; `beforeAll` at 564–569
provisions a single shared `Devbox` instance, while numerous `test.concurrent` cases in
this block start commands with sleeps and substantial output (e.g. tests at 577–774).

3. The test `concurrent execAsync - multiple executions streaming simultaneously` at
`devbox.test.ts:776-840` starts two executions via `devbox.cmd.execAsync(...)` (task A at
783–791 and task B at 793–801), records streamed lines in `taskALogs`/`taskBLogs`, then
after both executions have started (lines 803–805) waits a fixed 1000 ms at line 807
before asserting `taskACount > 0` and `taskBCount > 0` at lines 810–811.

4. With this test itself marked `test.concurrent` at line 776, Jest may schedule it
alongside other streaming tests in the same describe (e.g. the long-output test at
750–774) and with other files via `maxWorkers`/`maxConcurrency` in
`jest.config.objects.js:37-38`; when the shared devbox or log streaming is slightly slower
than 1 second under this load, neither `stdout` callback has fired by the 1000 ms check,
so `taskACount`/`taskBCount` remain zero at lines 810–811 even though both executions
later complete successfully, producing a flaky failure of this timing assumption.
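Alternatively, the fixed mid-flight wait can be dropped entirely: start both executions, await their completion, and only then assert that each streamed output. A sketch assuming `execAsync`-like functions that invoke a per-line callback and resolve when the command finishes (names and shapes are hypothetical):

```typescript
type Exec = (onLine: (line: string) => void) => Promise<void>;

// Run two streaming executions side by side and assert on their logs
// only after both have finished, removing the 1000ms timing assumption.
async function runBothAndCheck(execA: Exec, execB: Exec) {
  const aLogs: string[] = [];
  const bLogs: string[] = [];
  await Promise.all([
    execA((line) => aLogs.push(line)),
    execB((line) => bLogs.push(line)),
  ]);
  if (aLogs.length === 0 || bLogs.length === 0) {
    throw new Error("expected both executions to stream at least one line");
  }
  return { aLogs, bLogs };
}
```

This makes "both executions stream output" deterministic; what it gives up is the claim that output arrives while the commands are still running, which a polling check on the first streamed line covers better.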
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/smoketests/object-oriented/devbox.test.ts
**Line:** 776:776
**Comment:**
	Race Condition: This test assumes both async executions produce logs within a fixed 1-second wait, but `test.concurrent` makes it run alongside many other command tests on the same devbox, so startup/log emission can be delayed and intermittently fail the `task*Count > 0` checks. Run this test sequentially to avoid cross-test timing races.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it, and please make it concise.

@codeant-ai

codeant-ai bot commented Apr 4, 2026

CodeAnt AI finished reviewing your PR.


Labels

size:L This PR changes 100-499 lines, ignoring generated files
