fix(verifier): align ACME with upstream proto & defer ITA to final step #39

Merged
kingsleydon merged 4 commits into main from fix/verifier-error-handling on Apr 3, 2026

@kingsleydon kingsleydon commented Apr 3, 2026

Summary

  • ACME schema aligned with upstream gateway proto: The upstream AcmeInfoResponse removed active_cert, base_domain, and hist_keys fields, and added account_attestation. Updated AcmeInfoSchema to accept both old and new formats, with hist_keys auto-derived from quoted_hist_keys via Zod transform.
  • Gateway base_domain resolution: Added getBaseDomain() that tries the legacy ACME field first, then falls back to the /.dstack/info endpoint for newer gateways.
  • Deferred ITA execution: Extracted Intel Trust Authority verification from verifyHardware() in all 3 verifiers (KMS, Gateway, App) and moved it to the end of the verification chain. ITA runs in parallel via Promise.allSettled() and is evaluated per-verifier — a failure in one verifier does not block ITA for others. Skipped entirely on hard errors (thrown exceptions).
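
The deferred ITA step in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's `runDeferredIta()`: only `lastQuoteHex` is named in the PR, while `QuoteSource`, `ItaOutcome`, `runDeferredItaSketch`, and the injected `verifyWithIta` callback are hypothetical names. The skip on hard errors is not modeled here.

```typescript
// Hypothetical shape: only `lastQuoteHex` comes from the PR text.
interface QuoteSource {
  name: string
  lastQuoteHex?: string
}

type ItaOutcome =
  | { name: string; status: 'ok'; result: unknown }
  | { name: string; status: 'failed'; reason: string }

// `verifyWithIta` stands in for the real Intel Trust Authority call (assumption).
async function runDeferredItaSketch(
  verifiers: QuoteSource[],
  verifyWithIta: (quoteHex: string) => Promise<unknown>,
): Promise<ItaOutcome[]> {
  // Per-verifier eligibility: a verifier qualifies only if it produced a quote.
  const eligible = verifiers.filter(
    (v): v is QuoteSource & { lastQuoteHex: string } =>
      typeof v.lastQuoteHex === 'string',
  )

  // allSettled never rejects, so one failed ITA call cannot cancel the others.
  const settled = await Promise.allSettled(
    eligible.map((v) => verifyWithIta(v.lastQuoteHex)),
  )

  // Results come back in input order, so index i maps to eligible[i].
  return settled.map((s, i): ItaOutcome => {
    const name = eligible[i].name
    return s.status === 'fulfilled'
      ? { name, status: 'ok', result: s.value }
      : { name, status: 'failed', reason: String(s.reason) }
  })
}
```

Passing the ITA call in as a parameter keeps the sketch free of network access; a verifier without a quote is simply left out of the result list.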

Changes

  • AcmeInfoSchema — aligned with upstream proto, .transform() derives hist_keys
  • GatewayInfoSchema — new schema for /.dstack/info response
  • GatewayVerifier — added getBaseDomain(), cached getAcmeInfo() and getBaseDomain() to avoid redundant HTTP calls
  • KmsVerifier / PhalaCloudVerifier — removed inline ITA, exposed lastQuoteHex
  • verifierChain.ts — added runDeferredIta() with parallel execution and per-verifier eligibility
  • DataObjectCollector — added updateObjectFields() for partial field updates
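
The caching mentioned for `GatewayVerifier` could look something like the sketch below. It is an assumption-laden variation: the review snippet in this PR suggests the verifier caches the resolved value (`cachedBaseDomain`), whereas this hypothetical `cachedAsync` helper caches the in-flight promise so that concurrent callers also share one request.

```typescript
// Hypothetical helper, not part of the PR: memoizes an async fetcher.
function cachedAsync<T>(fetcher: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined
  return () => {
    if (cached === undefined) {
      cached = fetcher().catch((err: unknown) => {
        // Do not cache failures, so a later call can retry.
        cached = undefined
        throw err
      })
    }
    return cached
  }
}
```

Wrapping a fetcher once per verification run, for example `const getAcmeInfo = cachedAsync(fetchAcmeInfo)` with a hypothetical `fetchAcmeInfo`, would then issue at most one successful HTTP call regardless of how many checks read the result.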

Test plan

  • Verify new gateways (no active_cert/base_domain) no longer throw ACME parse errors
  • Verify old gateways (with legacy fields) remain compatible
  • Confirm ITA runs only after all verification steps complete
  • Confirm ITA is skipped on hard errors but still runs per-verifier when only some verifiers have failures
  • Confirm ITA results are correctly written to CPU DataObject intel_trust_authority field
  • Confirm warning is logged if CPU DataObject is missing when ITA tries to update
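
The last two test-plan items rely on a partial update that reports whether the target object exists. A minimal sketch of that behavior, assuming `updateObjectFields()` returns a boolean (the PR only says its return value is checked) and using hypothetical `CollectorSketch`, `DataObject`, and `attachItaResult` names:

```typescript
// Hypothetical data shape; the real DataObject type is not shown in the PR.
interface DataObject {
  id: string
  fields: Record<string, unknown>
}

class CollectorSketch {
  private readonly objects = new Map<string, DataObject>()

  add(obj: DataObject): void {
    this.objects.set(obj.id, obj)
  }

  get(id: string): DataObject | undefined {
    return this.objects.get(id)
  }

  // Merges `patch` into an existing object's fields.
  // Returns false when no object with that id exists (assumed contract).
  updateObjectFields(id: string, patch: Record<string, unknown>): boolean {
    const existing = this.objects.get(id)
    if (existing === undefined) return false
    this.objects.set(id, { ...existing, fields: { ...existing.fields, ...patch } })
    return true
  }
}

// Writes the ITA result to the CPU object, warning instead of dropping it silently.
function attachItaResult(
  collector: CollectorSketch,
  cpuObjectId: string,
  itaResult: unknown,
): boolean {
  const updated = collector.updateObjectFields(cpuObjectId, {
    intel_trust_authority: JSON.stringify(itaResult),
  })
  if (!updated) {
    console.warn(`[VerifierChain] CPU DataObject ${cpuObjectId} not found; ITA result not stored`)
  }
  return updated
}
```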

🤖 Generated with Claude Code

Comment on lines +281 to +285
    collector.updateObjectFields(cpuObjectId, {
      intel_trust_authority: JSON.stringify(itaResult),
    })
    console.log(`[VerifierChain] ITA result added to ${cpuObjectId}`)
  }

This comment was marked as outdated.

Comment on lines +309 to +311
    if (this.cachedBaseDomain) {
      return this.cachedBaseDomain
    }

This comment was marked as outdated.

Comment on lines +178 to +184
    .transform((data) => ({
      ...data,
      // Derive hist_keys from quoted_hist_keys if not provided by legacy API
      hist_keys:
        data.hist_keys ??
        data.quoted_hist_keys.map((k) => k.public_key),
    }))

This comment was marked as outdated.
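
The effect of the transform in the snippet above can be illustrated without Zod. The function below is a dependency-free stand-in, not the PR's `AcmeInfoSchema`: the field names `hist_keys`, `quoted_hist_keys`, `public_key`, `active_cert`, and `base_domain` come from this PR, while the `RawAcmeInfo` and `NormalizedAcmeInfo` types and the `normalizeAcmeInfo` name are assumptions.

```typescript
// Only `public_key` is modeled; other per-key fields are omitted (assumption).
interface QuotedHistKey {
  public_key: string
}

// Accepts both the legacy response (with hist_keys) and the newer one (without).
interface RawAcmeInfo {
  hist_keys?: string[]
  quoted_hist_keys: QuotedHistKey[]
  active_cert?: string
  base_domain?: string
}

type NormalizedAcmeInfo = RawAcmeInfo & { hist_keys: string[] }

// Mirrors the transform: keep legacy hist_keys if present, else derive them.
function normalizeAcmeInfo(data: RawAcmeInfo): NormalizedAcmeInfo {
  return {
    ...data,
    hist_keys: data.hist_keys ?? data.quoted_hist_keys.map((k) => k.public_key),
  }
}
```

Downstream code can then read `hist_keys` as a plain array without caring which gateway version produced the response.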

@kingsleydon kingsleydon changed the title from "fix(verifier): align ACME with upstream proto & defer ITA" to "fix(verifier): align ACME with upstream proto & defer ITA to final step" Apr 3, 2026
kingsleydon and others added 3 commits April 2, 2026 19:59
… final step

- Update AcmeInfoSchema to match upstream gateway proto: remove required
  active_cert/base_domain (legacy optional), add account_attestation and
  quoted_hist_keys[].attestation, derive hist_keys from quoted_hist_keys
- Add GatewayInfoSchema and getBaseDomain() to fetch base_domain from
  /.dstack/info endpoint when not available in legacy ACME response
- Extract ITA from verifyHardware() in all 3 verifiers (KMS, Gateway, App)
  and run it as the final step in executeVerifiers via runDeferredIta()
- Skip ITA entirely if prior verification steps produced errors (early drop)
- Add DataObjectCollector.updateObjectFields() for partial field updates
  to inject ITA results into existing CPU DataObjects

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Cache getAcmeInfo() and getBaseDomain() results to avoid redundant
  HTTP calls during a single verification run (6-8 calls → 1 each)
- ITA skip condition now checks both errors AND failures, not just errors
- Parallelize ITA calls across verifiers with Promise.allSettled()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- ITA now runs independently per verifier based on whether it produced a
  quote (lastQuoteHex), not on global zero-failure condition. A failure
  in Gateway verification no longer blocks ITA for KMS/App verifiers.
- Only skip ITA entirely on hard errors (thrown exceptions in the chain)
- Check updateObjectFields() return value and warn if CPU DataObject
  is missing instead of silently dropping the ITA result

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Comment on lines +175 to +176
    active_cert: z.string().optional(),
    base_domain: z.string().optional(),

Bug: The active_cert field was made optional, but the verifyCertificateKey function doesn't handle its absence, causing verification to always fail for new gateways that don't provide this field.
Severity: HIGH

Suggested Fix

Update the verification logic to handle cases where active_cert is not provided. A conditional check should be added to skip the verifyCertificateKey step if acmeInfo.active_cert is absent, as this is a valid state for new gateways.
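
One way the suggested conditional could look is sketched below. The real signature of `verifyCertificateKey` is not shown in this PR, so it is injected here as a hypothetical callback, and `CheckResult` and `checkCertificateKey` are illustrative names.

```typescript
// Three-state result so an absent field is distinguishable from a real failure.
type CheckResult =
  | { status: 'passed' }
  | { status: 'failed'; reason: string }
  | { status: 'skipped'; reason: string }

function checkCertificateKey(
  acmeInfo: { active_cert?: string },
  // Hypothetical signature; stands in for the existing verification logic.
  verifyCertificateKey: (activeCert: string) => boolean,
): CheckResult {
  // New gateways legitimately omit active_cert: skip instead of failing.
  if (acmeInfo.active_cert === undefined) {
    return { status: 'skipped', reason: 'active_cert not provided by gateway' }
  }
  return verifyCertificateKey(acmeInfo.active_cert)
    ? { status: 'passed' }
    : { status: 'failed', reason: 'certificate key mismatch' }
}
```

Reporting `skipped` separately keeps the absence of the field visible in results rather than folding it into a pass.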

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: packages/verifier/src/schemas.ts#L175-L176

Potential issue: The PR updated the ACME info schema to make the `active_cert` field
optional to support new gateways. However, the verification logic in the
`verifyCertificateKey` function was not updated to handle this change. The function
still requires `active_cert` to be present and returns an error if it is `undefined`. As
a result, when the system processes information from a new gateway that legitimately
omits the `active_cert`, the `verifyCertificateKey` function will incorrectly fail,
causing the entire certificate key verification to fail for these valid new gateways.

…eful degradation

- Remove global chainErrors gate from runDeferredIta — each verifier's
  ITA eligibility is determined solely by whether it produced a quote
- Skip certificateKey verification gracefully on new gateways where
  active_cert is absent, instead of producing a false failure
- Wrap getAcmeInfo() in try-catch within getBaseDomain() so ACME
  endpoint failure does not block /.dstack/info fallback
- Include verifier name in ITA failure logs for easier debugging

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
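
The `getBaseDomain()` fallback described in this commit can be sketched as follows. This is an illustration under assumptions, not the merged code: the two fetchers are injected as hypothetical callbacks, and the `/.dstack/info` response is assumed to carry a `base_domain` string.

```typescript
// Hypothetical dependencies; the real methods live on GatewayVerifier.
interface BaseDomainSources {
  getAcmeInfo: () => Promise<{ base_domain?: string }>
  getGatewayInfo: () => Promise<{ base_domain: string }> // assumed /.dstack/info shape
}

async function resolveBaseDomain(sources: BaseDomainSources): Promise<string> {
  try {
    // Legacy gateways expose base_domain in the ACME info response.
    const acme = await sources.getAcmeInfo()
    if (acme.base_domain) return acme.base_domain
  } catch (err) {
    // An ACME endpoint failure must not block the fallback below.
    console.warn('ACME info unavailable, falling back to /.dstack/info', err)
  }
  // Newer gateways: read base_domain from the info endpoint instead.
  const info = await sources.getGatewayInfo()
  return info.base_domain
}
```

A failure of the fallback itself still propagates to the caller, since at that point no source of `base_domain` remains.
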
@kingsleydon kingsleydon merged commit 68c41a2 into main Apr 3, 2026
4 checks passed