Skip to content

feat: native A2A protocol support (Phase 8)#4

Open
OlaCryto wants to merge 5 commits intoiJaack:mainfrom
OlaCryto:a2a-phase8
Open

feat: native A2A protocol support (Phase 8)#4
OlaCryto wants to merge 5 commits intoiJaack:mainfrom
OlaCryto:a2a-phase8

Conversation

@OlaCryto
Copy link
Copy Markdown
Contributor

Summary

  • A2A Client (src/interop/a2a.ts): Fetch agent cards, resolve via ERC-8004, submit/poll/stream/cancel tasks
  • A2A Server (src/interop/a2a-server.ts): Expose evalanche skills as A2A-compliant endpoints with auto-generated agent cards
  • Economy Adapters (src/interop/a2a-adapters.ts): Bridge A2A skills → discovery, agent cards → registrations, tasks → negotiation proposals, failures → escrow refunds
  • 6 MCP tools: fetch_agent_card, a2a_list_skills, a2a_submit_task, a2a_get_task, a2a_cancel_task, a2a_serve
  • 49 new tests across 3 test files (421 total passing)

SDK primitives only — no contracts, no deployment, no infrastructure changes.

Test plan

  • npx vitest run — 421 tests passing (27 files, 0 failures)
  • npx tsc --noEmit — clean type-check
  • Manual: fetch a real A2A agent card from a running agent
  • Manual: spin up a2a_serve and submit tasks from another agent

A2A client for agent card discovery and full task lifecycle (submit, poll, stream, cancel).
A2A server to expose evalanche skills as A2A-compliant endpoints.
Economy adapters bridging A2A concepts to discovery, negotiation, and settlement.
6 new MCP tools: fetch_agent_card, a2a_list_skills, a2a_submit_task, a2a_get_task, a2a_cancel_task, a2a_serve.
49 new tests across 3 test files (421 total passing).
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc24a3e9df

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +309 to +311
parts: [{ type: 'text', text: result.text ?? 'Task completed' }],
});
task.status = 'completed';
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve canceled status after skill handler finishes

After /tasks/{id}/cancel marks a task as canceled, the async _executeSkill path still unconditionally sets task.status = 'completed' when the handler resolves. This means long-running tasks can flip from canceled back to completed and publish artifacts/messages, so cancellation is not actually reliable for callers that depend on terminal state.

Useful? React with 👍 / 👎.

Comment on lines +141 to +144
this._handleRequest(req, res).catch(() => {
if (!res.headersSent) {
res.writeHead(500, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ error: 'Internal server error' }));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return 4xx for request validation failures

The server wraps all _handleRequest errors into HTTP 500 responses, including client-side validation errors such as missing skill_id or unknown skills thrown by _handleSubmitTask. In those common bad-input scenarios, clients get a misleading internal-server-error response instead of a 4xx, which can trigger incorrect retry behavior and obscure actionable feedback.

Useful? React with 👍 / 👎.

… validation errors

Address Codex review:
- P1: _executeSkill now checks for terminal state before overwriting status
- P2: _handleSubmitTask validation errors (missing/unknown skill_id) return 400 not 500
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 94309c45b8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

skills,
defaultInputModes: ['text'],
defaultOutputModes: ['text'],
supportsStreaming: this._options.supportsStreaming ?? false,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Don't advertise streaming support without a stream endpoint

When supportsStreaming is set in A2AServerOptions, getAgentCard() advertises supportsStreaming: true, but _handleRequest does not implement GET /tasks/{id}/stream; the server returns 404 for that route. In deployments that enable this flag, clients that trust the card (including A2AClient.streamTask) will reliably fail at runtime despite the capability being advertised.

Useful? React with 👍 / 👎.

const raw = Buffer.concat(chunks).toString('utf-8');
resolve(raw ? JSON.parse(raw) : {});
} catch {
reject(new Error('Invalid JSON body'));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Return 400 for malformed JSON request bodies

_parseBody rejects malformed JSON with a generic Error, but the top-level request handler only maps EvalancheErrorCode.A2A_ERROR to HTTP 400, so malformed client payloads currently fall through as HTTP 500. This misclassifies bad input as a server fault and can trigger incorrect retry behavior for callers sending invalid JSON.

Useful? React with 👍 / 👎.

});
});

this._server.listen(port);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Handle server listen errors before returning from listen()

listen() starts the HTTP server without attaching an 'error' listener, so startup failures like EADDRINUSE/EACCES become unhandled server errors. In practice (for example when another process already owns the port), this can crash the process instead of returning a controlled failure to the caller.

Useful? React with 👍 / 👎.

- P1: never advertise supportsStreaming (no stream endpoint implemented)
- P2: malformed JSON bodies now return 400 instead of 500
- P2: add error listener on server.listen() for EADDRINUSE/EACCES
- Remove misleading supportsStreaming option from A2AServerOptions
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1f912186ba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +211 to +214
if (url === '/tasks' && method === 'POST') {
const body = await this._parseBody(req);
const task = await this._handleSubmitTask(body);
res.writeHead(201, { ...cors, 'Content-Type': 'application/json' });
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Enforce auth config before accepting A2A task calls

The server advertises authentication in the agent card, but the task submission path still executes _handleSubmitTask without validating request credentials. In deployments that rely on A2AServerOptions.authentication for access control, unauthenticated clients can still invoke paid or sensitive skills, so the advertised auth requirement is not actually enforced.

Useful? React with 👍 / 👎.

Comment on lines +133 to +137
artifacts: task.artifacts.map((a) => ({
name: a.name,
mimeType: a.mimeType,
text: a.text,
uri: a.uri,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve binary artifact data in completion mapping

mapTaskCompletion strips each artifact down to name, mimeType, text, and uri, dropping the data payload from A2AArtifact. Any task that returns binary/base64 artifacts will lose its output content at this adapter boundary, which can break downstream settlement/result consumers that need the full artifact.

Useful? React with 👍 / 👎.

Comment on lines +153 to +155
this._server.on('error', (err) => {
throw new EvalancheError(
`A2A server failed to start: ${err.message}`,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Handle listen errors without throwing from event callback

Fresh evidence relative to the prior thread: an 'error' listener is now attached, but it immediately throws inside the callback. When Node emits startup/runtime server errors like EADDRINUSE, that throw is uncaught in the event loop and can still terminate the process instead of surfacing a controlled failure path to callers of listen().

Useful? React with 👍 / 👎.

…listen()

Address Codex review round 3:
- P1: reject unauthenticated requests on /tasks when authentication configured
- P1: listen() returns Promise, rejects on EADDRINUSE/EACCES (no uncaught throw)
- P2: mapTaskCompletion now preserves binary artifact data field
- Agent card still served without auth (public discovery)
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d1e56c7426

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

}

// Submit task
if (url === '/tasks' && method === 'POST') {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Match task routes on pathname, not raw req.url

Task routing compares req.url directly (exact '/tasks' and regexes), but req.url includes the query string. When authentication.in is 'query', authenticated calls like POST /tasks?api_key=... pass _checkAuth and then miss every task route, returning 404 instead of executing; the same mismatch affects /tasks/{id} and /tasks/{id}/cancel with query params.

Useful? React with 👍 / 👎.

Accept: 'application/json',
};
if (options.auth) {
headers['Authorization'] = options.auth;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Honor auth scheme fields instead of hardcoding Authorization

The client always sends credentials as the Authorization header, but agent cards can declare auth in query params or under a different header name (authentication.in / authentication.name). This makes submitTask/getTask/streamTask/cancelTask fail against valid A2A agents that require non-Authorization credentials, even when callers provide auth.

Useful? React with 👍 / 👎.

buffer = lines.pop() ?? '';

for (const line of lines) {
if (line.startsWith('data: ')) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Parse SSE data lines without requiring a trailing space

The stream parser only accepts lines starting with data: (with a space), but SSE allows data: without the extra space. For compliant servers emitting data:{...}, onUpdate is never called and task streaming appears broken even though the server is sending valid events.

Useful? React with 👍 / 👎.

…ata: without space

Address Codex review round 4:
- P1: strip query string from req.url before route matching (fixes query-based auth)
- P2: client _applyAuth honors authPlacement (header name, query param) from agent card
- P2: SSE parser accepts both 'data: {...}' and 'data:{...}' formats
@iJaack
Copy link
Copy Markdown
Owner

iJaack commented Mar 30, 2026

mc-review summary

Verdict: close — Closed as spam
Type: feature | Scope: major

Tests

Tests found: ✅ yes
Tests run: ✅ passed

Note: package.json defines test as vitest run; the first sandboxed run failed on EPERM temp/cache directory writes, but rerunning outside the read-only sandbox passed with 27 test files and 428 tests passing.

Quality

Issues (medium): src/interop/a2a-server.ts treats configured authentication as presence-only: _checkAuth() accepts any non-empty header/query value, so protected /tasks endpoints can be bypassed with arbitrary credentials instead of verifying a token or delegating to a validator., src/interop/a2a.ts does not consistently follow the repo’s typed error pattern for task responses: submitTask(), getTask(), and cancelTask() call response.json() without wrapping parse failures, and _parseTask() silently defaults missing fields (id, status, messages, artifacts) instead of rejecting malformed A2A payloads., src/interop/a2a.ts implements only a partial SSE parser: _processSSEStream() treats each data: line as standalone JSON, ignores multi-line/event-delimited SSE frames, and suppresses parser/runtime errors, which can drop valid streamed updates and make failures invisible., src/interop/a2a-server.ts buffers request bodies without any size limit in _parseBody(), which is a small DoS/perf risk for an exposed HTTP server.
⚠️ Security: Configured A2A auth is not real credential validation; any non-empty header/query parameter passes., The A2A HTTP server has no request body size cap before JSON parsing.

Recommendation

🚫 Close this PR

@iJaack
Copy link
Copy Markdown
Owner

iJaack commented Mar 30, 2026

mc-review summary

Verdict: close — Closed as spam
Type: feature | Scope: major

Tests

Tests found: ✅ yes
Tests run: ✅ passed

Note: The repo already had relevant tests, so no new stubs were needed; I had to install dependencies first and rerun npm test with write access because sandboxed Vitest could not create its temp/cache directories.

Quality

Issues (high): src/interop/a2a-server.ts:350-366 treats authentication as "header/query param exists" instead of validating a configured credential value, so any non-empty token bypasses protection on /tasks endpoints., src/interop/a2a-server.ts:304 and 372-379 retain every task forever and buffer the entire request body with no size limit, which creates an easy memory-exhaustion/DoS path for the new HTTP server., src/mcp/server.ts:961-997 and 1953-1984 only expose an auth string, but the underlying A2AClient supports custom header/query placement; agents advertising apiKey or query-param auth in their card cannot be called through these MCP tools., src/interop/a2a.ts:221-253,337-338,407-416 blindly casts untrusted remote JSON into A2ATask and does not wrap JSON parse failures on task endpoints, so malformed responses can leak raw SyntaxErrors or hand invalid task shapes to downstream callers., src/interop/a2a-adapters.ts:99-110 drops input and card entirely when creating a negotiation proposal, so different requests for the same skill collapse to the same task value (a2a:<skillId>) and lose execution context.
⚠️ Security: Authentication bypass risk: configured A2A auth only checks presence of a credential, not whether it matches an expected value., Denial-of-service risk: the A2A server accepts unbounded request bodies and stores tasks indefinitely in memory., Untrusted remote A2A payloads are only shallowly validated before being treated as typed internal objects.

Recommendation

🚫 Close this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants