Enhance Multimodal Support in LLM Workflow and Update Documentation by eanzhao · Pull Request #45 · aevatarAI/aevatar

eanzhao · 2026-03-16T07:07:58Z

Expanded the LLM workflow to support multimodal input and output, including text, images, audio, and video.
Updated the ChatRequestEvent and ChatResponseEvent to include input_parts and output_parts for handling diverse content types.
Introduced new ContentPart and MediaContentEvent classes to encapsulate various media types and their properties.
Refactored the ChatRuntime and RoleGAgent to process and emit multimodal content effectively.
Enhanced documentation to reflect the new capabilities and provide clear guidelines for using multimodal features in workflows.


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f5a50b5e50


chatgpt-codex-connector · 2026-03-16T07:14:45Z

src/Aevatar.AI.Core/Tools/ToolCallLoop.cs

+                TryGetStringByKeys(root, "image_base64", "imageBase64") ??
+                TryGetNestedMediaBase64(root, "image");


Restore legacy base64 aliases in tool media parser

This parser no longer checks the previously supported root aliases `base64`/`data` when extracting image payloads, so tool outputs that still return `{ "base64": "..." }` will now fail `TryExtractToolContentParts` and be sent back as plain-text JSON instead of multimodal `ContentPart`s. That regresses existing tool-call flows that relied on the old schema and causes image outputs to be dropped from subsequent LLM turns.
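A minimal sketch of the fallback this comment asks for, assuming the legacy aliases are plain string properties at the root of the tool output. `ToolMediaSketch` and `ExtractImageBase64` are hypothetical names, and this `TryGetStringByKeys` is a stand-in written for illustration; the real helper in `ToolCallLoop.cs` may behave differently. The nested `image` lookup from the diff is left out, and trying the legacy aliases last is also an assumption.

```csharp
using System.Text.Json;

// Hypothetical illustration only; not the code from ToolCallLoop.cs.
public static class ToolMediaSketch
{
    // Stand-in for the parser's key lookup: returns the first non-empty
    // string property found under any of the given keys, or null.
    public static string? TryGetStringByKeys(JsonElement root, params string[] keys)
    {
        if (root.ValueKind != JsonValueKind.Object)
            return null;

        foreach (var key in keys)
        {
            if (root.TryGetProperty(key, out var value) &&
                value.ValueKind == JsonValueKind.String)
            {
                var text = value.GetString();
                if (!string.IsNullOrWhiteSpace(text))
                    return text;
            }
        }

        return null;
    }

    // Tries the new keys first, then falls back to the legacy root aliases
    // ("base64", "data") so older tool outputs still yield an image payload.
    public static string? ExtractImageBase64(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;

        return TryGetStringByKeys(root, "image_base64", "imageBase64")
            ?? TryGetStringByKeys(root, "base64", "data");
    }
}
```

With a fallback of this shape, a legacy payload such as `{ "base64": "..." }` resolves to an image payload instead of falling through to the plain-text path, while payloads that use the new keys keep taking precedence.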


chatgpt-codex-connector · 2026-03-16T07:14:45Z

src/workflow/Aevatar.Workflow.Infrastructure/CapabilityApi/ChatRunRequestNormalizer.cs

+            if (!TryParseContentPartKind(part.Type, out var kind))
+                continue;


Reject chat requests when all inputParts are unsupported

Unsupported `inputParts` are silently skipped here, and the normalizer still returns success even if every part is dropped. Combined with the new prompt-or-`inputParts` gate in `ChatEndpoints` and the WebSocket parser (which only checks the raw part count), requests like `inputParts:[{"type":"foo"}]` now get accepted and execute with an empty derived prompt. This should be treated as `INVALID_PROMPT` (or an invalid part type) instead of dispatching a blank run.
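One way the suggested check could look, as a sketch under stated assumptions: unsupported parts are still skipped, but the request is rejected when no prompt and no supported part remain. `NormalizerSketch`, `InputPart`, `ContentPartKind`, and the `TryNormalize` signature are hypothetical and do not mirror the real `ChatRunRequestNormalizer`; only the `INVALID_PROMPT` code and the `TryParseContentPartKind` name come from the review.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; the real request model may differ.
public enum ContentPartKind { Text, Image, Audio, Video }

public sealed record InputPart(string? Type, string? Text);

public static class NormalizerSketch
{
    // Accepts only named enum members; numeric strings such as "7" are rejected.
    public static bool TryParseContentPartKind(string? type, out ContentPartKind kind)
        => Enum.TryParse(type, ignoreCase: true, out kind)
           && Enum.IsDefined(typeof(ContentPartKind), kind);

    public static bool TryNormalize(
        string? prompt,
        IReadOnlyList<InputPart> parts,
        out List<(ContentPartKind Kind, string? Text)> accepted,
        out string? errorCode)
    {
        accepted = new List<(ContentPartKind Kind, string? Text)>();
        errorCode = null;

        foreach (var part in parts)
        {
            // Unsupported parts are still skipped, as in the diff.
            if (!TryParseContentPartKind(part.Type, out var kind))
                continue;

            accepted.Add((kind, part.Text));
        }

        // Validate what survived normalization, not the raw part count.
        if (string.IsNullOrWhiteSpace(prompt) && accepted.Count == 0)
        {
            errorCode = "INVALID_PROMPT";
            return false;
        }

        return true;
    }
}
```

The design point is that the gate runs after filtering, so `inputParts:[{"type":"foo"}]` with no prompt fails validation, while a request that mixes supported and unsupported parts still goes through with the supported ones.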


- Added a new CI job for testing and building the console-web application.
- Updated the `console_web` output in the CI workflow to include relevant paths.
- Introduced a new environment variable `AEVATAR_CONSOLE_PUBLIC_PATH` for configuring deployment paths.
- Refactored the public path resolution logic in the console-web configuration.
- Removed deprecated enriched graph API and related decoding logic from the console API.
- Updated authentication configuration to disable NyxID login when required environment variables are missing.

- Added type-checking step for console-web in the CI workflow.
- Removed redundant pnpm setup step to streamline the workflow.
- Updated architecture scorecard documentation to reflect successful compliance with all architecture guards.
- Fixed naming issues and improved clarity in documentation regarding project structure and CI pipeline integration.

…multimodal-llm-support
# Conflicts:
#   apps/aevatar-console-web/src/app.tsx
#   apps/aevatar-console-web/src/pages/actors/index.tsx
#   apps/aevatar-console-web/src/pages/observability/index.tsx
#   apps/aevatar-console-web/src/pages/overview/index.tsx
#   apps/aevatar-console-web/src/pages/playground/index.test.tsx
#   apps/aevatar-console-web/src/pages/playground/index.tsx
#   apps/aevatar-console-web/src/pages/primitives/index.tsx
#   apps/aevatar-console-web/src/pages/runs/index.tsx
#   apps/aevatar-console-web/src/pages/settings/index.tsx
#   apps/aevatar-console-web/src/pages/studio/components/StudioShell.test.tsx
#   apps/aevatar-console-web/src/pages/studio/index.test.tsx
#   apps/aevatar-console-web/src/pages/workflows/index.tsx
#   apps/aevatar-console-web/src/pages/yaml/index.test.tsx
#   apps/aevatar-console-web/src/pages/yaml/index.tsx
#   apps/aevatar-console-web/src/shared/api/consoleApi.ts
#   apps/aevatar-console-web/src/shared/api/decoders.ts

- Created WORKFLOW.md to define project workflow, including issue tracking, execution flow, and verification expectations.
- Added start-local.sh script to set up the local development environment, ensuring required commands are available and configuring necessary environment variables.

chatgpt-codex-connector bot reviewed Mar 16, 2026


eanzhao changed the base branch from docs/2026-03-14_gagent-service-phase-1-design to dev March 21, 2026 06:05

eanzhao added 5 commits March 21, 2026 14:18

Merge origin/dev and fix multimodal chat integration

92ed94f

eanzhao wants to merge 6 commits into dev from codex/feat/2026-03-16_multimodal-llm-support
