feat: add image sending and receiving support to all claw channels#58

Open
varshinivij wants to merge 8 commits into aristoteleo:main from varshinivij:main

@varshinivij

Wire end-to-end image support through the claw messaging gateway. Inbound images from all 7 channels (Telegram, Discord, Slack, WeChat, Feishu, QQ, iMessage) are now downloaded, encoded as base64 data-URIs, and sent to the chatroom as multimodal _llm_content. Outbound images from tool results (e.g. Python plots) are collected via a new image-aware step callback and delivered back to users through each platform's native image API. Includes 14 new tests covering the full round-trip pipeline.
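
For readers unfamiliar with the inbound path described above, here is a minimal sketch of the "encode as base64 data-URI, then wrap as multimodal content" step. The function names are hypothetical, and the OpenAI-style content-part layout is an assumption; the actual shape of `_llm_content` in this PR may differ.

```python
import base64
from typing import Dict, List


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data-URI (RFC 2397 form)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_multimodal_content(text: str, image_uris: List[str]) -> List[Dict]:
    """Combine text and image data-URIs into a list of content parts.

    Assumption: OpenAI-style parts; not confirmed by the PR itself.
    """
    parts: List[Dict] = [{"type": "text", "text": text}]
    for uri in image_uris:
        parts.append({"type": "image_url", "image_url": {"url": uri}})
    return parts
```

A channel handler would presumably call `to_data_uri` on each downloaded attachment and pass the result, together with the message text, to something like `build_multimodal_content`.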

@zqbake
Collaborator

zqbake commented Apr 2, 2026

The scope of these changes looks well-contained. Have you tested this across all channels? If testing is fully covered, we are good to merge.

@zqbake
Collaborator

zqbake commented Apr 3, 2026

Let me know if you think it's good to merge.


# ── Auto-start configured Claw channels ─────────────────────────────
try:
    from pantheon.claw import ClawConfigStore
Collaborator

Didn't pantheon claw already work without this?

@Starlitnightly
Collaborator

@varshinivij please fix all the tests so that we can merge your PR.

varshinivij and others added 8 commits April 5, 2026 09:38
Wire end-to-end image support through the claw messaging gateway.
Inbound images from all 7 channels (Telegram, Discord, Slack, WeChat,
Feishu, QQ, iMessage) are now downloaded, encoded as base64 data-URIs,
and sent to the chatroom as multimodal _llm_content. Outbound images
from tool results (e.g. Python plots) are collected via a new image-aware
step callback and delivered back to users through each platform's native
image API. Includes 14 new tests covering the full round-trip pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… send generated images

image_gen.py returned file paths but channel callbacks expected base64_uri in raw_content.
Added base64_uri + hidden_to_model to all three generation methods, matching the pattern
used by python_interpreter. Also fixed runtime to handle base64_uri as list, hoisted
notebook image URIs, and added claw auto-start support.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
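
The commit above mentions handling `base64_uri` as a list in the runtime. A minimal sketch of that kind of normalization follows, assuming `raw_content` is a dict whose `base64_uri` value may be a single string or a list of strings. The helper name `extract_image_uris` is hypothetical and not taken from the PR.

```python
from typing import Any, List


def extract_image_uris(raw_content: Any) -> List[str]:
    """Return image data-URIs from a tool result, accepting str or list.

    Assumption: raw_content is a dict with an optional "base64_uri" key.
    """
    if not isinstance(raw_content, dict):
        return []
    value = raw_content.get("base64_uri")
    if isinstance(value, str):
        candidates = [value]
    elif isinstance(value, (list, tuple)):
        candidates = list(value)
    else:
        candidates = []
    # Keep only well-formed image data-URIs; silently skip anything else.
    return [u for u in candidates if isinstance(u, str) and u.startswith("data:image/")]
```

Normalizing at one point like this would let every channel callback iterate over a list without checking the type itself.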
…preter to leader

- Fix _snapshot_images() to use _get_effective_workdir() so images created
  in workspace directories are detected and sent through claw channels
- Add python_interpreter toolset to leader agent in default team template

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ut directory

Instead of relying on each toolset to set base64_uri individually, add a
chatroom-level pre/post snapshot that catches images saved to a designated
.pantheon/images/ directory. Agents are instructed via system prompt to save
all generated images there, keeping the scan cheap (single flat directory).

- Add shared image_detection utility (snapshot, diff, encode)
- Chatroom.chat() creates .pantheon/images/, snapshots before/after execution,
  and emits synthetic step messages for newly detected images
- Shell toolset: remove Python-only guard so all commands trigger detection,
  refactored to use shared utility
- Claw runtime: deduplicate images in make_image_step_callback
- Agent: inject <image_output_constraint> when image_output_dir is set

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
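
The snapshot, diff, and encode steps described in this commit can be sketched as below. This is an illustration under stated assumptions, not the PR's `image_detection` module: all function names are hypothetical, change detection by `(mtime, size)` is a guess at the mechanism, and so is de-duplicating by exact URI match.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Dict, List, Tuple

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
Snapshot = Dict[str, Tuple[int, int]]


def snapshot_images(directory: Path) -> Snapshot:
    """Map file name -> (mtime_ns, size) for images in one flat directory."""
    if not directory.is_dir():
        return {}
    result: Snapshot = {}
    for entry in directory.iterdir():
        if entry.is_file() and entry.suffix.lower() in IMAGE_SUFFIXES:
            stat = entry.stat()
            result[entry.name] = (stat.st_mtime_ns, stat.st_size)
    return result


def diff_snapshots(before: Snapshot, after: Snapshot) -> List[str]:
    """Names of files that are new or whose (mtime, size) changed."""
    return sorted(name for name, sig in after.items() if before.get(name) != sig)


def encode_image(path: Path) -> str:
    """Read an image file and return it as a base64 data-URI."""
    mime, _ = mimetypes.guess_type(path.name)
    data = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{data}"


def dedupe_preserving_order(uris: List[str]) -> List[str]:
    """Drop repeated URIs, e.g. when a toolset and the scan both report one."""
    seen = set()
    unique = []
    for uri in uris:
        if uri not in seen:
            seen.add(uri)
            unique.append(uri)
    return unique
```

Taking one snapshot before execution and one after, then encoding only the names returned by `diff_snapshots`, keeps the cost proportional to the size of a single flat directory.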
The chatroom-level .pantheon/images/ catch-all handles universal image
detection. Shell-level snapshot should stay targeted to Python commands
only to avoid unnecessary I/O on every shell invocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Downloaded images from Slack were passed as raw bytes to OpenAI without
format normalization, causing "unsupported image" errors. Now all Slack
images are re-encoded through PIL to guaranteed PNG/JPEG output. Also
improved error logging for image upload failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
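
The commit above re-encodes Slack images through PIL. The sketch below does not reproduce that step. It only illustrates a standard-library pre-check that sniffs the format from magic bytes, which could help log why a download was rejected (for example, when the bytes are not an image at all). Both function names are hypothetical.

```python
from typing import Optional


def sniff_image_mime(data: bytes) -> Optional[str]:
    """Guess an image MIME type from leading magic bytes, or return None."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "image/gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


def needs_reencoding(data: bytes) -> bool:
    """True when the bytes are not already recognizable PNG or JPEG."""
    return sniff_image_mime(data) not in ("image/png", "image/jpeg")
```

A check like this is cheap compared with decoding, but it only inspects headers. A truncated or corrupt file would still pass, which is one reason a full re-encode is the more robust choice.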
Add shared md_to_slack, md_to_telegram (MarkdownV2), and md_to_plain
converters in runtime.py. Apply per-channel: Slack gets mrkdwn, Telegram
gets MarkdownV2 with proper escaping and fallback, plain-text channels
(Feishu, WeChat, QQ, iMessage) get stripped markdown.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
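
To make the per-channel conversion above concrete, here is a rough sketch of two of the transformations: Markdown to Slack mrkdwn, and escaping for Telegram MarkdownV2. These are simplified stand-ins with hypothetical names, not the `md_to_slack` / `md_to_telegram` functions from `runtime.py`. In particular, the Slack sketch does not protect code blocks, which the real converters are described as preserving.

```python
import re


def md_to_slack_sketch(text: str) -> str:
    """Convert a small subset of Markdown to Slack mrkdwn."""
    # [label](url) -> <url|label>
    text = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r"<\2|\1>", text)
    # **bold** -> *bold*
    text = re.sub(r"\*\*(.+?)\*\*", r"*\1*", text)
    # ~~strike~~ -> ~strike~
    text = re.sub(r"~~(.+?)~~", r"~\1~", text)
    return text


def escape_markdown_v2(text: str) -> str:
    """Backslash-escape characters reserved by Telegram MarkdownV2.

    Intended for plain text segments only; escaping an already formatted
    string would also neutralize its formatting markers.
    """
    return re.sub(r"([_*\[\]()~`>#+\-=|{}.!\\])", r"\\\1", text)
```

For example, `md_to_slack_sketch("**bold** and [docs](https://example.com)")` yields `*bold* and <https://example.com|docs>`.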
30 new tests covering md_to_slack, md_to_telegram (MarkdownV2), and
md_to_plain: bold/italic/strikethrough conversion, link handling,
header rendering, list formatting, code block preservation, special
char escaping, edge cases (empty strings, plain text), and full
message round-trips.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
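
The kinds of cases listed above (formatting removal, link handling, headers, empty strings, plain text) can be illustrated with a simplified plain-text converter and a small case table. This is a hypothetical sketch, not the PR's `md_to_plain` or its test suite.

```python
import re


def md_to_plain_sketch(text: str) -> str:
    """Strip a small subset of Markdown for plain-text channels."""
    # [label](url) -> label (url), so the link target is not lost
    text = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r"\1 (\2)", text)
    # Drop leading '#' header markers on each line
    text = re.sub(r"^#{1,6}[ \t]+", "", text, flags=re.MULTILINE)
    # **bold** and ~~strike~~ -> inner text
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)
    text = re.sub(r"~~(.+?)~~", r"\1", text)
    # *italic* -> inner text; requires a non-space after the opening '*'
    # so that '* list item' bullets are left alone
    text = re.sub(r"(?<!\*)\*(\S.*?)\*(?!\*)", r"\1", text)
    return text


# (input, expected) pairs in the spirit of the edge cases named above
CASES = [
    ("", ""),
    ("plain text", "plain text"),
    ("# Title", "Title"),
    ("**bold** and *it*", "bold and it"),
    ("[a](http://x.y)", "a (http://x.y)"),
    ("* item one", "* item one"),
]

for source, expected in CASES:
    assert md_to_plain_sketch(source) == expected
```

Keeping the cases in a table makes it easy to add a regression case when a channel renders something unexpectedly.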

3 participants