Feature: improve long-task feedback and output visibility #7

@ttiee

Feature Request: Improve Long-Task Feedback and Output Visibility

Summary

Improve Telegram feedback for long-running Codex tasks so users can tell that work is progressing, understand when output is truncated, and get more actionable failure summaries.

Problem

The current bridge already streams progress and final output, but long tasks still feel opaque on Telegram:

  • Streamed content is updated in-place and effectively shows only the latest visible chunk.
  • Users are not clearly told when the live stream is truncated due to Telegram message size limits.
  • Progress text does not prominently communicate elapsed time or task stage from a user perspective.
  • Failure cases can end with limited context, which forces users to ask follow-up questions just to learn what went wrong.

This is especially noticeable for code generation, test runs, or repository analysis tasks that take long enough for users to wonder whether the bot is stuck.

Reproduction

  1. Run a prompt that produces sustained streaming output or a long tool-based workflow.
  2. Observe the progress message and the final message sequence in Telegram.
  3. Compare what the user can infer during execution versus what the service actually tracks internally.

Expected Behavior

During long tasks, the user should be able to tell:

  • that the job is still active
  • roughly how long it has been running
  • whether the visible live output is only a partial window
  • what the most recent meaningful update is
  • what failed, if the run ends unsuccessfully
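
A minimal sketch of what a progress message covering the first points could look like. The helper names `format_elapsed` and `render_progress`, and the three-line layout, are hypothetical and not taken from the current plugin code; they only illustrate the shape of the message.

```python
def format_elapsed(seconds: float) -> str:
    """Render a duration as a short human-readable string."""
    total = max(0, int(seconds))
    minutes, secs = divmod(total, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}h {minutes:02d}m {secs:02d}s"
    if minutes:
        return f"{minutes}m {secs:02d}s"
    return f"{secs}s"


def render_progress(stage: str, elapsed_s: float, last_update: str) -> str:
    """Build a progress summary: active stage, elapsed time, latest update.

    The layout is an assumption for illustration only.
    """
    return (
        f"Running: {stage}\n"
        f"Elapsed: {format_elapsed(elapsed_s)}\n"
        f"Latest: {last_update}"
    )
```

Keeping the summary to a few fixed lines means it can be edited in place without growing the message.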

Actual Behavior

The current feedback is functional, but it does not explicitly state elapsed time, stream truncation, or failure context. The gap is most visible when the stream exceeds one Telegram chunk.

Proposal

Improve the runtime feedback contract with these changes:

  • Add elapsed-time information to progress updates.
  • Make stream truncation explicit when only the latest chunk is visible.
  • Preserve a clearer distinction between:
    • progress summary
    • live output preview
    • final result
    • failure diagnostics
  • On non-zero exit, append a compact diagnostic summary that is optimized for immediate user action.
  • Optionally include a short completion footer such as duration and whether the final response came from full output or fallback diagnostics.
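
The truncation notice and the failure summary could be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's implementation: `tail_preview` and `failure_summary` are hypothetical names, the notice wording is made up, and the default limit of 4096 is Telegram's documented text-message limit, which may differ from the chunk size the bridge actually uses.

```python
TELEGRAM_TEXT_LIMIT = 4096  # Telegram text-message limit; the plugin's chunk size may be smaller


def tail_preview(stream_text: str, limit: int = TELEGRAM_TEXT_LIMIT) -> str:
    """Return the latest part of the stream, with an explicit truncation notice."""
    if len(stream_text) <= limit:
        return stream_text
    notice = (
        f"[output truncated, {len(stream_text)} chars total; "
        "showing the latest part]\n"
    )
    budget = limit - len(notice)
    if budget <= 0:
        # Limit too small for the notice: fall back to a plain tail.
        return stream_text[-limit:]
    return notice + stream_text[-budget:]


def failure_summary(exit_code: int, stderr_text: str, max_lines: int = 5) -> str:
    """Summarize a failed run using the last non-empty lines of error output."""
    lines = [ln.rstrip() for ln in stderr_text.splitlines() if ln.strip()]
    tail = lines[-max_lines:] if max_lines > 0 else []
    header = f"Failed with exit code {exit_code}."
    if not tail:
        return header + " No error output was captured."
    return header + "\nLast error lines:\n" + "\n".join(tail)
```

Reserving space for the notice inside the limit keeps the preview within one message, so the existing chunking limits stay untouched.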

Alternatives Considered

  • Keep only the current latest-chunk live preview.
    • Minimal effort, but users still cannot tell whether content was truncated or stalled.
  • Send every stream chunk as a new Telegram message.
    • More visible, but likely too noisy for active chats.

Scope and Constraints

  • Do not flood the chat with excessive progress messages.
  • Preserve the current chunking limits and Telegram-safe rendering behavior.
  • Keep the behavior compatible with both resume and exec modes.
  • Prefer incremental improvements over introducing external log storage.
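
One way to respect the no-flooding constraint is to throttle in-place edits of the progress message. The sketch below assumes a hypothetical `EditThrottle` helper and a 3-second minimum interval; neither comes from the existing code, and the right interval would need tuning against Telegram's rate limits.

```python
import time
from typing import Callable, Optional


class EditThrottle:
    """Decide whether a progress edit should be sent right now."""

    def __init__(
        self,
        min_interval_s: float = 3.0,  # assumed value, not from the plugin
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_sent: Optional[float] = None
        self._last_text: Optional[str] = None

    def should_send(self, text: str, force: bool = False) -> bool:
        """Return True if the edit should go out; record it when it does.

        Unchanged text and edits that arrive too soon are skipped unless
        force is set (for example, for the final result).
        """
        now = self._clock()
        if not force:
            if text == self._last_text:
                return False
            if (
                self._last_sent is not None
                and now - self._last_sent < self._min_interval_s
            ):
                return False
        self._last_sent = now
        self._last_text = text
        return True
```

The clock is injectable so the behavior can be tested without sleeping.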

Affected Areas

  • src/nonebot_plugin_codex/telegram.py
  • src/nonebot_plugin_codex/service.py
  • tests/test_service.py
  • tests/test_telegram_handlers.py

Verification Plan

  • Add regression tests for long stream preview behavior and truncation indication.
  • Add tests for failure summaries and completion metadata.
  • Run:
    • pdm run pytest -q
    • pdm run ruff check .
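
As a rough idea of what a completion-metadata test might look like, here is a pytest-style sketch. `completion_footer` is a hypothetical helper defined inline so the example is self-contained; the real tests would target whatever function the implementation ends up exposing in `service.py` or `telegram.py`.

```python
def completion_footer(duration_s: float, used_fallback: bool) -> str:
    """Hypothetical footer: run duration plus the source of the final response."""
    source = "fallback diagnostics" if used_fallback else "full output"
    return f"Finished in {duration_s:.1f}s (source: {source})"


def test_footer_reports_full_output():
    footer = completion_footer(12.34, used_fallback=False)
    assert footer == "Finished in 12.3s (source: full output)"


def test_footer_reports_fallback():
    footer = completion_footer(0.5, used_fallback=True)
    assert footer.startswith("Finished in 0.5s")
    assert "fallback diagnostics" in footer
```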

Labels: enhancement (New feature or request)
