vmm/task: fix stdout stream handoff and cleanup #229

Open
novahe wants to merge 1 commit into kuasar-io:main from novahe:fix-preemption-out
Contributor

@novahe novahe commented Mar 18, 2026

Background

This change is a follow-up to #202.

Observed issue: a short-lived command such as crictl exec -it xx ls would occasionally return with no output.

Summary

This change simplifies and hardens the vmm-task streaming handoff path for stdout/stderr, and tightens stdin read behavior for partial-buffer reads.

The main fixes are:

  • handle output client disconnects while a stream handoff is in progress
  • keep the preemption handoff path from stalling after the output sender has already closed
  • remove closed output channels only after they are fully drained
  • preserve unread stdin bytes across partial AsyncRead calls
  • rename the public helper from close_stdout to close_output to match its actual use for both stdout and stderr
flowchart TD
    S1[Created] -->|get_output| S2[Producer Attached]
    S1 -.->|preempt_receiver edge case| S3[Consumer Attached]
    S2 -->|preempt_receiver| S3

    S3 -->|stream_sender.send fails| S4[Buffered For Reattach]
    S3 -->|marshal fails, receiver returned| S2
    S4 -->|next handle_stdout flushes remaining_data first| S3

    S3 -->|new consumer arrives, preempt requested| S5[Preempt Requested]
    S5 -->|return_preempted_receiver remaining_data=None| S2
    S5 -->|return_preempted_receiver remaining_data=Some| S4

    S2 -->|close_output_channel sets sender_closed=true| S6[Producer Closed - Draining]
    S3 -->|close_output_channel sets sender_closed=true| S6
    S4 -->|close_output_channel sets sender_closed=true| S6

    S3 -->|client closes stream, remaining_data=None| S2
    S3 -->|receiver.recv returns empty vec| S7[Removed]
    S3 -->|receiver.recv returns None| S7

    S6 -->|sender_closed && remaining_data=None && receiver.is_closed && receiver.is_empty| S7
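
The edges of the diagram can also be read as a transition table. The following is a minimal Rust sketch of that reading, not code from the PR: the `State` and `Event` enums and the `transition` function are hypothetical names derived from the node and edge labels, and the real implementation tracks this state implicitly through IOChannel fields rather than an explicit enum.

```rust
/// Channel states, named after the nodes of the flowchart (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Created,
    ProducerAttached,
    ConsumerAttached,
    BufferedForReattach,
    PreemptRequested,
    ProducerClosedDraining,
    Removed,
}

/// One variant per edge label of the flowchart (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    GetOutput,
    PreemptReceiver,
    SendFailed,       // stream_sender.send fails
    MarshalFailed,    // marshal fails, receiver returned
    FlushedRemaining, // next handle_stdout flushes remaining_data first
    NewConsumer,      // new consumer arrives, preempt requested
    ReceiverReturned { has_remaining: bool },
    CloseOutput,      // close_output_channel sets sender_closed=true
    ClientClosed,     // client closes stream, remaining_data=None
    RecvEnded,        // receiver.recv returns an empty vec or None
    Drained,          // sender closed, nothing buffered, receiver closed and empty
}

/// Returns the next state, or None when the diagram has no such edge.
pub fn transition(state: State, event: Event) -> Option<State> {
    use Event::*;
    use State::*;
    match (state, event) {
        (Created, GetOutput) => Some(ProducerAttached),
        (Created | ProducerAttached, PreemptReceiver) => Some(ConsumerAttached),
        (ConsumerAttached, SendFailed) => Some(BufferedForReattach),
        (ConsumerAttached, MarshalFailed) => Some(ProducerAttached),
        (BufferedForReattach, FlushedRemaining) => Some(ConsumerAttached),
        (ConsumerAttached, NewConsumer) => Some(PreemptRequested),
        (PreemptRequested, ReceiverReturned { has_remaining: false }) => Some(ProducerAttached),
        (PreemptRequested, ReceiverReturned { has_remaining: true }) => Some(BufferedForReattach),
        (ProducerAttached | ConsumerAttached | BufferedForReattach, CloseOutput) => {
            Some(ProducerClosedDraining)
        }
        (ConsumerAttached, ClientClosed) => Some(ProducerAttached),
        (ConsumerAttached, RecvEnded) => Some(Removed),
        (ProducerClosedDraining, Drained) => Some(Removed),
        _ => None,
    }
}
```

Writing the table out this way makes it easy to see that Removed has no outgoing edges and that the only way out of the draining state is the fully drained condition.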

Problem

The previous streaming flow had a few edge cases that made handoff and cleanup fragile:

  1. handle_stdout only waited on the internal output receiver.
    If the client closed the stream first, the server could stay blocked waiting for output instead of returning the preempted receiver and finishing cleanup.

  2. Closed output channels were not tracked explicitly.
    This made it hard to distinguish between "sender is gone but buffered state still needs to be handed off" and "the channel is fully drained and can be safely removed".

  3. Receiver return and channel cleanup were too loosely coupled.
    In sender-closed scenarios, a waiting preemptor could be left waiting even though the receiver had already been returned.

  4. StreamingStdin assumed each received chunk could be consumed in one read.
    When the caller's read buffer was smaller than the queued chunk, unread bytes were not preserved explicitly for the next read.

Fix

Output handoff and cleanup

  • add a sender_closed flag to IOChannel
  • add should_remove_closed_channel() so channel removal happens only when:
    • the sender is closed
    • there is no remaining_data
    • the receiver is closed and fully drained
  • update return_preempted_receiver() to:
    • restore the receiver
    • decide whether the channel can now be removed
    • still notify any waiting preemptor
  • add close_output_channel() and expose it through close_output()
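
The removal rule above can be illustrated with a simplified, synchronous model. This is a sketch under stated assumptions, not the PR's code: `IoChannel`, `Registry`, `receiver_closed`, `queued`, `insert`, `contains`, and `pop_output` are hypothetical stand-ins (the real IOChannel holds an async receiver and lives behind an async mutex); only `sender_closed`, `remaining_data`, `should_remove_closed_channel`, and `close_output_channel` are names taken from the PR.

```rust
use std::collections::{HashMap, VecDeque};

/// Simplified stand-in for IOChannel; field types are assumptions.
#[derive(Default)]
pub struct IoChannel {
    pub sender_closed: bool,
    pub remaining_data: Option<Vec<u8>>,
    /// Models receiver.is_closed().
    pub receiver_closed: bool,
    /// Models the receiver's buffer; empty corresponds to receiver.is_empty().
    pub queued: VecDeque<Vec<u8>>,
}

impl IoChannel {
    /// True only when the sender is closed and nothing is left to hand off.
    pub fn should_remove_closed_channel(&self) -> bool {
        self.sender_closed
            && self.remaining_data.is_none()
            && self.receiver_closed
            && self.queued.is_empty()
    }
}

/// Hypothetical registry keyed by channel id.
#[derive(Default)]
pub struct Registry {
    ios: HashMap<String, IoChannel>,
}

impl Registry {
    pub fn insert(&mut self, id: &str, ch: IoChannel) {
        self.ios.insert(id.to_string(), ch);
    }

    pub fn contains(&self, id: &str) -> bool {
        self.ios.contains_key(id)
    }

    /// Consumes one buffered chunk, as a consumer draining the receiver would.
    pub fn pop_output(&mut self, id: &str) -> Option<Vec<u8>> {
        self.ios.get_mut(id)?.queued.pop_front()
    }

    /// Marks the sender closed and removes the channel only if it is drained.
    pub fn close_output_channel(&mut self, id: &str) {
        let remove = match self.ios.get_mut(id) {
            Some(ch) => {
                ch.sender_closed = true;
                ch.should_remove_closed_channel()
            }
            None => false,
        };
        if remove {
            self.ios.remove(id);
        }
    }
}
```

In this model, closing a channel that still has buffered output leaves it registered, so a late consumer can still read what a short-lived command wrote; the channel disappears only after the buffer is empty and the removal check runs again.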

Output stream lifecycle

  • change handle_stdout() to split the ttrpc stream and select! over:
    • internal output data
    • client-side stream closure / receive errors
  • when the client disconnects, return the preempted receiver immediately instead of waiting for more output
  • remove the IO channel explicitly once the output stream is fully finished
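
The decision logic of that loop can be sketched without an async runtime. The example below is an assumption-laden model, not the PR's implementation: the standard library has no select!, so both event sources are merged into a single channel of a hypothetical `StreamEvent` enum, and `forward_output` and `Exit` are illustrative names. What it aims to show is only the branching: output is forwarded, a client disconnect ends the loop immediately, and unread output stays queued for the next consumer.

```rust
use std::sync::mpsc::Receiver;

/// Merged view of the two sources the real loop selects over (hypothetical).
#[derive(Debug)]
pub enum StreamEvent {
    /// A chunk from the internal output receiver.
    Output(Vec<u8>),
    /// The internal output side finished.
    OutputEnded,
    /// The client closed the stream or a receive error occurred.
    ClientClosed,
}

/// Why the forwarding loop stopped (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Exit {
    /// Output is finished; the caller can remove the IO channel.
    OutputFinished,
    /// Client went away; the caller should return the preempted receiver.
    ClientDisconnected,
}

/// Forwards output chunks to `client` until either side finishes.
pub fn forward_output(events: &Receiver<StreamEvent>, client: &mut Vec<Vec<u8>>) -> Exit {
    loop {
        match events.recv() {
            Ok(StreamEvent::Output(chunk)) => client.push(chunk),
            // Do not wait for more output once the client is gone.
            Ok(StreamEvent::ClientClosed) => return Exit::ClientDisconnected,
            // An explicit end marker or a dropped sender both mean "no more output".
            Ok(StreamEvent::OutputEnded) | Err(_) => return Exit::OutputFinished,
        }
    }
}
```

Returning a distinct exit reason keeps the two cleanup paths apart: one hands the receiver back so a later consumer can reattach, the other removes the channel.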

Stdin partial-read handling

  • extend StreamingStdin with a leftover buffer
  • if one queued stdin chunk is larger than the current read buffer:
    • return only the part that fits
    • keep the remaining bytes for the next read
  • preserve correct behavior for:
    • overlapping reads
    • leftover-then-EOF
    • zero-length reads
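
The leftover-buffer idea can be shown with the synchronous std::io::Read trait standing in for AsyncRead. This is a minimal sketch, not the PR's StreamingStdin: `ChunkedStdin`, `chunks`, and `new` are hypothetical, and the queue of chunks models whatever channel the real type receives stdin data from.

```rust
use std::collections::VecDeque;
use std::io::{self, Read};

/// Synchronous stand-in for StreamingStdin (hypothetical type).
pub struct ChunkedStdin {
    /// Queued stdin chunks, modeling the incoming channel.
    chunks: VecDeque<Vec<u8>>,
    /// Unread tail of the most recently dequeued chunk.
    leftover: Vec<u8>,
}

impl ChunkedStdin {
    pub fn new(chunks: Vec<Vec<u8>>) -> Self {
        Self { chunks: chunks.into(), leftover: Vec::new() }
    }
}

impl Read for ChunkedStdin {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Zero-length read: consume nothing and leave all state untouched.
        if buf.is_empty() {
            return Ok(0);
        }
        // Refill from the queue only once the leftover is used up,
        // skipping empty chunks so they are not mistaken for EOF.
        while self.leftover.is_empty() {
            match self.chunks.pop_front() {
                Some(chunk) => self.leftover = chunk,
                None => return Ok(0), // queue exhausted: EOF
            }
        }
        // Copy only what fits and keep the rest for the next call.
        let n = buf.len().min(self.leftover.len());
        buf[..n].copy_from_slice(&self.leftover[..n]);
        self.leftover.drain(..n);
        Ok(n)
    }
}
```

With a 3-byte buffer and a single 5-byte chunk, the first read yields 3 bytes, the second yields the remaining 2, and only the third reports EOF, which corresponds to the leftover-then-EOF case listed above.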

Error cleanup

  • make stdin-side failure paths remove the channel consistently instead of leaving partially reusable state behind

Why this is better

This keeps the existing preemption-based design, but makes the state transitions explicit:

  • client disconnect no longer leaves the handoff path hanging
  • closed channels are not removed too early
  • drained channels are not kept forever
  • stdin reads now behave like a normal buffered AsyncRead

Notes

This PR keeps the current handoff model and focuses on correctness and cleanup behavior. It does not redesign the streaming architecture or change the external streaming protocol.

@novahe novahe requested a review from a team as a code owner March 18, 2026 16:39
Copilot AI review requested due to automatic review settings March 18, 2026 16:39
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the streaming I/O mechanism within the VMM task component to address an observed issue where short-lived crictl exec commands would sometimes produce no output. The changes focus on robustly managing the lifecycle of stdout streams, ensuring that data is fully delivered and resources are properly released, even when clients disconnect or senders close their channels prematurely. This enhances the reliability of interactive command execution within the virtual machine environment.

Highlights

  • Stdout Stream Handoff: Improved handling of stdout client closure while forwarding stream data to prevent data loss or incomplete output.
  • Preemption Handoff Wakeup: Ensured the preemption handoff wakeup path remains functional even after the sender closes its side of the channel.
  • Channel Cleanup: Implemented logic to clean up closed output channels only after all remaining data has been fully drained, preventing premature removal.
  • Regression Tests: Added new regression tests to validate the correct behavior of sender-closed handoff and drained channel cleanup.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request effectively addresses the issue of occasional missing output from short-lived commands by improving the handling of stdout stream handoff and cleanup. The changes, including deferring channel cleanup until fully drained, handling client-side stream closure gracefully, and ensuring the preemption path works correctly even after a sender is closed, are well-implemented. The addition of regression tests for these scenarios is also a great improvement. I have one suggestion regarding code repetition in handle_stdout for better maintainability, but overall, this is a solid fix.

Contributor

Copilot AI left a comment

Pull request overview

This PR addresses intermittent missing output for short-lived crictl exec -it ... by improving the stdout/stderr streaming handoff behavior during receiver preemption and by deferring output-channel cleanup until the channel is fully drained.

Changes:

  • Track sender_closed per IOChannel and only remove closed output channels once the receiver buffer and any remaining_data are drained.
  • Update stdout/stderr streaming to detect client-initiated close while forwarding output, and keep receiver handoff functional after sender close.
  • Add regression tests for sender-closed handoff and drained-channel cleanup; update IO copy teardown to call the new output-close path.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
vmm/task/src/streaming.rs Adds sender-closed tracking, delayed cleanup logic, client-close detection in stdout forwarding, and new regression tests.
vmm/task/src/io.rs Switches streaming output teardown from remove_channel to close_stdout after container-to-client copy completes.

@novahe force-pushed the fix-preemption-out branch from 8af8d4d to 197c135 on March 19, 2026 08:14
@novahe novahe requested a review from Copilot March 19, 2026 08:20
Contributor

Copilot AI left a comment

Pull request overview

Fixes intermittent missing output for short-lived crictl exec -it commands by making stdout forwarding resilient to client-close, preserving preemption handoff behavior after sender close, and deferring IO channel cleanup until an output channel is fully drained.

Changes:

  • Detect stdout client-close while forwarding and exit cleanly without dropping output-hand-off state.
  • Introduce sender_closed + drained checks to decide when an output channel can be safely removed.
  • Add regression tests covering sender-closed handoff and drained-channel cleanup.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
vmm/task/src/streaming.rs Adds sender-closed/drain-aware cleanup, handles stdout stream client-close via stream split + select, and adds regression tests.
vmm/task/src/io.rs Switches container→streaming close behavior from hard removal to sender-closed signaling (close_output).

@@ -80,6 +80,7 @@ pub struct IOChannel {
remaining_data: Option<Any>,
preemption_sender: Option<Sender<()>>,
notifier: Arc<Notify>,
Comment on lines +291 to +302
async fn close_output_channel(&self, id: &str) {
    let mut ios = self.ios.lock().await;
    let remove_channel = if let Some(ch) = ios.get_mut(id) {
        ch.sender_closed = true;
        ch.should_remove_closed_channel()
    } else {
        false
    };
    if remove_channel {
        ios.remove(id);
    }
}
}

let return_result = timeout(
Duration::from_millis(200),
@novahe force-pushed the fix-preemption-out branch 2 times, most recently from 40947f5 to 16b89b6 on March 19, 2026 11:13
Handle output client close while forwarding stream data, keep the
preemption handoff wakeup path working after sender close, and clean up
closed output channels only after they are fully drained.

Rename the public streaming close helper from close_stdout to
close_output so the cleanup API matches the fact that the path is used
for both stdout and stderr streaming URLs.

Add regression tests for sender-closed handoff and drained channel
cleanup.

Co-authored-by: novahe <heqianfly@gmail.com>
Signed-off-by: novahe <heqianfly@gmail.com>
@novahe force-pushed the fix-preemption-out branch from 8e1104a to 44e9b92 on March 22, 2026 01:53
@kevin-wangzefeng
Member

@novahe The PR needs a rebase. Could you update it?
