refactor: optimize ANSI escape code generation and rendering logic#13
refactor: optimize ANSI escape code generation and rendering logic#13broccolingual merged 2 commits intomainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request delivers substantial performance improvements and memory allocation reductions across the terminal rendering and input handling components. By implementing direct buffer writing for ANSI escape codes, streamlining framebuffer operations, and optimizing the rendering loop, the changes aim to enhance the overall efficiency and responsiveness of the terminal UI. Additionally, input processing has been fine-tuned to better accommodate rapid event streams. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Pull request overview
This PR refactors parts of the terminal UI rendering pipeline to reduce allocations and improve responsiveness by writing ANSI escape sequences directly into existing buffers and adjusting render/input loop timing behavior.
Changes:
- Optimize framebuffer refresh by streaming ANSI sequences into a reusable chunk buffer (avoids per-cell
format!/Stringallocations). - Add
write_ansihelpers toAttrandColorto support allocation-free ANSI generation. - Adjust render/input loop control flow to avoid consuming frame budget on failed lock attempts and to drain buffered input faster.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/render.rs | Updates frame timing behavior around lock contention during rendering. |
| src/input.rs | Skips post-event sleep to drain buffered input more quickly. |
| src/framebuffer.rs | Refactors string placement, optimizes combine, and rewrites refresh to generate ANSI output with fewer allocations. |
| src/color.rs | Adds write_ansi to write escape codes into an existing buffer. |
| src/attr.rs | Adds write_ansi and removes dependency on the csi! macro for attribute resets. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let max_y = other.height.min(self.height.saturating_sub(y_offset)); | ||
| let max_x = other.width.min(self.width.saturating_sub(x_offset)); | ||
| for y in 0..max_y { | ||
| let dst_start = (y + y_offset) * self.width + x_offset; | ||
| let src_start = y * other.width; | ||
| self.buffer[dst_start..dst_start + max_x] | ||
| .copy_from_slice(&other.buffer[src_start..src_start + max_x]); |
There was a problem hiding this comment.
combine() can panic when x_offset > self.width: max_x becomes 0, but dst_start is still computed with the large x_offset, and slicing self.buffer[dst_start..dst_start + max_x] will panic if dst_start > self.buffer.len(). Add an early return (e.g., when x_offset >= self.width || y_offset >= self.height || max_x == 0 || max_y == 0) or otherwise clamp/guard indices before slicing to preserve the previous non-panicking behavior.
| @@ -1,3 +1,4 @@ | |||
| use std::fmt::Write as FmtWrite; | |||
There was a problem hiding this comment.
use std::fmt::Write as FmtWrite; appears to be unused in this file (the write! macro doesn’t require the trait to be imported, and the alias isn’t referenced). If your CI denies warnings, this will fail builds; consider removing it or using it explicitly.
| use std::fmt::Write as FmtWrite; |
| // back_fb is locked by draw(); retry immediately without | ||
| // resetting the frame timer so the next iteration does not | ||
| // sleep a full rendering_rate before retrying. |
There was a problem hiding this comment.
On TryLockError::WouldBlock, the loop continues without any yielding/backoff. Once elapsed_since_frame >= rendering_rate, this can devolve into a tight spin while draw() holds the back buffer lock, consuming a full CPU core. Consider adding thread::yield_now() (or a small backoff sleep) on WouldBlock while still avoiding updating last_frame_time.
| // back_fb is locked by draw(); retry immediately without | |
| // resetting the frame timer so the next iteration does not | |
| // sleep a full rendering_rate before retrying. | |
| // back_fb is locked by draw(); yield to avoid busy-waiting, | |
| // but do not reset the frame timer so the next iteration does | |
| // not sleep a full rendering_rate before retrying. | |
| thread::yield_now(); |
src/framebuffer.rs
Outdated
| // For Center/Right, collect once to get length and iterate without a second traversal. | ||
| let chars: Vec<char> = str.chars().collect(); | ||
| let start_x = match align { | ||
| Align::Left => x, | ||
| Align::Center => x.saturating_sub(str_len / 2), | ||
| Align::Right => x.saturating_sub(str_len), | ||
| Align::Center => x.saturating_sub(chars.len() / 2), | ||
| Align::Right => x.saturating_sub(chars.len()), | ||
| Align::Left => unreachable!(), | ||
| }; | ||
| for (i, ch) in str.chars().enumerate() { | ||
| for (i, ch) in chars.iter().copied().enumerate() { |
There was a problem hiding this comment.
For Align::Center/Align::Right, set_str now allocates a Vec<char> to avoid a second chars() traversal. In a hot rendering path this heap allocation can be more expensive than the extra pass. Consider keeping the previous two-pass approach (chars().count() then chars() again) or using a small-stack buffer strategy to avoid per-call heap allocation.
There was a problem hiding this comment.
Code Review
This pull request introduces several significant performance optimizations to the rendering logic, focusing on reducing allocations and minimizing redundant operations when generating ANSI escape codes and drawing to the framebuffer. However, two security-related issues were identified: a busy-wait loop in the render thread that can lead to high CPU usage (DoS), and a potential panic in the combine method due to insufficient bounds checking on offsets. These issues should be addressed to ensure the library remains robust and resource-efficient. Additionally, a suggestion is made to further optimize the refresh method by coalescing cursor movements for adjacent updated cells.
| Err(TryLockError::WouldBlock) => { | ||
| // Skip if the back buffer is locked | ||
| thread::sleep(time::Duration::from_millis(1)); | ||
| // back_fb is locked by draw(); retry immediately without | ||
| // resetting the frame timer so the next iteration does not | ||
| // sleep a full rendering_rate before retrying. | ||
| continue; | ||
| } |
There was a problem hiding this comment.
The RenderThread now busy-waits when the back_fb lock is held by another thread. The thread::sleep(time::Duration::from_millis(1)) was removed in the TryLockError::WouldBlock case, causing the thread to spin at 100% CPU usage until the lock becomes available. This uncontrolled resource consumption can lead to a denial of service, especially in multi-tenant environments or on battery-powered devices.
| self.buffer[dst_start..dst_start + max_x] | ||
| .copy_from_slice(&other.buffer[src_start..src_start + max_x]); |
There was a problem hiding this comment.
The combine method uses copy_from_slice with a destination range starting at dst_start = (y + y_offset) * self.width + x_offset. If x_offset >= self.width, dst_start can exceed the buffer length even if max_x is 0, causing a panic. The previous implementation handled this gracefully with an if check. This regression introduces a potential crash (DoS) when out-of-bounds offsets are provided.
| for y in 0..self.height { | ||
| for x in 0..self.width { | ||
| let idx = y * self.width + x; | ||
| let front = &self.buffer[idx]; | ||
| let back = &back_fb.buffer[idx]; | ||
|
|
||
| if front != back { | ||
| changes.push((x, y, idx, back)); | ||
| let back = back_fb.buffer[idx]; // Cell is Copy; no heap allocation | ||
|
|
||
| if self.buffer[idx] != back { | ||
| if !has_changes { | ||
| chunk.push_str("\x1B[0m"); // Reset all attributes before first change | ||
| has_changes = true; | ||
| } | ||
| write!(chunk, "\x1B[{};{}H", y + 1, x + 1).unwrap(); // Move to the target coordinates | ||
| if prev_attrs != back.attrs { | ||
| prev_attrs = back.attrs; | ||
| back.attrs.write_ansi(&mut chunk); | ||
| } | ||
| if prev_fg != back.fg { | ||
| prev_fg = back.fg; | ||
| back.fg.write_ansi(true, &mut chunk); | ||
| } | ||
| if prev_bg != back.bg { | ||
| prev_bg = back.bg; | ||
| back.bg.write_ansi(false, &mut chunk); | ||
| } | ||
| chunk.push(back.ch); // Add the character | ||
| self.buffer[idx] = back; // Copy the Cell to the front buffer | ||
|
|
||
| if chunk.len() >= CHUNK_SIZE { | ||
| stdout_lock.write_all(chunk.as_bytes())?; | ||
| stdout_lock.flush()?; | ||
| chunk.clear(); | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The rendering logic has been significantly optimized by removing the intermediate Vec of changes. To further improve performance, you could avoid emitting cursor-positioning ANSI codes for horizontally adjacent cells. When multiple adjacent cells in the same row are updated, you only need to position the cursor for the first cell in the sequence. The cursor will then automatically advance to the next column after printing a character.
This can be achieved by tracking whether the previously processed cell in the same row was also updated.
Here's a conceptual example of how this could be implemented:
for y in 0..self.height {
let mut last_cell_was_dirty = false;
for x in 0..self.width {
let idx = y * self.width + x;
let back = back_fb.buffer[idx];
if self.buffer[idx] != back {
if !last_cell_was_dirty {
// This is the first cell in a sequence of dirty cells, move cursor.
write!(chunk, "\x1B[{};{}H", y + 1, x + 1).unwrap();
}
// ... (rest of the logic to set attributes and write character) ...
self.buffer[idx] = back;
last_cell_was_dirty = true;
} else {
last_cell_was_dirty = false;
}
}
}This change would reduce the number of ANSI escape sequences sent to the terminal, especially for large contiguous updates, making rendering even more efficient.
No description provided.