Conversation
Signed-off-by: Noncrab <git@noncrab.net>
📝 WalkthroughWalkthrough
Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/process_collector.rs (1)
167-172: Race condition handling is reasonable, but double-counting remains possible.The
.max(0.0)correctly prevents negative increments when another thread has already updated the counter past the currenttotal. However, the TOCTOU gap betweenget()(line 167) andinc_by()(line 171) means concurrent collectors could both read the samepastvalue and each add the full delta, causing double-counting.Given the PR author's analysis that gauges would be a regression for some users, this trade-off seems acceptable. The window for this race is small, and the metric will self-correct on subsequent collections.
Consider clarifying the comment to note both failure modes (underflow prevention and potential double-counting):
📝 Suggested comment clarification
let past = self.cpu_total.get(); - // If two threads are collecting metrics at the same time, - // the cpu_total counter may have already been updated, - // and the subtraction may underflow. + // If two threads collect concurrently, one may update cpu_total + // before the other calls inc_by. Use max(0.0) to avoid incrementing + // by a negative value if `past` was already advanced beyond `total`. + // Note: concurrent collection may still double-count briefly. self.cpu_total.inc_by((total - past).max(0.0));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/process_collector.rs` around lines 167 - 172, Update the inline comment around the cpu_total read/update (the call sites cpu_total.get(), cpu_total.inc_by((total - past).max(0.0)), and cpu_total.collect()) to explicitly state that the current defensive .max(0.0) prevents underflow if another thread has advanced the counter, but that the TOCTOU window between reading past and calling inc_by can still allow two collectors to read the same past and both add the delta (causing transient double-counting), and that this is a small, self-correcting race accepted over switching to gauges; keep the wording concise and reference the variables past and total so future readers understand the two failure modes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/process_collector.rs`:
- Around line 167-172: Update the inline comment around the cpu_total
read/update (the call sites cpu_total.get(), cpu_total.inc_by((total -
past).max(0.0)), and cpu_total.collect()) to explicitly state that the current
defensive .max(0.0) prevents underflow if another thread has advanced the
counter, but that the TOCTOU window between reading past and calling inc_by can
still allow two collectors to read the same past and both add the delta (causing
transient double-counting), and that this is a small, self-correcting race
accepted over switching to gauges; keep the wording concise and reference the
variables past and total so future readers understand the two failure modes.
Fixes #560.
The one thing i'm not sure about is trading the well-defined wrap-around semantics for the risks of running out of bits in the mantissa. But then again, this uses an AtomicF64 under the hood, whose mantissa is 53 bits wide (can accurately hold an integer of up to ~9e+15).
Practically, by my (hand-wavey) reckoning, that'll mean we'll not be able to accurately represent times at ~1µs of precision after ~285 CPU millenia. I'm not sure how much of an issue that might be, or how best to record that concern.
I'd also be tempted to just use a gauge here as that'll avoid the need for calculating the delta since the last update (and the ensuing race condition) but that's bound to be a regression for someone.
Summary by CodeRabbit