Use floats for tracking cpu time by noncrab · Pull Request #564 · tikv/rust-prometheus

noncrab · 2026-03-28T15:21:06Z

Fixes #560.

The one thing I'm not sure about is trading the well-defined wrap-around semantics for the risk of running out of bits in the mantissa. But then again, this uses an AtomicF64 under the hood, whose mantissa is 53 bits wide (it can accurately hold integers up to ~9e+15).

Practically, by my (hand-wavy) reckoning, that'll mean we'll not be able to accurately represent times at ~1µs of precision after ~285 CPU years (2^53 µs ≈ 9e9 s), or at ~1ms of precision after ~285 CPU millennia. I'm not sure how much of an issue that might be, or how best to record that concern.
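The arithmetic behind this estimate can be checked with a short sketch. It assumes the only limit is the 53-bit mantissa, i.e. that an f64 counts at most 2^53 steps of a given resolution exactly. The helper names `ulp` and `exact_horizon_years` and the 365.25-day year are illustrative assumptions, not part of the crate.

```rust
/// Seconds per year (365.25 days); an assumption used only for a rough conversion.
const SECONDS_PER_YEAR: f64 = 365.25 * 24.0 * 3600.0;

/// Spacing between `x` and the next larger f64 (one ULP),
/// valid for finite, positive, normal `x`.
fn ulp(x: f64) -> f64 {
    f64::from_bits(x.to_bits() + 1) - x
}

/// How many years of CPU time fit into 2^53 exactly representable steps
/// of `resolution_secs` seconds each.
fn exact_horizon_years(resolution_secs: f64) -> f64 {
    let max_exact_steps = (1u64 << 53) as f64; // 2^53, about 9.007e15
    max_exact_steps * resolution_secs / SECONDS_PER_YEAR
}
```

Under these assumptions, `exact_horizon_years(1e-6)` comes to roughly 285 years and `exact_horizon_years(1e-3)` to roughly 285 millennia. If the counter stores seconds rather than a step count, `ulp` shows the same order of magnitude: the spacing between adjacent values first exceeds 1µs at 2^33 seconds, which is about 272 years.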

I'd also be tempted to just use a gauge here, as that would avoid the need to calculate the delta since the last update (and the ensuing race condition), but that's bound to be a regression for someone.

Summary by CodeRabbit

Bug Fixes
- Improved accuracy of CPU metrics through higher-precision calculations and enhanced handling of edge cases in metric computation.

Signed-off-by: Noncrab <git@noncrab.net>

coderabbitai · 2026-03-28T15:21:19Z

📝 Walkthrough

ProcessCollector's cpu_total metric type is changed from IntCounter to Counter, with corresponding updates to the constructor and CPU time computation logic. The metric now uses floating-point division instead of integer division to properly track CPU seconds.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CPU Metric Type Fix `src/process_collector.rs` | Changed `cpu_total` field from `IntCounter` to `Counter` and updated construction method from `IntCounter::with_opts` to `Counter::with_opts`. Updated CPU time computation from integer division to floating-point division, replacing `saturating_sub(past)` with `(total - past).max(0.0)` to properly handle underflow in float context. |
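As a rough illustration of the change summarized above, the sketch below contrasts integer and floating-point handling of CPU time. It assumes CPU time arrives as a tick count with a known tick rate; the function names and the `ticks` / `ticks_per_second` parameters are hypothetical and do not reproduce the code in `src/process_collector.rs`.

```rust
/// Integer division: whole seconds only, the fraction is truncated.
fn cpu_seconds_int(ticks: u64, ticks_per_second: u64) -> u64 {
    ticks / ticks_per_second
}

/// Floating-point division: keeps fractional seconds.
fn cpu_seconds_float(ticks: u64, ticks_per_second: u64) -> f64 {
    ticks as f64 / ticks_per_second as f64
}

/// Integer delta: `saturating_sub` clamps at zero instead of underflowing.
fn delta_int(total: u64, past: u64) -> u64 {
    total.saturating_sub(past)
}

/// Float delta: subtraction can go negative, so clamp with `max(0.0)`
/// so that a counter is never asked to decrease.
fn delta_float(total: f64, past: f64) -> f64 {
    (total - past).max(0.0)
}
```

With a hypothetical 100 ticks per second, 250 ticks give 2 seconds under integer division and 2.5 seconds under floating-point division. Both delta helpers return zero when `past` is already ahead of `total`.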

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A counter's tale, once told in whole,
Now flows as fragments, soft and small,
From IntCounter's rigid role,
To Counter's grace—the fix for all!
✨ Prometheus smiles at metrics true,
When floats dance freely, clear and new.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly captures the main change: switching CPU time tracking from integer to floating-point values. |
| Linked Issues check | ✅ Passed | The PR successfully converts process_cpu_seconds_total from IntCounter to Counter, addressing issue `#560`'s requirement for float-based CPU time reporting. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to converting CPU time tracking from integer to floating-point in ProcessCollector, with no unrelated modifications. |

coderabbitai

🧹 Nitpick comments (1)

src/process_collector.rs (1)
167-172: Race condition handling is reasonable, but double-counting remains possible.

The .max(0.0) correctly prevents negative increments when another thread has already updated the counter past the current total. However, the TOCTOU gap between get() (line 167) and inc_by() (line 171) means concurrent collectors could both read the same past value and each add the full delta, causing double-counting.

Given the PR author's analysis that gauges would be a regression for some users, this trade-off seems acceptable. The window for this race is small, and the metric will self-correct on subsequent collections.
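The two failure modes can be replayed deterministically with a small sketch. `FloatCounter` is a hypothetical stand-in (an f64 stored as bits in an `AtomicU64`), not the crate's `Counter`, and the `replay_*` functions simply run the interleavings described above on a single thread.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a float counter: an f64 kept as raw bits.
struct FloatCounter {
    bits: AtomicU64,
}

impl FloatCounter {
    fn new() -> Self {
        FloatCounter { bits: AtomicU64::new(0f64.to_bits()) }
    }

    fn get(&self) -> f64 {
        f64::from_bits(self.bits.load(Ordering::Relaxed))
    }

    /// Atomically add `v` with a compare-and-swap loop.
    fn inc_by(&self, v: f64) {
        let mut old = self.bits.load(Ordering::Relaxed);
        loop {
            let new = (f64::from_bits(old) + v).to_bits();
            match self.bits.compare_exchange_weak(old, new, Ordering::Relaxed, Ordering::Relaxed) {
                Ok(_) => return,
                Err(current) => old = current, // another writer won; retry
            }
        }
    }
}

/// Both collectors read `past` before either calls `inc_by`:
/// each adds the full delta, so the counter ends at twice `total`.
fn replay_double_count(total: f64) -> f64 {
    let counter = FloatCounter::new();
    let past_a = counter.get(); // collector A reads 0.0
    let past_b = counter.get(); // collector B also reads 0.0
    counter.inc_by((total - past_a).max(0.0));
    counter.inc_by((total - past_b).max(0.0));
    counter.get()
}

/// A collector holding a stale `total` runs after the counter has already
/// advanced to `fresh`: the clamp turns the negative delta into zero.
fn replay_stale_total(stale: f64, fresh: f64) -> f64 {
    let counter = FloatCounter::new();
    counter.inc_by(fresh); // another collector already recorded `fresh`
    let past = counter.get();
    counter.inc_by((stale - past).max(0.0));
    counter.get()
}
```

In this sketch `replay_double_count(1.5)` leaves the counter at 3.0, while `replay_stale_total(1.5, 2.0)` leaves it at 2.0. After a double count, later deltas clamp to zero until the real total catches up with the counter, which is the sense in which the metric self-corrects.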

Consider clarifying the comment to note both failure modes (underflow prevention and potential double-counting):
📝 Suggested comment clarification
             let past = self.cpu_total.get();
-            // If two threads are collecting metrics at the same time,
-            // the cpu_total counter may have already been updated,
-            // and the subtraction may underflow.
+            // If two threads collect concurrently, one may update cpu_total
+            // before the other calls inc_by. Use max(0.0) to avoid incrementing
+            // by a negative value if `past` was already advanced beyond `total`.
+            // Note: concurrent collection may still double-count briefly.
             self.cpu_total.inc_by((total - past).max(0.0));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/process_collector.rs` around lines 167 - 172, Update the inline comment
around the cpu_total read/update (the call sites cpu_total.get(),
cpu_total.inc_by((total - past).max(0.0)), and cpu_total.collect()) to
explicitly state that the current defensive .max(0.0) prevents underflow if
another thread has advanced the counter, but that the TOCTOU window between
reading past and calling inc_by can still allow two collectors to read the same
past and both add the delta (causing transient double-counting), and that this
is a small, self-correcting race accepted over switching to gauges; keep the
wording concise and reference the variables past and total so future readers
understand the two failure modes.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 8151418 and 58a78e9.

📒 Files selected for processing (1)

src/process_collector.rs

Use a floats for tracking cpu time.

58a78e9

Signed-off-by: Noncrab <git@noncrab.net>

ti-chi-bot (bot) added the `dco-signoff: yes` and `contribution` labels on Mar 28, 2026

coderabbitai bot reviewed Mar 28, 2026

noncrab wants to merge 1 commit into tikv:master from noncrab:issue-560-fractional-cpu-seconds
