feat: [metrics] Create metrics framework and add connection metrics #407
charlesdong1991 wants to merge 15 commits into apache:main
Conversation
fresh-borzoni
left a comment
@charlesdong1991 Ty for the PR, left some comments
PTAL
Thanks for the reviews. I made some changes and left a question. PTAL @fresh-borzoni
@charlesdong1991 LGTM overall, small comment about misleading doc, and we can do simple math to match Java here, at least semantically. PTAL
Btw, just to clarify before making changes: do you mean updating the code to fix the header overhead, or updating the doc to reflect it? @fresh-borzoni
Sorry, I should have been more specific - I meant code @charlesdong1991
Gotcha, thanks @fresh-borzoni will do a quick change, should be straightforward 👍
I updated the code to calculate the total size "manually", PTAL @fresh-borzoni thanks again!
@charlesdong1991 TY, LGTM
Would be great if you can take a look! Thanks! @luoyuxia
Pull request overview
Adds a client-side metrics framework (via the metrics facade) and instruments RPC connection requests to emit per-API-key counters, gauges, and latency histograms (issue #390).
Changes:
- Introduces crates/fluss/src/metrics.rs with metric name constants and an api_key label filter for reportable APIs.
- Instruments ServerConnectionInner::request to record request/response counts, bytes sent/received, in-flight gauge, and request latency (with drop-based cleanup on cancellation).
- Adds unit/integration-style tests around connection metrics emission and failure paths; wires in metrics/metrics-util dependencies.
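To illustrate what an api_key label filter for reportable APIs could look like, here is a minimal sketch using only the standard library. The enum variants, the label strings, and the choice of which API is skipped are all hypothetical; the real ApiKey type and filter live in the crate and may differ.

```rust
/// Hypothetical subset of RPC API keys, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ApiKey {
    ProduceLog,
    FetchLog,
    GetTableInfo,
}

/// Returns the `api_key` label for APIs that should be reported,
/// or `None` when metrics for that API should be skipped.
fn api_key_label(key: ApiKey) -> Option<&'static str> {
    match key {
        ApiKey::ProduceLog => Some("produce_log"),
        ApiKey::FetchLog => Some("fetch_log"),
        // Assumed non-reportable here, purely to show the filtering branch.
        ApiKey::GetTableInfo => None,
    }
}

fn main() {
    for key in [ApiKey::ProduceLog, ApiKey::FetchLog, ApiKey::GetTableInfo] {
        match api_key_label(key) {
            Some(label) => println!("{key:?}: record metrics with api_key={label}"),
            None => println!("{key:?}: skipped"),
        }
    }
}
```

Returning an Option keeps the decision in one place: the instrumentation site can bail out early on `None` instead of emitting a label for every API.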
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| crates/fluss/src/rpc/server_connection.rs | Emits connection-level metrics around request lifecycle; adds tests validating metrics behavior (success, send failure, API error). |
| crates/fluss/src/rpc/mod.rs | Re-exports ApiKey as pub(crate) to support metrics helpers. |
| crates/fluss/src/rpc/message/header.rs | Exposes REQUEST_HEADER_LENGTH for consistent byte accounting. |
| crates/fluss/src/metrics.rs | New metrics constants + api_key label helper and tests. |
| crates/fluss/src/lib.rs | Exposes metrics module from the crate root. |
| crates/fluss/Cargo.toml | Adds metrics dependency and metrics-util for dev/test support. |
| Cargo.toml | Adds workspace-level metrics dependency version. |
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Can I get another round of reviews? @luoyuxia @fresh-borzoni thanks!
fresh-borzoni
left a comment
@charlesdong1991 Looked through one more time, LGTM
Just rebased, it would be great if you can take a look as well! @luoyuxia @leekeiabstraction 🙏 Thank you!
leekeiabstraction
left a comment
TY for the PR! Left a couple of small comments
crates/fluss/src/metrics.rs
Outdated
    assert_eq!(find_counter(CLIENT_RESPONSES_TOTAL), Some(1));
    assert_eq!(find_counter(CLIENT_BYTES_SENT_TOTAL), Some(256));
    assert_eq!(find_counter(CLIENT_BYTES_RECEIVED_TOTAL), Some(128));
    assert_eq!(find_histogram(CLIENT_REQUEST_LATENCY_MS), Some(vec![42.5]));
Seems like we're missing a couple of assertions on gauge metrics?
There is only one gauge metric for now, and it is already tested in inflight_gauge_nets_to_zero_after_balanced_calls, but I also added an assertion here for completeness.
Thanks!
Thanks for the callout on tests @leekeiabstraction Addressed both! PTAL! cc @luoyuxia
leekeiabstraction
left a comment
TY for the revision. Added a couple more comments.
(Sorry for reviewing in waves, don't have my computer with me atm so it's easy to miss things reviewing on the phone).
Thanks for the suggestions! @leekeiabstraction updated
leekeiabstraction
left a comment
Approved. TY for your contribution!
Purpose
Linked issue: close #390
Brief change log
This PR adds a metrics framework and connection-level metrics.
NOTE:
There is one departure from Java parity: the scopeguard for the in-flight gauge on cancellation. Java's callback model has no equivalent of tokio future cancellation. If a tokio future is dropped mid-execution, it skips the decrement and the gauge drifts, so I think it's important to add a "scopeguard" to avoid that.
Writer and scanner metrics can follow in separate PRs.
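The gauge-drift concern can be reproduced with a minimal sketch that uses only the standard library. It assumes a hand-written Drop guard rather than the PR's actual scopeguard code, uses an AtomicI64 as a stand-in for the in-flight gauge, and polls the future manually instead of running it on tokio; all names here are hypothetical.

```rust
use std::future::Future;
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// Increments the stand-in gauge on creation and decrements it on drop,
/// so the decrement also runs when the owning future is cancelled.
struct InFlightGuard(Arc<AtomicI64>);

impl InFlightGuard {
    fn new(gauge: Arc<AtomicI64>) -> Self {
        gauge.fetch_add(1, Ordering::SeqCst);
        InFlightGuard(gauge)
    }
}

impl Drop for InFlightGuard {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Hypothetical request that never gets a response.
async fn request(gauge: Arc<AtomicI64>) {
    let _guard = InFlightGuard::new(gauge);
    std::future::pending::<()>().await;
}

/// Waker that does nothing; enough to poll a future once by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the request once, then drops it to simulate cancellation.
/// Returns the gauge value while pending and after the drop.
fn poll_once_then_cancel() -> (i64, i64) {
    let gauge = Arc::new(AtomicI64::new(0));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(request(gauge.clone()));
    assert!(fut.as_mut().poll(&mut cx).is_pending());
    let while_pending = gauge.load(Ordering::SeqCst);
    drop(fut); // cancellation: the guard inside the future is dropped too
    (while_pending, gauge.load(Ordering::SeqCst))
}

fn main() {
    let (while_pending, after_cancel) = poll_once_then_cancel();
    println!("in flight while pending: {while_pending}, after cancel: {after_cancel}");
}
```

Without the guard, a decrement placed after the `.await` would never run for the dropped future, and the gauge would stay at 1.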
Tests
All tests pass locally.