Skip to content

feat: [metrics] Create metrics framework and add connection metrics#407

Open
charlesdong1991 wants to merge 15 commits intoapache:mainfrom
charlesdong1991:metrics-framework
Open

feat: [metrics] Create metrics framework and add connection metrics#407
charlesdong1991 wants to merge 15 commits intoapache:mainfrom
charlesdong1991:metrics-framework

Conversation

@charlesdong1991
Copy link
Contributor

Purpose

Linked issue: close #390

Brief change log

This PR adds a metrics framework and connection level metrics.

NOTE:
There is an parity from Java, which is the scopeguard for in-flight on cancellation, because Java callback model doesn't have equivalent to tokio future cancellation. And if a tokio future is dropped mid-execution, it would skip decrement and cause gauge drift. So i think it's important to add "scopeguard" to avoid that.

And I can have writer and scanner metrics as follow-up PRs.

Tests

All tests are passed locally.

Copy link
Contributor

@fresh-borzoni fresh-borzoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charlesdong1991 Ty for the PR, left some comments
PTAL

@charlesdong1991
Copy link
Contributor Author

Thanks for reviews, i made some changes and leave a question. PTAL @fresh-borzoni

Copy link
Contributor

@fresh-borzoni fresh-borzoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charlesdong1991 LGTM overall, small comment about misleading doc, and we can do simple math to match Java here, at least semantically. PTAL

@charlesdong1991
Copy link
Contributor Author

and we can do simple math to match Java here, at least semantically. PTAL

btw, just to clarify before making changes, do you mean to update code to fix header overhead? or to update the doc to reflect this? @fresh-borzoni

@fresh-borzoni
Copy link
Contributor

fresh-borzoni commented Mar 5, 2026

and we can do simple math to match Java here, at least semantically. PTAL

btw, just to clarify before making changes, do you mean to update code to fix header overhead? or to update the doc to reflect this? @fresh-borzoni

Sorry, I should have been more specific - I meant code @charlesdong1991

@charlesdong1991
Copy link
Contributor Author

Gotcha, thanks @fresh-borzoni will do a quick change, should be straightforward 👍

@charlesdong1991
Copy link
Contributor Author

i updated code to calculate total size "manually", PTAL @fresh-borzoni

thanks again!

@fresh-borzoni
Copy link
Contributor

@charlesdong1991 TY, LGTM

@charlesdong1991
Copy link
Contributor Author

would be great if you can take a look! Thanks! @luoyuxia

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a client-side metrics framework (via the metrics facade) and instruments RPC connection requests to emit per-API-key counters, gauges, and latency histograms (issue #390).

Changes:

  • Introduces crates/fluss/src/metrics.rs with metric name constants and an api_key label filter for reportable APIs.
  • Instruments ServerConnectionInner::request to record request/response counts, bytes sent/received, in-flight gauge, and request latency (with drop-based cleanup on cancellation).
  • Adds unit/integration-style tests around connection metrics emission and failure paths; wires in metrics/metrics-util dependencies.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
crates/fluss/src/rpc/server_connection.rs Emits connection-level metrics around request lifecycle; adds tests validating metrics behavior (success, send failure, API error).
crates/fluss/src/rpc/mod.rs Re-exports ApiKey as pub(crate) to support metrics helpers.
crates/fluss/src/rpc/message/header.rs Exposes REQUEST_HEADER_LENGTH for consistent byte accounting.
crates/fluss/src/metrics.rs New metrics constants + api_key label helper and tests.
crates/fluss/src/lib.rs Exposes metrics module from the crate root.
crates/fluss/Cargo.toml Adds metrics dependency and metrics-util for dev/test support.
Cargo.toml Adds workspace-level metrics dependency version.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@charlesdong1991
Copy link
Contributor Author

can i get another round of reviews? @luoyuxia @fresh-borzoni thanks!

Copy link
Contributor

@fresh-borzoni fresh-borzoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charlesdong1991 Looked through one more time, LGTM

@charlesdong1991
Copy link
Contributor Author

Just rebased, it would be great if you can take a look as well! @luoyuxia @leekeiabstraction 🙏

Thank you!

Copy link
Contributor

@leekeiabstraction leekeiabstraction left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TY for the PR! Left a couple of small comments

assert_eq!(find_counter(CLIENT_RESPONSES_TOTAL), Some(1));
assert_eq!(find_counter(CLIENT_BYTES_SENT_TOTAL), Some(256));
assert_eq!(find_counter(CLIENT_BYTES_RECEIVED_TOTAL), Some(128));
assert_eq!(find_histogram(CLIENT_REQUEST_LATENCY_MS), Some(vec![42.5]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we're missing a couple of assertions on gauge metrics?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is only 1 gauge metric now, and have tested in inflight_gauge_nets_to_zero_after_balanced_calls, but i added also here for completeness

Thanks!

@charlesdong1991
Copy link
Contributor Author

Thanks for the callout on tests @leekeiabstraction Addressed both! PTAL! cc @luoyuxia

Copy link
Contributor

@leekeiabstraction leekeiabstraction left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TY for the revision. Added a couple more comments.

(Sorry for reviewing in waves, don't have my computer with me atm so it's easy to miss things reviewing on the phone).

@charlesdong1991
Copy link
Contributor Author

thanks for suggestions! @leekeiabstraction updated

Copy link
Contributor

@leekeiabstraction leekeiabstraction left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. TY for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Client metrics framework

4 participants