[RPC Metric Part 1] Support two basic metrics in RPC client : Latency and error rate by guandali · Pull Request #89 · smartcontractkit/chainlink-framework

guandali · 2026-03-13T08:07:42Z

Description

Allow the RPC client to capture two metrics, rpc_call_latency and rpc_call_errors_total, so that they can later be exported to Beholder and engineers can examine RPC reliability per endpoint and make informed decisions.
It will be part of this dashboard: https://grafana.ops.prod.cldev.sh/goto/cfgwx9lcfer5sa?orgId=1

Requires Dependencies

Resolves Dependencies

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a3174e7ea3

multinode/go.mod

vlfig

Couple of comments, grab me if you need.

vlfig · 2026-03-18T18:10:43Z

metrics/rpc_client.go

+// RPCClientMetricsConfig holds labels for RPC client metrics.
+// Empty strings are allowed; they will still be emitted as labels for filtering.
+type RPCClientMetricsConfig struct {
+	Env         string // e.g. "staging", "production"


I don't think this label should ever be populated by the application itself.

vlfig · 2026-03-18T18:13:27Z

metrics/rpc_client.go

+	Env         string // e.g. "staging", "production"
+	Network     string // chain/network name
+	ChainID     string // chain ID
+	RPCProvider string // RPC provider or node name (optional)


Where does this come from? I'd imagine this being called from logResult in rpc_client.go.

vlfig · 2026-03-18T18:18:27Z

metrics/rpc_client.go

@@ -0,0 +1,125 @@
+// RPC client observability using Beholder.


Instead of a new module I'd imagine this becoming an expansion of metrics/client.go, which already has a (promauto) latency metric, to 1) include beholder as a "target" like in metrics/multinode.go; and 2) add the request error rate metric.

vlfig · 2026-03-18T18:26:04Z

docs/rpc_observability.md

@@ -0,0 +1,54 @@
+# RPC Observability (Beholder)


Not sure you intended to commit this. I do like the idea of having a /docs folder in this style, but I think that's better pursued with a broader effort. Leaving such a slim slice here would end up being more confusing, I think.

vlfig · 2026-03-18T18:26:31Z

docs/rpc_observability.md

+
+Create `RPCClientMetrics` with `metrics.NewRPCClientMetrics(metrics.RPCClientMetricsConfig{...})` and pass it as the last argument to `multinode.NewRPCClientBase(...)`. The follow-up interface refactor will make it easier for multinode/chain integrations to supply `env`, `network`, `chain_id`, and `rpc_provider`.
+
+## Follow-up: multinode integration (PR 2)


Better not to mix the plan of what to do with the description of what is.

vlfig

Couple of comments. Do tag Dmytro once you feel we're past these. I think we'll need approval from someone other than me.

vlfig · 2026-03-20T18:13:17Z

metrics/client.go

 	RPCCallLatency = promauto.NewHistogramVec(prometheus.HistogramOpts{
 		Name: "rpc_call_latency",
-		Help: "The duration of an RPC call in milliseconds",
+		Help: "The duration of an RPC call in seconds",


You sure about this?

metrics/client.go

vlfig · 2026-03-20T18:53:13Z

metrics/client.go

+		latency:     latency,
+		errorsTotal: errorsTotal,


maybe call them latencyHist and errorsCounter for clarity?

metrics/client.go

dhaidashenko · 2026-03-24T11:48:36Z

metrics/client.go

+		Name: rpcCallLatencyBeholder,
 		Help: "The duration of an RPC call in milliseconds",
 		Buckets: []float64{
-			float64(50 * time.Millisecond),


If we change bucket size here from nanoseconds to milliseconds, we'll need to change the values that we report, but this is a breaking change. Other teams and NOPs may already depend on the values being in nanoseconds.

dhaidashenko · 2026-03-24T11:49:30Z

metrics/client.go

 		},
 	}, []string{"chainFamily", "chainID", "rpcUrl", "isSendOnly", "success", "rpcCallName"})
+
+	RPCCallErrorsTotal = promauto.NewCounterVec(prometheus.CounterOpts{


Why do we need a dedicated metric for errors? Can't we derive the value using RPCCallLatency and the success label?

Made the change, so we can now do this:
sum by (chainFamily, chainID, rpcUrl, isSendOnly, rpcCallName) ( rate(rpc_call_latency_count{success="false"}[5m]) )
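The idea behind that query can be sketched in Go, under simplifying assumptions: if every observation already carries a `success` label, the per-label observation counts are enough to derive failures without a second counter. `callKey` and `latencyCounts` are hypothetical stand-ins with a reduced label set, not the real metric, and the sketch computes a failure ratio rather than the per-second rate that the query returns.

```go
package main

import "fmt"

// callKey is a hypothetical, reduced label set; the real metric has more labels.
type callKey struct {
	rpcURL  string
	success bool
}

// latencyCounts stands in for the histogram's _count series, one per label set.
type latencyCounts map[callKey]uint64

// observe records one call under its endpoint and outcome.
func (c latencyCounts) observe(rpcURL string, success bool) {
	c[callKey{rpcURL: rpcURL, success: success}]++
}

// errorRatio derives failed/total for one endpoint from the success label alone.
func (c latencyCounts) errorRatio(rpcURL string) float64 {
	failed := c[callKey{rpcURL: rpcURL, success: false}]
	total := failed + c[callKey{rpcURL: rpcURL, success: true}]
	if total == 0 {
		return 0
	}
	return float64(failed) / float64(total)
}

func main() {
	counts := latencyCounts{}
	const url = "https://rpc.example" // placeholder endpoint
	counts.observe(url, true)
	counts.observe(url, true)
	counts.observe(url, true)
	counts.observe(url, false)
	fmt.Printf("%.2f\n", counts.errorRatio(url))
}
```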

publish two basic metrics from the RPC client into Beholder: RPC latency, RPC error rate

a3174e7

chatgpt-codex-connector bot reviewed Mar 13, 2026

multinode/go.mod (outdated, resolved)

update

24a83d3

guandali requested a review from a team as a code owner March 13, 2026 08:19

product-security-plaid-production bot requested a review from dhaidashenko March 13, 2026 08:20

guandali and others added 8 commits March 16, 2026 14:38

update

2f3fab6

update

b892ff4

revert changes

b5471e0

update

0b3bc02

update

65f6952

update

0e4e902

update

cd5fc94

Merge branch 'main' into lli/rpc-beholder-metric

2925e01

guandali mentioned this pull request Mar 17, 2026

[RPC Metric Part 2] RPC Metrics Integration in Multinode #91

Open

guandali changed the title ~~publish two basic metrics from the RPC client into Beholder: Latency and error rate~~ [RPC Metric Part 1] publish two basic metrics from the RPC client into Beholder: Latency and error rate Mar 17, 2026

guandali changed the title ~~[RPC Metric Part 1] publish two basic metrics from the RPC client into Beholder: Latency and error rate~~ [RPC Metric Part 1] Support two basic metrics in RPC client : Latency and error rate Mar 17, 2026

vlfig requested changes Mar 18, 2026

update

7a913aa

guandali requested a review from vlfig March 20, 2026 14:45

vlfig reviewed Mar 20, 2026

update

4ec82db

guandali requested review from dhaidashenko and vlfig and removed request for dhaidashenko March 23, 2026 19:18

dhaidashenko requested changes Mar 24, 2026

update

d838d80

guandali force-pushed the lli/rpc-beholder-metric branch from 822f005 to d838d80 Compare March 24, 2026 15:57

Merge branch 'main' into lli/rpc-beholder-metric

a1c4b60

3 participants
