feat: gRPC sync endpoint metrics #1861

Open
alxckn wants to merge 7 commits into open-feature:main from
alxckn:grpc_sync_metrics

Conversation

@alxckn
Member

@alxckn alxckn commented Feb 3, 2026

This PR

Adds OpenTelemetry metrics for the gRPC flag sync service to enable monitoring of sync connections and operations.

Metrics

Custom Metrics:

Metric | Type | Description
feature_flag.flagd.sync.active_streams | Gauge | Currently active streaming connections
feature_flag.flagd.sync.stream.duration | Histogram | Duration of streaming connections, in seconds

Standard gRPC Metrics:

  • Leverages otelgrpc.NewServerHandler() for comprehensive gRPC server metrics (request counts, latencies, status codes, etc.)
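
For readers unfamiliar with the OpenTelemetry gRPC instrumentation, attaching the stats handler to a server usually looks like the sketch below. It is an illustration only, not the code from this PR: the function name newSyncGRPCServer and the listen address are hypothetical, and it assumes the google.golang.org/grpc and go.opentelemetry.io/contrib otelgrpc modules are available in the build.

```go
package main

import (
	"log"
	"net"

	"go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
	"google.golang.org/grpc"
)

// newSyncGRPCServer is a hypothetical helper that builds a gRPC server with
// the OpenTelemetry stats handler installed. With no options passed, the
// handler uses the globally registered OpenTelemetry providers.
func newSyncGRPCServer() *grpc.Server {
	return grpc.NewServer(
		grpc.StatsHandler(otelgrpc.NewServerHandler()),
	)
}

func main() {
	// The address is an arbitrary choice for this sketch.
	lis, err := net.Listen("tcp", "localhost:0")
	if err != nil {
		log.Fatalf("listen: %v", err)
	}

	srv := newSyncGRPCServer()
	// The sync service would be registered on srv here before serving.
	if err := srv.Serve(lis); err != nil {
		log.Fatalf("serve: %v", err)
	}
}
```

Because the handler is a server option, the RPC handlers themselves need no changes to get the standard metrics; only the custom stream metrics require explicit calls in the handler.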

Changes

  • Add custom sync metric definitions in core/pkg/telemetry/metrics.go
  • Instrument SyncFlags streaming RPC with stream lifecycle metrics
  • Enable standard gRPC metrics via OpenTelemetry gRPC instrumentation
  • Add unit tests for metric collection, histogram buckets, and NoopMetricsRecorder

@netlify

netlify bot commented Feb 3, 2026

Deploy Preview for polite-licorice-3db33c canceled.

Name | Link
🔨 Latest commit | 8ab1c63
🔍 Latest deploy log | https://app.netlify.com/projects/polite-licorice-3db33c/deploys/69b9810e952c7f0008eed10d

@gemini-code-assist
Contributor

Summary of Changes

Hello @alxckn, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the observability of the gRPC flag synchronization endpoint by integrating comprehensive OpenTelemetry metrics. It provides critical insights into the performance and usage patterns of streaming connections and unary flag fetch operations, enabling better monitoring and troubleshooting of the synchronization service.

Highlights

  • New gRPC Sync Metrics: Introduced new OpenTelemetry metrics for gRPC synchronization, including active streams, stream duration, events sent, and unary FetchAllFlags requests.
  • Telemetry Interface Extension: The IMetricsRecorder interface and its NoopMetricsRecorder implementation have been extended to support the new gRPC sync metrics.
  • gRPC Handler Instrumentation: The SyncFlags streaming handler and FetchAllFlags unary handler are now instrumented to record detailed metrics, including stream start/end, duration with exit reasons, and request status.
  • Metrics Recorder Integration: The SyncService configuration now accepts a MetricsRecorder instance, allowing for the injection and use of the new telemetry capabilities.

Changelog

  • core/pkg/telemetry/metrics.go
    • Added new constants for gRPC sync metric names (e.g., grpcSyncActiveStreamsMetric, grpcSyncStreamDurationMetric).
    • Extended IMetricsRecorder interface with new methods for gRPC sync operations (SyncStreamStart, SyncStreamEnd, SyncStreamDuration, SyncEventSent, FetchAllFlagsRequest).
    • Implemented the new IMetricsRecorder methods as no-ops in NoopMetricsRecorder.
    • Added new metric instruments (syncActiveStreams, syncStreamDuration, syncEventsSent, fetchAllFlagsRequests) to the MetricsRecorder struct.
    • Implemented the logic for recording these new metrics within the MetricsRecorder methods.
    • Configured a new OpenTelemetry view for grpcSyncStreamDurationMetric and initialized the new gRPC sync metric instruments in NewOTelRecorder.
  • flagd/pkg/runtime/from_config.go
    • Modified FromConfig to pass the recorder (metrics recorder) to the SyncService configuration.
  • flagd/pkg/service/flag-sync/handler.go
    • Imported telemetry and attribute packages.
    • Added metricsRecorder field to the syncHandler struct.
    • Instrumented the SyncFlags streaming handler to record SyncStreamStart, SyncStreamEnd, SyncStreamDuration (with exit reasons), and SyncEventSent metrics.
    • Instrumented the FetchAllFlags unary handler to record FetchAllFlagsRequest metrics, including request status (ok/error).
  • flagd/pkg/service/flag-sync/handler_test.go
    • Imported telemetry package.
    • Updated test cases for TestSyncHandler_SyncFlags and TestSyncHandler_SelectorLocationPrecedence to inject a telemetry.NoopMetricsRecorder into the syncHandler for testing purposes.
  • flagd/pkg/service/flag-sync/sync_service.go
    • Imported telemetry package.
    • Added MetricsRecorder field to the SvcConfigurations struct.
    • Modified NewSyncService to accept and pass the MetricsRecorder from SvcConfigurations to the syncHandler, defaulting to NoopMetricsRecorder if not provided.
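
The "default to a no-op recorder" pattern mentioned in the last item can be sketched as follows. All types here are hypothetical, trimmed-down stand-ins (the real interface in the summary has more methods); the point is only that the constructor substitutes a no-op implementation so the handler never needs a nil check.

```go
package main

import "fmt"

// metricsRecorder is a hypothetical, reduced version of a recorder interface.
type metricsRecorder interface {
	SyncStreamStart()
	SyncStreamEnd()
}

// noopRecorder satisfies metricsRecorder and discards every call.
type noopRecorder struct{}

func (noopRecorder) SyncStreamStart() {}
func (noopRecorder) SyncStreamEnd()   {}

// countingRecorder is a test double that counts calls.
type countingRecorder struct{ started, ended int }

func (c *countingRecorder) SyncStreamStart() { c.started++ }
func (c *countingRecorder) SyncStreamEnd()   { c.ended++ }

// svcConfig mirrors the idea of an optional recorder in the service config.
type svcConfig struct {
	Recorder metricsRecorder // nil means "metrics disabled"
}

type syncHandler struct{ recorder metricsRecorder }

// newSyncHandler falls back to the no-op recorder when none is configured.
func newSyncHandler(cfg svcConfig) *syncHandler {
	rec := cfg.Recorder
	if rec == nil {
		rec = noopRecorder{}
	}
	return &syncHandler{recorder: rec}
}

// handleStream stands in for a streaming RPC; it only records lifecycle events.
func (h *syncHandler) handleStream() {
	h.recorder.SyncStreamStart()
	defer h.recorder.SyncStreamEnd()
}

func main() {
	newSyncHandler(svcConfig{}).handleStream() // safe: uses the no-op recorder

	c := &countingRecorder{}
	newSyncHandler(svcConfig{Recorder: c}).handleStream()
	fmt.Println("started:", c.started, "ended:", c.ended)
}
```

The same no-op type is convenient in unit tests that exercise the handler without caring about metrics.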

Activity

  • This pull request introduces a new feature related to gRPC sync endpoint metrics.
  • It addresses and fixes issue #1234523.
  • No specific human activity (e.g., comments, reviews) has been recorded for this pull request yet.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@alxckn alxckn force-pushed the grpc_sync_metrics branch from 6a021ab to 4cd381d Compare February 3, 2026 19:26
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces metrics for the gRPC sync endpoint, which is a great addition for observability. The implementation is solid, adding new metrics, updating the recorder interface, and integrating metric collection into the sync handlers. I've suggested one refactoring to improve code clarity and reduce duplication in the FetchAllFlags handler. Overall, this is a well-executed feature.

@alxckn alxckn force-pushed the grpc_sync_metrics branch 2 times, most recently from 09a5f5e to 0d8ad6f Compare February 10, 2026 14:23
@alxckn alxckn marked this pull request as ready for review February 10, 2026 15:24
@alxckn alxckn requested review from a team as code owners February 10, 2026 15:24
@dosubot dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) Feb 10, 2026
@toddbaert
Member

Good idea @alxckn, but I think some of these are close to the OTel defaults. I propose we:

use otelgrpc.NewServerHandler() on the sync service's gRPC server. IIUC, this gives us automatically:

  • rpc.server.duration
  • rpc.server.request.size / rpc.server.response.size
  • Message counts per method
  • Error rates by code

for the custom metrics from this PR, use:

  • grpc.sync.active_streams
  • grpc.sync.stream.duration

I think this will give us something more standard for the standard metrics, but also the really nice custom metrics you've invented that are specific to flagd.

WDYT?

@alxckn
Member Author

alxckn commented Feb 12, 2026

@toddbaert @erka thank you for your reviews. I replaced the custom metrics with the OTel defaults and kept the two custom metrics that add value (renamed to be more consistent with the custom metrics defined in the connect service).

Other review comments and Sonar recommendations have been addressed.

@toddbaert toddbaert requested a review from guidobrei February 17, 2026 16:26
@alxckn
Member Author

alxckn commented Mar 17, 2026

@toddbaert do you think this PR can be merged and released soon? I see you requested additional reviews.

I will take care of the conflicts

alxckn added 7 commits March 17, 2026 17:22
Signed-off-by: Alexandre Chakroun <achakroun@macmail.fr>
@alxckn alxckn force-pushed the grpc_sync_metrics branch from a790343 to 8ab1c63 Compare March 17, 2026 16:27
@beeme1mr beeme1mr requested a review from toddbaert March 17, 2026 17:27
Labels

size:L This PR changes 100-499 lines, ignoring generated files.

4 participants