Skip to content

feat(persistent): Cosmos persistent load test metrics recording#64

Merged
mattac21 merged 5 commits intomainfrom
ma/log-tps
Mar 17, 2026
Merged

feat(persistent): Cosmos persistent load test metrics recording#64
mattac21 merged 5 commits intomainfrom
ma/log-tps

Conversation

@mattac21
Copy link
Copy Markdown
Collaborator

@mattac21 mattac21 commented Mar 17, 2026

Closes: STACK-2504

@linear
Copy link
Copy Markdown

linear bot commented Mar 17, 2026

@mattac21 mattac21 marked this pull request as ready for review March 17, 2026 16:16
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR adds Prometheus broadcast-success/failure counters and a 30-second TPS logging tracker to the Cosmos persistent load test runner, mirroring patterns already used in the Ethereum runner. The runner integration is clean, but there is a critical registration issue in the new metrics package.

  • Critical (chains/cosmos/metrics/metrics.go): prometheus.MustRegister is called against the global default registry. Any scenario where NewRunner is called more than once with MetricsEnabled: true in the same process (including test suites) will produce a panic due to duplicate metric registration.
  • Minor (chains/cosmos/metrics/tps_logger.go): The TPSTracker does not emit a final log on shutdown, silently discarding the last partial 30-second window's counts. Close() also has no protection against a double-close panic.
  • Style (chains/cosmos/metrics/metrics.go): PromNamespace and TxMetricsNamespace are declared as var but are never mutated — they should be const.

Confidence Score: 2/5

  • Not safe to merge as-is — the MustRegister call will panic in any multi-runner or test context.
  • The prometheus.MustRegister usage in NewMetrics() is a hard panic waiting to happen as soon as the function is called more than once in a process (e.g., in tests or if the runner is restarted). All other changes are sound, but this one issue blocks safe use.
  • chains/cosmos/metrics/metrics.go requires immediate attention for the duplicate-registration panic.

Important Files Changed

Filename Overview
chains/cosmos/metrics/metrics.go Introduces Prometheus counters for broadcast success/failure. Critical issue: uses prometheus.MustRegister against the global registry, which will panic if NewMetrics() is called more than once in the same process (e.g., multiple runners or tests). Namespace vars should also be constants.
chains/cosmos/metrics/tps_logger.go Adds a background goroutine that logs TPS every 30 seconds using atomic counters. Correctly uses sync/atomic for thread safety. Two minor issues: no final flush when Close() is called (last window's data is lost), and close(t.done) is not guarded against a double-close panic.
chains/cosmos/runner/persistent.go Integrates TPS tracking and Prometheus metrics into the persistent runner. submitLoadPersistent now returns success/attempt counts. The sendTxs goroutine correctly adds an early return after error handling before recording broadcast success. Logic looks correct overall.
chains/cosmos/runner/runner.go Adds promMetrics *metrics.Metrics field to Runner and conditionally initializes it when MetricsEnabled is set. Straightforward and well-integrated with the existing structure.

Last reviewed commit: 091fe76

mattac21 and others added 2 commits March 17, 2026 12:23
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@mattac21
Copy link
Copy Markdown
Collaborator Author

@greptile rearview knowing that when running catalyst, only a single runner is ever created for this instance of catalyst, so the duplicate registration callout is not an issue.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 17, 2026

Got it - that makes sense. Since there's only one runner per Catalyst instance, the duplicate registration concern doesn't apply here.

If this single-runner-per-instance constraint isn't immediately obvious from the code or project structure, it might be worth a brief comment noting that assumption, especially if the registry is shared/global. But if it's well-understood in your architecture, feel free to disregard.

@mattac21 mattac21 merged commit cca032b into main Mar 17, 2026
3 checks passed
@mattac21 mattac21 deleted the ma/log-tps branch March 17, 2026 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants