Add proposal for per-tenant TSDB status API #7335
CharlieTLe wants to merge 1 commit into cortexproject:master
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Currently, Cortex tenants lack visibility into which metrics, labels, and label-value pairs contribute the most series in ingesters. Without this information, debugging high-cardinality issues requires operators to inspect TSDB internals directly on ingester instances, which is impractical in a multi-tenant, distributed environment.

Prometheus itself exposes a `/api/v1/status/tsdb` endpoint that provides cardinality statistics from the TSDB head. This proposal brings equivalent functionality to Cortex as a multi-tenant, distributed API.
I am not a fan of the TSDB status API name. The Prometheus API might change and add more fields. Would a dedicated `api/v1/cardinality` endpoint be better?
## Out of Scope

- **Long-term storage cardinality analysis**: This endpoint only covers in-memory TSDB head data in ingesters. Analyzing cardinality across compacted blocks in object storage is a separate concern. A future long-term cardinality API could reuse portable fields (see [Extensibility](#extensibility-to-long-term-storage)) or introduce a separate endpoint.
Do we plan to have a different API for long-term storage cardinality? We should aim for the same API endpoint, even though we don't have to design for it now.
Expose per-tenant TSDB head cardinality statistics via a REST API endpoint on the Cortex query path. The endpoint should:

1. Be compatible with the Prometheus `/api/v1/status/tsdb` response format.
I am not sure this needs to be part of the goal. Does it need to be compatible?
I think our API response format is already incompatible today.
| ``` | ||
- **Authentication**: Requires `X-Scope-OrgID` header (standard Cortex tenant authentication).
- **Query Parameter**: `limit` (optional, default 10) - controls the number of top items returned per category.
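The two request requirements above can be illustrated with a small client-side sketch. It only builds the request and does not send it. The helper name `buildTSDBStatusRequest`, the base URL, and the tenant ID are hypothetical, and the sketch assumes the endpoint is served at `/api/v1/status/tsdb` with no additional path prefix, which may not match a given Cortex deployment.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strconv"
)

// buildTSDBStatusRequest is a hypothetical helper (not part of Cortex).
// It builds a GET request for the proposed endpoint, setting the tenant
// header and, when limit is positive, the optional "limit" parameter.
func buildTSDBStatusRequest(baseURL, tenantID string, limit int) (*http.Request, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return nil, err
	}
	// Assumption: no path prefix in front of the endpoint.
	u.Path = "/api/v1/status/tsdb"
	if limit > 0 {
		q := u.Query()
		q.Set("limit", strconv.Itoa(limit))
		u.RawQuery = q.Encode()
	}
	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	// Standard Cortex tenant header, as named in the proposal.
	req.Header.Set("X-Scope-OrgID", tenantID)
	return req, nil
}

func main() {
	req, err := buildTSDBStatusRequest("http://cortex.example:8080", "tenant-a", 20)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String(), req.Header.Get("X-Scope-OrgID"))
}
```

Passing a non-positive `limit` leaves the parameter off, so the server-side default of 10 described above would apply.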
message TSDBStatusResponse {
  uint64 num_series = 1;
  int64 min_time = 2;
  int64 max_time = 3;
Do we need the min and max times? How do we aggregate them in the final response: `min(min_t)` and `max(max_t)`?
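The aggregation suggested in this comment could be sketched as follows, assuming each ingester reports its own head time range and the merged response takes the earliest minimum and the latest maximum. The `headRange` type and `mergeHeadRanges` function are hypothetical names for illustration. They are not taken from the proposal or from Cortex, and the proposal does not confirm that this is the intended behavior.

```go
package main

import "fmt"

// headRange is a hypothetical stand-in for the min_time and max_time
// fields that a single ingester would report.
type headRange struct {
	MinTime int64
	MaxTime int64
}

// mergeHeadRanges returns min(min_t) and max(max_t) over all ingester
// responses. The boolean is false when there is nothing to merge.
func mergeHeadRanges(ranges []headRange) (headRange, bool) {
	if len(ranges) == 0 {
		return headRange{}, false
	}
	merged := ranges[0]
	for _, r := range ranges[1:] {
		if r.MinTime < merged.MinTime {
			merged.MinTime = r.MinTime
		}
		if r.MaxTime > merged.MaxTime {
			merged.MaxTime = r.MaxTime
		}
	}
	return merged, true
}

func main() {
	merged, ok := mergeHeadRanges([]headRange{
		{MinTime: 100, MaxTime: 200},
		{MinTime: 50, MaxTime: 150},
		{MinTime: 120, MaxTime: 300},
	})
	fmt.Println(merged.MinTime, merged.MaxTime, ok)
}
```

Returning an explicit boolean for the empty case avoids reporting a zero range that a client could mistake for real timestamps.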
2. **`chunkCount` omitted**: Prometheus includes a `chunkCount` field (from `prometheus_tsdb_head_chunks`). In a distributed system with replication, chunk counts across ingesters cannot be meaningfully aggregated: chunks are an ingester-local storage detail, and neither summing them nor dividing by the replication factor produces a useful number.

**Open question**: Should we adopt the `headStats` wrapper to maintain client compatibility with Prometheus tooling? The trade-off is compatibility versus simplicity: the flat format is easier to consume for Cortex-specific clients, but adopting the Prometheus format would allow reuse of existing client libraries.
Does any Prometheus tool consume this today? Why is compatibility a concern?
| `labelValueCountByLabelName` | No | Portable to block storage |
| `seriesCountByLabelValuePair` | No | Portable to block storage |
| `memoryInBytesByLabelName` | **Yes** | In-memory byte usage has no analogue in object storage |
| `minTime` / `maxTime` | **Yes** | Reflects head time range, not total storage |
Do we need to add those head-specific fields?
Design proposal for the /api/v1/status/tsdb endpoint that exposes per-tenant TSDB head cardinality statistics. Covers architecture (Distributor fan-out to Ingesters), gRPC definitions, aggregation logic, Prometheus compatibility trade-offs, extensibility to long-term storage, and Distributor vs Querier routing alternatives.