Affected Version
36.0.0
Description
A timeseries query with a non-ALL granularity (e.g., DAY) over an UnnestDataSource effectively hangs when the query interval is very wide (e.g., ETERNITY). The equivalent groupBy query returns instantly.
This affects SQL queries using FLOOR(__time TO DAY) (or similar time floor expressions like TIME_FLOOR or DATE_TRUNC probably) combined with CROSS JOIN UNNEST, as the SQL planner converts these into timeseries queries with DAY granularity.
An LLM found the following root cause for the issue (this should be verified, I do not have enough knowledge to judge if it's correct):
CursorGranularizer.create() clips the query interval using timeBoundaryInspector when available. For unnest datasources where the inspector is absent or the segment boundaries are wide, the ETERNITY interval
is not clipped, causing granularity.getIterable() to generate billions of day-sized buckets. The engine then iterates over all of them, even though skipEmptyBuckets is set -- it still calls advanceToBucket() for each one.
The SQL planner's canUseQueryGranularity() check (DruidQuery.java) is meant to prevent this for non-table datasources by rejecting ETERNITY intervals. However, ExecutionVertex.isTableBased() traverses through UnnestDataSource down to the underlying TableDataSource, returning true and bypassing the safety check.
Reproduction
Submit this native query to /druid/v2 -- no datasource creation needed:
{
"queryType": "timeseries",
"dataSource": {
"type": "unnest",
"base": {
"type": "inline",
"columnNames": ["__time", "vals"],
"columnTypes": ["LONG", "ARRAY<LONG>"],
"rows": [
[1704067200000, [10, 20, 30]],
[1704153600000, [40, 50, 60]]
]
},
"virtualColumn": {
"type": "expression",
"name": "unnested",
"expression": "\"vals\"",
"outputType": "ARRAY<LONG>"
}
},
"intervals": "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z",
"granularity": "DAY",
"aggregations": [
{ "type": "count", "name": "cnt" }
],
"context": {
"skipEmptyBuckets": true,
"timeout": 10000
}
}
Expected: Returns results (2 day buckets, 3 rows each).
Actual: Times out.
Narrowing intervals to "2024-01-01/2024-01-03" returns instantly. The equivalent groupBy query also returns instantly.
SQL Reproduction
-- Times out: planner chooses timeseries with DAY granularity
SELECT
FLOOR(__time TO DAY) AS aggregation,
AVG(CAST(val AS DOUBLE)) AS avg_val
FROM "any_datasource_with_array_column"
CROSS JOIN UNNEST(array_column) AS d (val)
GROUP BY FLOOR(__time TO DAY)
ORDER BY aggregation
-- Works: planner chooses groupBy
SELECT
__time AS aggregation,
AVG(CAST(val AS DOUBLE)) AS avg_val
FROM "any_datasource_with_array_column"
CROSS JOIN UNNEST(array_column) AS d (val)
GROUP BY __time
ORDER BY aggregation
The only difference is FLOOR(__time TO DAY) vs raw __time. The former triggers a timeseries plan with DAY granularity; the latter triggers a groupBy plan that completes immediately.
The same applies to TIME_FLOOR(__time, 'P1D') and DATE_TRUNC('day', __time) -- any expression the planner recognizes as a time floor on __time.
Affected Version
36.0.0
Description
A timeseries query with a non-ALL granularity (e.g., DAY) over an UnnestDataSource effectively hangs when the query interval is very wide (e.g., ETERNITY). The equivalent groupBy query returns instantly.
This affects SQL queries using FLOOR(__time TO DAY) (or similar time floor expressions like TIME_FLOOR or DATE_TRUNC probably) combined with CROSS JOIN UNNEST, as the SQL planner converts these into timeseries queries with DAY granularity.
An LLM found the following root cause for the issue (this should be verified, I do not have enough knowledge to judge if it's correct):
CursorGranularizer.create() clips the query interval using timeBoundaryInspector when available. For unnest datasources where the inspector is absent or the segment boundaries are wide, the ETERNITY interval
is not clipped, causing granularity.getIterable() to generate billions of day-sized buckets. The engine then iterates over all of them, even though skipEmptyBuckets is set -- it still calls advanceToBucket() for each one.
The SQL planner's canUseQueryGranularity() check (DruidQuery.java) is meant to prevent this for non-table datasources by rejecting ETERNITY intervals. However, ExecutionVertex.isTableBased() traverses through UnnestDataSource down to the underlying TableDataSource, returning true and bypassing the safety check.
Reproduction
Submit this native query to /druid/v2 -- no datasource creation needed:
Expected: Returns results (2 day buckets, 3 rows each).
Actual: Times out.
Narrowing intervals to "2024-01-01/2024-01-03" returns instantly. The equivalent groupBy query also returns instantly.
SQL Reproduction
The only difference is FLOOR(__time TO DAY) vs raw __time. The former triggers a timeseries plan with DAY granularity; the latter triggers a groupBy plan that completes immediately.
The same applies to TIME_FLOOR(__time, 'P1D') and DATE_TRUNC('day', __time) -- any expression the planner recognizes as a time floor on __time.