Better cardinality estimation for queries with many booleans (especially those not matched to any indices) #8940

Open
dyemanov wants to merge 3 commits into master from work/optimizer-selectivity-backoff


dyemanov (Member) commented Mar 11, 2026

  • Refactor selectivity estimations
  • Apply exponential backoff to the selectivity combined from multiple booleans

In real-world SQL queries, conjuncts (ANDed booleans) are often inter-dependent, so simple multiplication of their selectivities (which we have used so far) results in a very low final selectivity value, causing the stream cardinality to be underestimated. To avoid this, apply an exponential backoff adjustment:

sel = sel1 * sqrt(sel2) * sqrt(sqrt(sel3)) * ...

where sel1 is the lowest (best) selectivity and selN is the highest (worst) one.

I don't claim this is the best possible approach, but it seems to work well for MSSQL, so it's worth trying (in fact, it has already been tested in production, with quite good results so far).
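
For illustration only, the backoff formula above can be sketched in a few lines of Python. This is not the code from the PR: the function names `combined_selectivity` and `naive_selectivity` are hypothetical, and the sketch assumes the backoff is applied to every conjunct with no cap on the number of terms, which the formula's trailing "..." leaves unspecified.

```python
import math


def naive_selectivity(selectivities):
    """Plain product of selectivities (assumes independent conjuncts)."""
    return math.prod(selectivities)


def combined_selectivity(selectivities):
    """Combine conjunct selectivities with exponential backoff.

    Hypothetical helper, not the PR's implementation. Selectivities are
    sorted ascending so the most selective conjunct is applied in full,
    the next one under a square root, the next under a fourth root, etc.
    """
    result = 1.0
    for i, sel in enumerate(sorted(selectivities)):
        # Exponent halves at each step: 1, 1/2, 1/4, ...
        result *= sel ** (1.0 / (2 ** i))
    return result


if __name__ == "__main__":
    sels = [0.25, 0.01, 0.0625]
    print(naive_selectivity(sels))
    print(combined_selectivity(sels))
```

For the selectivities 0.01, 0.0625 and 0.25, plain multiplication gives 0.00015625, while the backoff gives 0.01 * 0.25 * sqrt(0.5), roughly 0.00177, so the estimated cardinality is about eleven times higher.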
