[Druid Iceberg Extension] Add column projection support to reduce I/O and improve query performance

Currently, the Druid Iceberg extension reads ALL columns from Iceberg data files regardless of which columns are needed for ingestion. For tables with hundreds of columns, this causes:
- 10-100x more data read from storage than necessary
- Increased memory pressure during ingestion
- Slower query performance
- Higher cloud storage egress costs
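
To make the proposal concrete, here is a minimal sketch of the projection step: derive the set of columns the ingestion actually references, then prune the table's column list to that set. The class and method names (`ColumnProjectionSketch`, `requiredColumns`, `project`) are hypothetical and not part of the extension, as are the `warehouse_id` and `notes` columns. In a real implementation the resulting names would presumably be handed to Iceberg's scan API (for example `TableScan.select(...)`), but that wiring is an assumption and is not shown.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of the Druid Iceberg extension.
final class ColumnProjectionSketch {

    /** Collects the columns an ingestion references: timestamp, dimensions, metric input fields. */
    static Set<String> requiredColumns(String timestampColumn,
                                       List<String> dimensions,
                                       List<String> metricFields) {
        Set<String> required = new LinkedHashSet<>();
        required.add(timestampColumn);
        required.addAll(dimensions);
        required.addAll(metricFields);
        return required;
    }

    /** Keeps only the required columns, in table order; fails fast on names missing from the table. */
    static List<String> project(List<String> tableColumns, Collection<String> required) {
        for (String name : required) {
            if (!tableColumns.contains(name)) {
                throw new IllegalArgumentException("Column not in table schema: " + name);
            }
        }
        List<String> projected = new ArrayList<>();
        for (String column : tableColumns) {
            if (required.contains(column)) {
                projected.add(column);
            }
        }
        return projected;
    }

    public static void main(String[] args) {
        // A tiny stand-in for a wide table; the last two columns are assumed to be unused.
        List<String> tableColumns = List.of(
            "timestamp", "product_id", "category", "price", "quantity", "warehouse_id", "notes");
        Set<String> required = requiredColumns(
            "timestamp", List.of("product_id", "category"), List.of("price", "quantity"));
        System.out.println(project(tableColumns, required));
    }
}
```

Failing fast on an unknown column is a design choice in this sketch; an implementation could instead ignore missing columns to tolerate schema evolution.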

An e-commerce analytics team has an Iceberg table with 150 columns but needs only 5 of them (timestamp, product_id, category, price, quantity) for their Druid dashboard. Currently, Druid reads all 150 columns, causing:
- Query time:
- Memory:
- Data transfer: 
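
As a rough back-of-the-envelope check on this example, the sketch below computes how much the column count shrinks under projection. It counts columns only: actual byte savings depend on column types, encodings, and compression, so treating every column as equally sized is an assumption. The class name `ProjectionSavingsEstimate` is hypothetical.

```java
// Hypothetical estimate, for illustration only; assumes all columns are equally sized.
final class ProjectionSavingsEstimate {

    /** Fraction of the table's columns that are still read under projection. */
    static double fractionRead(int totalColumns, int neededColumns) {
        return (double) neededColumns / totalColumns;
    }

    /** How many times fewer columns are read compared with reading the full table. */
    static double reductionFactor(int totalColumns, int neededColumns) {
        return (double) totalColumns / neededColumns;
    }

    public static void main(String[] args) {
        int total = 150;  // columns in the Iceberg table
        int needed = 5;   // timestamp, product_id, category, price, quantity
        System.out.printf("fraction read: %.3f%n", fractionRead(total, needed));
        System.out.printf("reduction factor: %.1fx%n", reductionFactor(total, needed));
    }
}
```

For 150 columns with 5 needed, the reduction factor is 30x by column count, which falls inside the 10-100x range cited above.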

[Druid Iceberg Extension] Add column projection support to reduce I/O and improve query performance #19267
