GCS: Throw NotFoundException for inexistent input GCS file #15734

Open
findinpath wants to merge 1 commit into apache:main from findinpath:findinpath/gcs-not-found

findinpath (Contributor) commented Mar 23, 2026

Throw NotFoundException when the input GCS file does not exist. This signals early to the TableOperations that no retry is needed for files that do not exist.

Additional context

Relevant apache/iceberg code affected by this change:

Tasks.foreach(newLocation)
    .retry(numRetries)
    .exponentialBackoff(100, 5000, 600000, 4.0 /* 100, 400, 1600, ... */)
    .throwFailureWhenFinished()
    .stopRetryOn(NotFoundException.class) // overridden if shouldRetry is non-null
    .shouldRetryTest(shouldRetry)
    .run(metadataLocation -> newMetadata.set(metadataLoader.apply(metadataLocation)));
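
To illustrate why the exception type matters for the snippet above, here is a minimal, self-contained sketch of a retry loop that stops on a given exception class. It is not Iceberg's `Tasks` implementation: `RetrySketch`, its fields, and the nested `NotFoundException` are hypothetical stand-ins, jitter is omitted, and delays are recorded rather than slept. The backoff values (100 ms minimum, 5000 ms cap, factor 4.0) are taken from the call above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Simplified retry loop for illustration only; NOT Iceberg's Tasks implementation.
class RetrySketch {
  // Hypothetical stand-in for org.apache.iceberg.exceptions.NotFoundException.
  static class NotFoundException extends RuntimeException {
    NotFoundException(String message) {
      super(message);
    }
  }

  // Delays (ms) scheduled between attempts; recorded instead of slept.
  final List<Long> delays = new ArrayList<>();
  int attempts = 0;

  <T> T run(Supplier<T> task, int numRetries, Class<? extends RuntimeException> stopRetryOn) {
    long minSleepMs = 100;
    long maxSleepMs = 5000;
    double scaleFactor = 4.0;
    for (int attempt = 0; ; attempt++) {
      attempts++;
      try {
        return task.get();
      } catch (RuntimeException e) {
        // A missing file will not appear on retry, so fail immediately.
        if (stopRetryOn.isInstance(e) || attempt >= numRetries) {
          throw e;
        }
        // 100, 400, 1600, ... capped at maxSleepMs (jitter omitted).
        delays.add((long) Math.min(minSleepMs * Math.pow(scaleFactor, attempt), maxSleepMs));
      }
    }
  }

  public static void main(String[] args) {
    RetrySketch sketch = new RetrySketch();
    try {
      sketch.run(
          () -> {
            throw new NotFoundException("no such object");
          },
          4,
          NotFoundException.class);
    } catch (NotFoundException e) {
      System.out.println("attempts=" + sketch.attempts + " delays=" + sketch.delays);
    }
  }
}
```

In this sketch, a task that throws the stop class fails after one attempt, while any other runtime exception goes through the full backoff schedule before the failure is rethrown.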

The issue was found while testing GCS credentials vending against apache/iceberg-rest-fixture:1.10.1 in trinodb/trino (trinodb/trino#28423):

[qtp1357563986-31] WARN org.apache.iceberg.util.Tasks - Retrying task after failure: sleepTimeMs=403 Failed to read file: gs://trino-ci-test/gcs-vending-rest-test-w9a718ba0b/tpch/test_drop_table_with_missing_metadata_file_a422ditwmf-555e6d30e3834820993299f645ee11c1/metadata/00000-6a7be733-c273-463a-8700-0f02b17561b8.metadata.json
org.apache.iceberg.exceptions.RuntimeIOException: Failed to read file: gs://trino-ci-test/gcs-vending-rest-test-w9a718ba0b/tpch/test_drop_table_with_missing_metadata_file_a422ditwmf-555e6d30e3834820993299f645ee11c1/metadata/00000-6a7be733-c273-463a-8700-0f02b17561b8.metadata.json
	at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:311)
	at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:294)
	at org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$0(BaseMetastoreTableOperations.java:180)
	at org.apache.iceberg.BaseMetastoreTableOperations.lambda$refreshFromMetadataLocation$1(BaseMetastoreTableOperations.java:199)
	at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
	at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
	at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
	at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:199)
	at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:176)
	at org.apache.iceberg.BaseMetastoreTableOperations.refreshFromMetadataLocation(BaseMetastoreTableOperations.java:167)
	at org.apache.iceberg.jdbc.JdbcTableOperations.doRefresh(JdbcTableOperations.java:100)
	at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:88)
	at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:71)
	at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:49)
	at org.apache.iceberg.rest.CatalogHandlers.loadTable(CatalogHandlers.java:329)
	at org.apache.iceberg.rest.RESTCatalogAdapter.handleRequest(RESTCatalogAdapter.java:420)
	at org.apache.iceberg.rest.RESTServerCatalogAdapter.handleRequest(RESTServerCatalogAdapter.java:42)
	at org.apache.iceberg.rest.RESTCatalogAdapter.execute(RESTCatalogAdapter.java:628)
	at org.apache.iceberg.rest.RESTCatalogAdapter.execute(RESTCatalogAdapter.java:609)
	at org.apache.iceberg.rest.RESTCatalogServlet.execute(RESTCatalogServlet.java:108)
	at org.apache.iceberg.rest.RESTCatalogServlet.doGet(RESTCatalogServlet.java:66)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:500)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:587)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:529)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
	at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
	at org.eclipse.jetty.server.Server.handle(Server.java:563)
	at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
	at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:199)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.io.IOException: com.google.cloud.storage.StorageException: 404 Not Found
GET https://storage.googleapis.com/download/storage/v1/b/trino-ci-test/o/gcs-vending-rest-test-w9a718ba0b%2Ftpch%2Ftest_drop_table_with_missing_metadata_file_a422ditwmf-555e6d30e3834820993299f645ee11c1%2Fmetadata%2F00000-6a7be733-c273-463a-8700-0f02b17561b8.metadata.json?alt=media
No such object: trino-ci-test/gcs-vending-rest-test-w9a718ba0b/tpch/test_drop_table_with_missing_metadata_file_a422ditwmf-555e6d30e3834820993299f645ee11c1/metadata/00000-6a7be733-c273-463a-8700-0f02b17561b8.metadata.json
	at com.google.cloud.storage.BaseStorageReadChannel.read(BaseStorageReadChannel.java:143)
	at org.apache.iceberg.gcp.gcs.GCSInputStream.read(GCSInputStream.java:177)
	at org.apache.iceberg.gcp.gcs.GCSInputStream.read(GCSInputStream.java:141)
	at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.ensureLoaded(ByteSourceJsonBootstrapper.java:547)
	at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.detectEncoding(ByteSourceJsonBootstrapper.java:137)
	at com.fasterxml.jackson.core.json.ByteSourceJsonBootstrapper.constructParser(ByteSourceJsonBootstrapper.java:266)
	at com.fasterxml.jackson.core.JsonFactory._createParser(JsonFactory.java:1874)
	at com.fasterxml.jackson.core.JsonFactory.createParser(JsonFactory.java:1273)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3924)
	at org.apache.iceberg.TableMetadataParser.read(TableMetadataParser.java:309)
	... 54 more
Caused by: com.google.cloud.storage.StorageException: 404 Not Found
GET https://storage.googleapis.com/download/storage/v1/b/trino-ci-test/o/gcs-vending-rest-test-w9a718ba0b%2Ftpch%2Ftest_drop_table_with_missing_metadata_file_a422ditwmf-555e6d30e3834820993299f645ee11c1%2Fmetadata%2F00000-6a7be733-c273-463a-8700-0f02b17561b8.metadata.json?alt=media
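
In the trace above, the 404 surfaces as a `StorageException` wrapped in an `IOException`, which then becomes a retryable `RuntimeIOException`. The following is a rough sketch of the kind of translation this PR describes, not the actual patch: every type in it (`GcsReadSketch`, `Channel`, and the nested `StorageException` and `NotFoundException`) is a hypothetical stand-in for the real GCS and Iceberg classes, and the error message is made up for illustration.

```java
import java.io.IOException;

// Sketch only: every type here is a hypothetical stand-in, not the real GCS or Iceberg class.
class GcsReadSketch {
  // Stand-in for com.google.cloud.storage.StorageException, which carries an HTTP status code.
  static class StorageException extends RuntimeException {
    final int code;

    StorageException(int code, String message) {
      super(message);
      this.code = code;
    }
  }

  // Stand-in for org.apache.iceberg.exceptions.NotFoundException.
  static class NotFoundException extends RuntimeException {
    NotFoundException(Throwable cause, String message) {
      super(message, cause);
    }
  }

  // Stand-in for the channel read that fails in the trace.
  interface Channel {
    int read() throws IOException;
  }

  static int read(Channel channel, String location) throws IOException {
    try {
      return channel.read();
    } catch (IOException e) {
      // The 404 arrives as the cause of the IOException; surface it as "not found".
      if (e.getCause() instanceof StorageException
          && ((StorageException) e.getCause()).code == 404) {
        throw new NotFoundException(e, "Location does not exist: " + location);
      }
      // Anything else stays an IOException and remains eligible for retry.
      throw e;
    }
  }

  public static void main(String[] args) throws IOException {
    try {
      read(
          () -> {
            throw new IOException(new StorageException(404, "404 Not Found"));
          },
          "gs://example-bucket/missing.metadata.json");
    } catch (NotFoundException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Only the 404 case is translated here; other storage errors keep their original type, so transient failures can still be retried by the caller.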

@findinpath findinpath force-pushed the findinpath/gcs-not-found branch from d2f525d to cdde9bd Compare March 24, 2026 15:24
Signal to the TableOperations that there is no retry needed
for files which do not exist.