[BugFix] Fix PushDownContext shallow copy bug by songkant-aws · Pull Request #5199 · opensearch-project/sql

songkant-aws · 2026-03-04T08:54:40Z

Description

PushdownContext.clone() / cloneWithoutSort() / cloneForAggregate() replays add(operation) to construct new context. Aggregation operations reuses the same AggPushDownAction instance in both old and new context due to shallow copy side effect happened during cloning PushdownContext. The following pushdown actions from other equivalent RelSubsets will modify this shared mutable action object, which could contaminate previous context's agg states.

Related Issues

Resolves #5125

Check List

New functionality includes testing.
New functionality has been documented.
New functionality has javadoc added.
New functionality has a user manual doc added.
New PPL command checklist all confirmed.
API changes companion pull request created.
Commits are signed per the DCO using --signoff or -s.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-03-04T08:56:48Z

PR Reviewer Guide 🔍

(Review updated until commit `05a2ed2`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
📝 TODO sections opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java [438]: for MultiTermsAggregationBuilder
🔀 Multiple PR themes Sub-PR theme: Introduce immutable AggSpec and lazy build to fix shallow copy bug Relevant files: opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/PushDownContext.java opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/AbstractCalciteIndexScan.java opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java opensearch/src/test/java/org/opensearch/sql/opensearch/storage/scan/CalciteIndexScanCostTest.java Sub-PR theme: Refactor AggPushDownAction copy helpers and histogram builder options Relevant files: opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequestBuilder.java
⚡ Recommended focus areas for review Null Dereference In `pushDownSortAggregateMeasure` (CalciteLogicalIndexScan.java), `pushDownContext.getAggSpec()` is checked for null before calling `withSortMeasure`. However, in `AggSpec.withSortMeasure`, `measureSortTarget` can be null (it is `@Nullable`), and the method throws `PushDownUnSupportedException` when it is null. This is expected behavior, but in `AggSpec.withBucketSort`, if `bucketNames` does not contain the bucket name from the collation, it throws `PushDownUnSupportedException`. The concern is that `withoutBucketSort` resets `bucketNames` to `initialBucketNames`, but if `withBucketSort` is called with a name not in `initialBucketNames`, the error message says "Cannot pushdown sort into aggregation bucket" without indicating which name was missing, making debugging harder. This is minor but worth noting. public AggSpec withBucketSort(List<RelFieldCollation> collations, List<String> fieldNames) { if (kind != AggKind.COMPOSITE && kind != AggKind.TERMS) { throw new OpenSearchRequestBuilder.PushDownUnSupportedException( "Cannot pushdown sort into aggregation bucket"); } List<String> newBucketNames = bucketNames; if (kind == AggKind.COMPOSITE) { List<String> reordered = new ArrayList<>(bucketNames.size()); List<String> selected = new ArrayList<>(collations.size()); for (RelFieldCollation collation : collations) { String bucketName = fieldNames.get(collation.getFieldIndex()); if (!bucketNames.contains(bucketName)) { throw new OpenSearchRequestBuilder.PushDownUnSupportedException( "Cannot pushdown sort into aggregation bucket"); } reordered.add(bucketName); selected.add(bucketName); } for (String name : bucketNames) { if (!selected.contains(name)) { reordered.add(name); } } newBucketNames = reordered; } Bucket Mutation In `pushDownSortIntoCompositeBucket`, the code calls `bucket.order(order)` and `bucket.missingOrder(missingOrder)` directly on the existing `CompositeValuesSourceBuilder` instances from `compositeAggBuilder.sources()` before adding them to `newBuckets`. These mutations affect the original bucket objects that are shared with the existing `CompositeAggregationBuilder`. Since `AggSpec.build()` re-runs `AggregateAnalyzer.analyze` to get a fresh builder, this is safe in the new design. However, if the same `AggPushDownAction` instance is reused across multiple `build()` calls (which it is not, since `build()` creates a new `AggPushDownAction` each time), this would be a problem. The current design appears safe, but the mutation of source objects is fragile. List<CompositeValuesSourceBuilder<?>> buckets = compositeAggBuilder.sources(); List<CompositeValuesSourceBuilder<?>> newBuckets = new ArrayList<>(buckets.size()); List<String> newBucketNames = new ArrayList<>(buckets.size()); List<String> selected = new ArrayList<>(collations.size()); // Have to put the collation required buckets first, then the rest of buckets. collations.forEach( collation -> { /* Must find the bucket by field name because: 1. The sequence of buckets may have changed after sort push-down. 2. The schema of scan operator may be inconsistent with the sequence of buckets after project push-down. / String bucketName = fieldNames.get(collation.getFieldIndex()); CompositeValuesSourceBuilder<?> bucket = buckets.get(bucketNames.indexOf(bucketName)); RelFieldCollation.Direction direction = collation.getDirection(); RelFieldCollation.NullDirection nullDirection = collation.nullDirection; SortOrder order = RelFieldCollation.Direction.DESCENDING.equals(direction) ? SortOrder.DESC : SortOrder.ASC; if (bucket.missingBucket()) { MissingOrder missingOrder = switch (nullDirection) { case FIRST -> MissingOrder.FIRST; case LAST -> MissingOrder.LAST; default -> MissingOrder.DEFAULT; }; bucket.missingOrder(missingOrder); } newBuckets.add(bucket.order(order)); newBucketNames.add(bucketName); selected.add(bucketName); }); buckets.stream() .map(CompositeValuesSourceBuilder::name) .filter(name -> !selected.contains(name)) .forEach( name -> { newBuckets.add(buckets.get(bucketNames.indexOf(name))); newBucketNames.add(name); }); AggregationBuilder finalBuilder = AggregationBuilders.composite(compositeAggBuilder.getName(), newBuckets) .subAggregations(copySubAggregations(compositeAggBuilder)) .size(compositeAggBuilder.size()); replaceRootBuilder(original, finalBuilder); bucketNames = newBucketNames; Redundant Build Call* In `AggSpec.create`, a temporary `AggPushDownAction` is constructed solely to call `getScriptCount()`. This re-analyzes the aggregation just to count scripts, which is wasteful since `builderAndParser` is already available. The `AggPushDownAction` constructor is called with the full builder just to extract `scriptCount`, and the action object is discarded. This could be simplified by computing `scriptCount` directly from the builder without constructing a full `AggPushDownAction`. new AggPushDownAction(builderAndParser, extendedTypeMapping, bucketNames).getScriptCount(),

github-actions · 2026-03-04T08:57:33Z

PR Code Suggestions ✨

Latest suggestions up to 05a2ed2

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Prevent duplicate bucket names in sort pushdown The `withBucketSort` method adds `bucketName` to both `reordered` and `selected` without checking for duplicates. If the same field appears in multiple collations, it will be added to `reordered` multiple times, leading to duplicate bucket names and potentially incorrect query generation. Add a duplicate check before adding to `reordered`. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java [192-215] public AggSpec withBucketSort(List<RelFieldCollation> collations, List<String> fieldNames) { if (kind != AggKind.COMPOSITE && kind != AggKind.TERMS) { throw new OpenSearchRequestBuilder.PushDownUnSupportedException( "Cannot pushdown sort into aggregation bucket"); } List<String> newBucketNames = bucketNames; if (kind == AggKind.COMPOSITE) { List<String> reordered = new ArrayList<>(bucketNames.size()); List<String> selected = new ArrayList<>(collations.size()); for (RelFieldCollation collation : collations) { String bucketName = fieldNames.get(collation.getFieldIndex()); if (!bucketNames.contains(bucketName)) { throw new OpenSearchRequestBuilder.PushDownUnSupportedException( "Cannot pushdown sort into aggregation bucket"); } - reordered.add(bucketName); - selected.add(bucketName); + if (!selected.contains(bucketName)) { + reordered.add(bucketName); + selected.add(bucketName); + } } Suggestion importance[1-10]: 5 __ Why: The suggestion is logically valid - duplicate collation fields could cause duplicate entries in `reordered`. However, in practice, SQL/PPL query planners typically don't generate duplicate collation fields, making this an edge case. The fix is straightforward and correct.	Low
General	Preserve root cause in exception chaining In the second catch block, the variable `ignored` shadows the outer `ignored` variable and then immediately throws a new exception, discarding the original cause. The original exception should be chained to preserve the root cause for debugging. Pass the caught exception as the cause to `PushDownUnSupportedException`. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java [288-305] private static void copyDateHistogramInterval( DateHistogramValuesSourceBuilder source, Consumer<DateHistogramInterval> fixedIntervalSetter, Consumer<DateHistogramInterval> calendarIntervalSetter) { try { fixedIntervalSetter.accept(source.getIntervalAsFixed()); return; } catch (IllegalArgumentException \| IllegalStateException ignored) { // Fallback to calendar interval. } try { calendarIntervalSetter.accept(source.getIntervalAsCalendar()); return; - } catch (IllegalArgumentException \| IllegalStateException ignored) { + } catch (IllegalArgumentException \| IllegalStateException cause) { throw new OpenSearchRequestBuilder.PushDownUnSupportedException( - "Cannot copy interval for date histogram bucket " + source.name()); + "Cannot copy interval for date histogram bucket " + source.name(), cause); } } Suggestion importance[1-10]: 5 __ Why: The suggestion correctly identifies that the caught exception in the second `catch` block is discarded, losing the root cause. The `PushDownUnSupportedException(String, Throwable)` constructor was added in this PR specifically to support this pattern, making the fix directly applicable and improving debuggability.	Low
	Avoid unexpected exception in feasibility check method The `canPushDownLimitIntoBucketSize` method throws an exception for `UNSUPPORTED` mode, but callers may reasonably expect a boolean return to check feasibility without catching exceptions. The `UNSUPPORTED` case should return `false` instead of throwing, and the exception should only be thrown in `withLimit` where the actual push-down is attempted. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java [181-190] public boolean canPushDownLimitIntoBucketSize(int size) { return switch (limitPushdownMode) { case BUCKET_SIZE -> bucketSize != null && size < bucketSize; case LEAF_METRIC -> true; - case ESTIMATE_ONLY -> false; - case UNSUPPORTED -> - throw new OpenSearchRequestBuilder.PushDownUnSupportedException( - "Cannot pushdown limit into aggregation bucket"); + case ESTIMATE_ONLY, UNSUPPORTED -> false; }; } Suggestion importance[1-10]: 4 __ Why: The suggestion has merit - a method named `canPushDownLimitIntoBucketSize` implies a boolean check, and throwing from it is surprising. However, the current callers in `pushDownLimit` already handle this via try-catch, so the practical impact is limited. The `withLimit` method also throws for `UNSUPPORTED`, so the behavior is somewhat consistent.	Low
	Strengthen test assertions for sort correctness The test verifies that consecutive sorts honor the latest sort direction (`sort` `gender \| sort - gender`), expecting descending order. However, the test only checks the order of rows but doesn't verify that the final sort is indeed descending (M before F). Consider also asserting that the count values are correct per gender to make the test more robust and meaningful. integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5125.yml [66] +- match: { total: 2 } +- match: { schema: [ { name: c, type: bigint }, { name: gender, type: string } ] } - match: { datarows: [ [ 4, "M" ], [ 3, "F" ] ] } +- gt: { datarows.0.0: 0 } Suggestion importance[1-10]: 2 __ Why: The test already verifies the exact `datarows` content including both count values and gender ordering (`[ [ 4, "M" ], [ 3, "F" ] ]`), which implicitly confirms descending sort order and correct counts. The `improved_code` adds a redundant `gt` assertion that doesn't meaningfully improve coverage, and the suggestion mischaracterizes what the existing assertions already validate.	Low

Previous suggestions

Suggestions up to commit af6c695

Category	Suggestion	Impact
Possible issue	Fix unreachable throw inside catch block The `throw` inside the `catch` block will never be reached because the exception is caught and then immediately re-thrown as a different exception type, but the logic is actually correct — however, the `return` after `calendarIntervalSetter.accept(...)` is unreachable since the throw follows it. More critically, the `throw` is inside the `catch` block for the calendar interval, meaning a failure to get the calendar interval will throw `PushDownUnSupportedException`, which is correct, but the `return` statement before the closing brace of the `try` is unreachable. This is a logic error in `AggPushDownAction.copyDateHistogramInterval` — the `throw` should be outside both `catch` blocks. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java [298-304] try { calendarIntervalSetter.accept(source.getIntervalAsCalendar()); return; } catch (IllegalArgumentException \| IllegalStateException ignored) { - throw new OpenSearchRequestBuilder.PushDownUnSupportedException( - "Cannot copy interval for date histogram bucket " + source.name()); + // Both fixed and calendar interval failed. } +throw new OpenSearchRequestBuilder.PushDownUnSupportedException( + "Cannot copy interval for date histogram bucket " + source.name()); Suggestion importance[1-10]: 7 __ Why: The `throw` inside the `catch` block for the calendar interval is indeed inside the catch, meaning it correctly handles the failure case. However, the `return` before the closing brace of the inner `try` is unreachable after the throw. The suggestion correctly identifies that the throw should be outside both catch blocks for clarity and correctness, though the current code does functionally throw when both intervals fail.	Medium
General	Ensure plugin setting reset on teardown failure The teardown disables the Calcite plugin after deleting the index, but if the index deletion fails, the plugin setting may not be reset. Consider placing the settings reset before the index deletion to ensure the plugin is always disabled regardless of index deletion outcome. integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5125.yml [38-47] teardown: - - do: - indices.delete: - index: issue5125 - ignore_unavailable: true - do: query.settings: body: transient: plugins.calcite.enabled: false + - do: + indices.delete: + index: issue5125 + ignore_unavailable: true Suggestion importance[1-10]: 4 __ Why: The suggestion to reorder teardown steps so `plugins.calcite.enabled` is reset before index deletion is a reasonable defensive practice, but the impact is low since `ignore_unavailable: true` already handles index deletion failures gracefully, and the plugin setting reset is unlikely to be critical in test teardown scenarios.	Low
	Cache aggregation build result to avoid repeated recomputation `aggSpec.build()` is called every time `createRequestBuilder()` is invoked, which re-runs `AggregateAnalyzer.analyze` and all push-down operations on each call. If `createRequestBuilder()` is called multiple times (e.g., during planning), this could produce inconsistent results or be expensive. Consider caching the result of `aggSpec.build()` or ensuring `createRequestBuilder()` is only called once. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/PushDownContext.java [213-216] if (aggSpec != null) { - newRequestBuilder.pushDownAggregation(aggSpec.build()); + Pair<List<AggregationBuilder>, OpenSearchAggregationResponseParser> built = aggSpec.build(); + newRequestBuilder.pushDownAggregation(built); newRequestBuilder.pushTypeMapping(aggSpec.getExtendedTypeMapping()); } Suggestion importance[1-10]: 3 __ Why: The suggestion raises a valid concern about repeated calls to `aggSpec.build()`, but the `improved_code` only stores the result in a local variable within the same method call — it doesn't actually cache across multiple `createRequestBuilder()` invocations. The improvement is minimal and doesn't address the root concern.	Low
	Avoid unnecessary object instantiation for script count In `AggSpec.create`, a new `AggPushDownAction` is instantiated solely to call `getScriptCount()`. This is wasteful and couples `AggSpec` to `AggPushDownAction`'s internal state. The script count should be computed directly from the `builderAndParser` using a static utility method or inline logic, avoiding the unnecessary object creation. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java [160] -new AggPushDownAction(builderAndParser, extendedTypeMapping, bucketNames).getScriptCount(), +AggPushDownAction.getScriptCount(builderAndParser), Suggestion importance[1-10]: 2 __ Why: The suggestion proposes calling a static `getScriptCount` method on `AggPushDownAction`, but no such static method exists in the PR diff — `getScriptCount` is an instance method. The `improved_code` references a non-existent API, making this suggestion incorrect as written.	Low

Suggestions up to commit 11db491

Category	Suggestion	Impact
General	Preserve root cause in exception re-throw In the second `catch` block, the `ignored` variable shadows the outer `ignored` variable from the first `catch` block. While this compiles, it is confusing and may hide the actual exception. The inner catch should re-throw the `PushDownUnSupportedException` with the caught exception as the cause so the root cause is not lost. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java [287-304] private static void copyDateHistogramInterval( DateHistogramValuesSourceBuilder source, Consumer<DateHistogramInterval> fixedIntervalSetter, Consumer<DateHistogramInterval> calendarIntervalSetter) { try { fixedIntervalSetter.accept(source.getIntervalAsFixed()); return; } catch (IllegalArgumentException \| IllegalStateException ignored) { // Fallback to calendar interval. } try { calendarIntervalSetter.accept(source.getIntervalAsCalendar()); - return; - } catch (IllegalArgumentException \| IllegalStateException ignored) { + } catch (IllegalArgumentException \| IllegalStateException e) { throw new OpenSearchRequestBuilder.PushDownUnSupportedException( - "Cannot copy interval for date histogram bucket " + source.name()); + "Cannot copy interval for date histogram bucket " + source.name(), e); } } Suggestion importance[1-10]: 5 __ Why: The suggestion correctly identifies that the caught exception in the second `catch` block is discarded, losing the root cause. Preserving the cause via `PushDownUnSupportedException(message, e)` improves debuggability, and the new constructor was added in this PR. The variable shadowing concern is also valid.	Low
General	Reset stale sort collations after rare-top rewrite When `withRareTop` is called on an `AggSpec` that already has `kind == AggKind.RARE_TOP` (e.g., if called twice), the guard `kind != AggKind.COMPOSITE` will throw an exception even though the transition might be valid. Additionally, the `bucketSortCollations` from the current spec are passed through, but after a `RARE_TOP` rewrite the composite bucket structure changes, making the previous sort collations potentially invalid. Consider resetting `bucketSortCollations` and `bucketSortFieldNames` to `null` in the returned spec. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java [306-333] public AggSpec withRareTop(RareTopDigest digest) { if (kind != AggKind.COMPOSITE \|\| !rareTopSupported) { throw new OpenSearchRequestBuilder.PushDownUnSupportedException("Cannot pushdown " + digest); } return new AggSpec( - ... + aggregate, + project, + outputFields, + rowType, + fieldTypes, + cluster, + bucketNullable, + queryBucketSize, + extendedTypeMapping, + initialBucketNames, + bucketNames, + scriptCount, AggKind.RARE_TOP, LimitPushdownMode.BUCKET_SIZE, - ... + null, + rareTopSupported, + null, + null, + measureSortCollations, + measureSortFieldNames, + digest, digest.byList().isEmpty() ? digest.number() : DEFAULT_MAX_BUCKETS); } Suggestion importance[1-10]: 4 __ Why: The suggestion raises a valid concern that `bucketSortCollations` from a composite aggregation may be stale after a `RARE_TOP` rewrite. However, the `improved_code` doesn't match the `existing_code` accurately (it uses `...` placeholders in the existing code), making it hard to verify the exact change. The concern about resetting collations is reasonable but the impact depends on whether `withRareTop` can be called after a bucket sort has been applied.	Low
Possible issue	Handle missing enum case in switch expression The `LEAF_METRIC` case is missing from the `inferLimitPushdownMode(AggKind)` switch expression. If `AggKind.LEAF_METRIC` is ever passed to this method (e.g., via `withSortMeasure`), the switch will throw an `IllegalArgumentException` at runtime because the enum value is not handled. Add a case for `LEAF_METRIC` to make the switch exhaustive. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java [429-434] private static LimitPushdownMode inferLimitPushdownMode(AggKind kind) { return switch (kind) { case COMPOSITE, TERMS, MULTI_TERMS, TOP_HITS, RARE_TOP -> LimitPushdownMode.BUCKET_SIZE; case OTHER, DATE_HISTOGRAM, HISTOGRAM -> LimitPushdownMode.UNSUPPORTED; + case LEAF_METRIC -> LimitPushdownMode.LEAF_METRIC; }; } Suggestion importance[1-10]: 2 __ Why: The `AggKind` enum does not have a `LEAF_METRIC` value - looking at the enum definition in the file (lines 41-50), the values are `OTHER`, `COMPOSITE`, `TERMS`, `MULTI_TERMS`, `DATE_HISTOGRAM`, `HISTOGRAM`, `TOP_HITS`, and `RARE_TOP`. The switch is already exhaustive over all defined enum values, so the suggestion's premise is incorrect.	Low

Suggestions up to commit 84892bc

Suggestions up to commit f36c20a

Category	Suggestion	Impact
General	Fix unreachable return after calendar interval setter The `throw` inside the `catch` block will never be reached because the exception is caught and then the throw is inside the same catch block — the throw is actually the fallback when the calendar interval also fails, but it's placed inside the catch for the calendar interval, which means it only executes when `getIntervalAsCalendar()` throws. This logic is correct, but the `return` statement after `calendarIntervalSetter.accept(...)` is unreachable if an exception is thrown. However, the real issue is that if `getIntervalAsCalendar()` succeeds but then `calendarIntervalSetter.accept()` throws, the exception will be swallowed and re-thrown as `PushDownUnSupportedException`. The `return` after the setter call should be outside the catch block to avoid this confusion. Restructure so the throw only happens when both intervals fail. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java [297-303] try { calendarIntervalSetter.accept(source.getIntervalAsCalendar()); - return; } catch (IllegalArgumentException \| IllegalStateException ignored) { throw new OpenSearchRequestBuilder.PushDownUnSupportedException( "Cannot copy interval for date histogram bucket " + source.name()); } Suggestion importance[1-10]: 5 __ Why: The suggestion correctly identifies that the `return` statement inside the try block after `calendarIntervalSetter.accept()` is redundant/confusing, and the improved code removes it. However, the logic is functionally equivalent since the `return` after the setter is only reached if no exception is thrown, and the catch block only executes on exception. The improved code is slightly cleaner but the original is not actually broken.	Low
	Avoid repeated aggregation rebuild on each request creation The `aggSpec.build()` call in `createRequestBuilder()` re-runs `AggregateAnalyzer.analyze` and replays all push-down operations every time a request builder is created. If `createRequestBuilder()` is called multiple times (e.g., for retries or pagination), this could cause performance issues or subtle bugs if the analysis is non-deterministic. Consider caching the result of `aggSpec.build()` or ensuring it is only called once. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/PushDownContext.java [213-216] if (aggSpec != null) { - newRequestBuilder.pushDownAggregation(aggSpec.build()); + Pair<List<AggregationBuilder>, OpenSearchAggregationResponseParser> builtAgg = aggSpec.build(); + newRequestBuilder.pushDownAggregation(builtAgg); newRequestBuilder.pushTypeMapping(aggSpec.getExtendedTypeMapping()); } Suggestion importance[1-10]: 4 __ Why: The suggestion raises a valid concern about `aggSpec.build()` being called multiple times, potentially causing performance issues. However, the `improved_code` only extracts the result to a local variable within the same method call, which doesn't actually cache it across multiple `createRequestBuilder()` invocations, making the fix incomplete for the stated concern.	Low
	Ensure plugin setting reset runs first in teardown The teardown disables the Calcite plugin after deleting the index, but if the index deletion fails, the plugin setting may not be reset. Consider placing the settings reset before the index deletion to ensure the plugin is always disabled regardless of index deletion success. integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5125.yml [38-47] teardown: - - do: - indices.delete: - index: issue5125 - ignore_unavailable: true - do: query.settings: body: transient: plugins.calcite.enabled: false + - do: + indices.delete: + index: issue5125 + ignore_unavailable: true Suggestion importance[1-10]: 4 __ Why: The suggestion has merit as resetting the `plugins.calcite.enabled` setting before deleting the index ensures cleanup happens regardless of index deletion success. However, since `ignore_unavailable: true` is already set on the index deletion, failures are already handled gracefully, making this a minor ordering improvement rather than a critical fix.	Low
	Document ordering constraint for withoutBucketSort usage In `withoutBucketSort()`, the `bucketNames` field is reset to `initialBucketNames`, which correctly reverts the bucket ordering. However, `measureSortCollations` and `measureSortFieldNames` are preserved in the new `AggSpec`. If a measure sort was applied after a bucket sort, removing the bucket sort should not affect the measure sort, so this is fine. But the `kind` field is not reset — if `withSortMeasure` was called and changed `kind` to `measureSortTarget`, calling `withoutBucketSort` afterward would retain the changed kind. Verify that `withoutBucketSort` is only called before `withSortMeasure` in the planning pipeline, or reset `kind` appropriately. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java [223-249] public AggSpec withoutBucketSort() { if (bucketSortCollations == null) { return this; } + // Note: kind is not reset here; withoutBucketSort should only be called + // before withSortMeasure to avoid stale kind state. return new AggSpec( - ... + aggregate, + project, + outputFields, + rowType, + fieldTypes, + cluster, + bucketNullable, + queryBucketSize, + extendedTypeMapping, initialBucketNames, initialBucketNames, - ... + scriptCount, + kind, + measureSortTarget, + rareTopSupported, + null, + null, + measureSortCollations, + measureSortFieldNames, + rareTopDigest, + bucketSize); +} Suggestion importance[1-10]: 2 __ Why: The suggestion only adds a comment to the existing code without changing any logic, and the `improved_code` is essentially the same as the `existing_code` with a comment added. This is a documentation-only suggestion with minimal impact.	Low

Suggestions up to commit 7df5f94

Category	Suggestion	Impact
General	Clear bucket sort when transitioning to rare-top aggregation In `withRareTop`, the new `AggSpec` is constructed with `kind` set to `AggKind.RARE_TOP` and `measureSortTarget` set to `null`. However, `bucketSortCollations` and `bucketSortFieldNames` are preserved from the current spec. If a bucket sort was previously applied, it would be carried over into the rare-top spec and re-applied during `build()`. This may produce incorrect query behavior since rare-top aggregations don't support composite bucket sort ordering. Consider clearing `bucketSortCollations` and `bucketSortFieldNames` when transitioning to `RARE_TOP`. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java [285-311] public AggSpec withRareTop(RareTopDigest digest) { if (kind != AggKind.COMPOSITE \|\| !rareTopSupported) { throw new OpenSearchRequestBuilder.PushDownUnSupportedException("Cannot pushdown " + digest); } return new AggSpec( - ... + aggregate, + project, + outputFields, + rowType, + fieldTypes, + cluster, + bucketNullable, + queryBucketSize, + extendedTypeMapping, + initialBucketNames, + bucketNames, + scriptCount, + AggKind.RARE_TOP, + null, + rareTopSupported, + null, // clear bucket sort collations for rare-top + null, // clear bucket sort field names for rare-top + measureSortCollations, + measureSortFieldNames, + digest, digest.byList().isEmpty() ? digest.number() : DEFAULT_MAX_BUCKETS); } Suggestion importance[1-10]: 5 __ Why: The suggestion raises a valid concern that `bucketSortCollations` and `bucketSortFieldNames` are preserved when transitioning to `RARE_TOP`, which could cause incorrect behavior during `build()` since rare-top aggregations don't support composite bucket sort. However, the actual code in `withRareTop` does pass `bucketSortCollations` and `bucketSortFieldNames` through, so this is a real potential issue worth addressing.	Low
	Set aggSpec before adding operations in clone The `clone()` method iterates over `this` (which uses the `queue` iterator) and calls `newContext.add(operation)` for each operation. The `add` method calls `operation.action().pushOperation(this, operation)` which may have side effects on `newContext` (e.g., setting `isAggregatePushed`, updating `startFrom`). However, `aggSpec` is set after all operations are added, which means if any operation's `pushOperation` tries to access `aggSpec` on `newContext`, it would be null. Since `aggSpec` is now set separately from the AGGREGATION operation's `add()` call (the action is a no-op lambda), this ordering should be safe. But it's fragile — consider setting `aggSpec` before adding operations or documenting this dependency. opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/PushDownContext.java [51-58] @Override public PushDownContext clone() { PushDownContext newContext = new PushDownContext(osIndex); + newContext.aggSpec = aggSpec; // Set before adding operations to avoid null access for (PushDownOperation operation : this) { newContext.add(operation); } - newContext.aggSpec = aggSpec; return newContext; } Suggestion importance[1-10]: 4 __ Why: The suggestion to set `aggSpec` before iterating operations in `clone()` is a valid defensive improvement to avoid potential null access if any `pushOperation` side effect ever accesses `aggSpec`. However, since the AGGREGATION action is a no-op lambda in the new design, the risk is currently low, making this a minor maintainability improvement.	Low
	Fix teardown order to reset settings before deleting index The teardown disables the Calcite plugin, but if the test fails mid-execution, the plugin setting may not be reset. Consider also adding the settings reset to the `setup` block or ensuring the teardown order is correct (delete index first, then reset settings) to avoid leaving the cluster in a bad state for subsequent tests. integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5125.yml [38-47] teardown: - - do: - indices.delete: - index: issue5125 - ignore_unavailable: true - do: query.settings: body: transient: plugins.calcite.enabled: false + - do: + indices.delete: + index: issue5125 + ignore_unavailable: true Suggestion importance[1-10]: 3 __ Why: The suggestion to reorder teardown steps (reset settings before deleting index) is a minor improvement for test hygiene, but the original order (delete index first, then reset settings) is also valid and commonly used. The teardown block runs regardless of test failure in YAML REST tests, so the concern about "failing mid-execution" is less critical here.	Low

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-03-06T06:40:49Z

Persistent review updated to latest commit 21f8940

LantaoJin · 2026-03-10T06:29:52Z

cc @qianheng-aws as proper reviewer

qianheng-aws

Have you ever tried lazy push-down for AggregationBuilderAction? If that approach can work, we don't need to create AggregationBuilder in the planning phase, neither need to do deep copy for AggregationBuilder then.

qianheng-aws · 2026-03-12T08:28:00Z

...arch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java

+    if (builder instanceof NestedAggregationBuilder nested) {
+      return copyNestedAggregationBuilder(nested);
+    }
+    return builder;


What if there is other missed or upcoming agg builder?

qianheng-aws · 2026-03-12T08:30:55Z

...arch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java

+    }
+  }
+
+  private static final class TermsAggregationBuilderCopy extends TermsAggregationBuilder {


Can this be replaced with

source.shallowCopy(source, copySubAggregations(source), copyMetadataOrNull(source))

instead of defining these extended class?

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-03-16T09:11:34Z

Persistent review updated to latest commit 9fe5192

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-03-17T03:04:28Z

Persistent review updated to latest commit 7df5f94

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-03-17T03:14:09Z

Persistent review updated to latest commit f36c20a

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-03-20T09:07:03Z

Persistent review updated to latest commit 84892bc

github-actions · 2026-03-23T06:50:11Z

Persistent review updated to latest commit 11db491

github-actions · 2026-03-23T08:05:17Z

Persistent review updated to latest commit af6c695

songkant-aws · 2026-03-23T08:07:03Z

@qianheng-aws Now the fix is changed to lazily push down agg request at the createRequest phase. But it introduces some intermediate immutable spec object to record any logical pushdown operations. It touches more logic now but I think it could mitigate the current problem of AggPushDownAction's mutation during planning.

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-03-23T08:21:46Z

Persistent review updated to latest commit 05a2ed2

qianheng-aws

As commented here:https://github.com/opensearch-project/sql/pull/5199/changes#r2980342004, if the AggPushDownAction can be reusable, plenty of params can be removed, which are maintained all the way but only for creating AggPushDownAction in the end in build() method.

qianheng-aws · 2026-03-24T09:19:53Z

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java

+
+  private static LimitPushdownMode inferLimitPushdownMode(
+      List<AggregationBuilder> builders, @Nullable AggregationBuilder rootBuilder) {
+    if (builders.isEmpty()) {


Nit: seems no need to pass in param builders here. This can be replaced with rootBuilder==null.

Yes, replace it with rootBuilder == null check now.

qianheng-aws · 2026-03-24T09:21:16Z

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java

+      // Treats leaf metric aggregations as limit-pushable because they produce a single row.
+      return LimitPushdownMode.LEAF_METRIC;
+    }
+    return inferLimitPushdownMode(inferKind(rootBuilder));


Nit: pass in kind to avoid calling inferKind method here, it's already evaluated before calling inferLimitPushdownMode.

qianheng-aws · 2026-03-24T09:25:51Z

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java

+        extendedTypeMapping,
+        bucketNames,
+        bucketNames,
+        new AggPushDownAction(builderAndParser, extendedTypeMapping, bucketNames).getScriptCount(),


It's weird to new a AggPushDownAction here only to get the script count. Replace it with builderAndParser.getLeft().stream().mapToInt(AggPushDownAction::getScriptCount).sum()

Done. It's now calling getScriptCount instead

qianheng-aws · 2026-03-24T09:33:37Z

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java

+        bucketNames,
+        scriptCount,
+        measureSortTarget,
+        inferLimitPushdownMode(measureSortTarget),


Why invoking infer again while measureSortTarget is unchanged here?

This was incorrect. Now it will infer rewriteTarget in local method and based on rewritten AggKind to inferLimitPushdownMode again.

qianheng-aws · 2026-03-24T09:54:46Z

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggSpec.java

+      AggregateAnalyzer.AggregateBuilderHelper helper =
+          new AggregateAnalyzer.AggregateBuilderHelper(
+              rowType, fieldTypes, cluster, bucketNullable, queryBucketSize);
+      Pair<List<AggregationBuilder>, OpenSearchAggregationResponseParser> builderAndParser =


Why should we analyze the aggregate again? Can we reuse the builderAndParser generated when pushing down the aggregate? Even more, we may be able to reuse the AggPushDownAction.

Since we pushdown everything lazily until this build() method, which will only be invoked once. It should be safe to reuse.

As discussed offline. Ideally if we don't care about afterKey rewrite in agg pagination, then the builderAndParser can be reusable. I change the behavior to reuse passed builderAndParser to AggSpec.

However, I reviewed the whole process. I still think we still need AggSpec like object to hold planning stage agg state immutable copies(can't fully AggPushDownAction). AggPushDownAction is used at the final materialization of replayed PushDownOperations for agg. It will be better to separate them into two different objects. During optimizer planning stage, we still need to record some agg state changes in different RelSubsets. One for planning side agg state immutable objects, the other one is for replaying final operations to materialize final AggregationBuilder(lazily generate request builder).

qianheng-aws · 2026-03-24T10:01:15Z

As commented here:https://github.com/opensearch-project/sql/pull/5199/changes#r2980342004, if the AggPushDownAction can be reusable, plenty of params can be removed, which are maintained all the way but only for creating AggPushDownAction in the end in build() method.

As far as I can see, only these params are getting changed in the push down process for agg.

bucketNames, kind, limitPushdownMode, bucketSortCollations, bucketSortFieldNames, digest, bucketSize

Signed-off-by: Songkan Tang <songkant@amazon.com>

github-actions · 2026-03-26T07:31:53Z

Failed to generate code suggestions for PR

songkant-aws · 2026-03-26T09:12:50Z

@qianheng-aws Now I have streamlined the AggSpec. And choose not to reuse AggPushDownAction because we need several immutable objects copies in different RelSubsets to represent different logical agg states per RelSubset. However, replaying pushdown operations to change internal agg parser is a one-time mutable state change, which is better lazily built at last.

Fix PushDownContext shallow copy bug

62df8d2

Signed-off-by: Songkan Tang <songkant@amazon.com>

Fix test issues

21f8940

Signed-off-by: Songkan Tang <songkant@amazon.com>

songkant-aws marked this pull request as ready for review March 9, 2026 07:47

LantaoJin added the bugFix label Mar 10, 2026

qianheng-aws reviewed Mar 12, 2026

View reviewed changes

Refactor aggregation pushdown state to use immutable AggSpec

9fe5192

Signed-off-by: Songkan Tang <songkant@amazon.com>

songkant-aws requested a review from ahkcs as a code owner March 16, 2026 09:08

Fix tests

7df5f94

Signed-off-by: Songkan Tang <songkant@amazon.com>

Fix spotless check

f36c20a

Signed-off-by: Songkan Tang <songkant@amazon.com>

qianheng-aws mentioned this pull request Mar 18, 2026

Add songkant-aws as maintainer #5244

Merged

Keep the pushdown limit logic as before

84892bc

Signed-off-by: Songkan Tang <songkant@amazon.com>

Merge branch 'main' into pushdown-action-shallow-copy-bugfix

11db491

Minor change to reduce unnecesary changes

05a2ed2

Signed-off-by: Songkan Tang <songkant@amazon.com>

songkant-aws force-pushed the pushdown-action-shallow-copy-bugfix branch from af6c695 to 05a2ed2 Compare March 23, 2026 08:19

qianheng-aws reviewed Mar 24, 2026

View reviewed changes

Address comments

c2bd684

Signed-off-by: Songkan Tang <songkant@amazon.com>

Conversation

songkant-aws commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issues

Check List

Uh oh!

github-actions bot commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 05a2ed2)

Uh oh!

github-actions bot commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions bot commented Mar 6, 2026

Uh oh!

LantaoJin commented Mar 10, 2026

Uh oh!

qianheng-aws left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Mar 16, 2026

Uh oh!

github-actions bot commented Mar 17, 2026

Uh oh!

github-actions bot commented Mar 17, 2026

Uh oh!

github-actions bot commented Mar 20, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

songkant-aws commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

qianheng-aws left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

songkant-aws Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

qianheng-aws commented Mar 24, 2026

Uh oh!

github-actions bot commented Mar 26, 2026

Uh oh!

songkant-aws commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

songkant-aws commented Mar 4, 2026 •

edited

Loading

github-actions bot commented Mar 4, 2026 •

edited

Loading

(Review updated until commit `05a2ed2`)

github-actions bot commented Mar 4, 2026 •

edited

Loading

qianheng-aws left a comment •

edited

Loading

songkant-aws Mar 26, 2026 •

edited

Loading

songkant-aws commented Mar 26, 2026 •

edited

Loading