Add AnalyticsExecutionEngine and query routing for Parquet-backed indices by ahkcs · Pull Request #5258 · opensearch-project/sql

ahkcs · 2026-03-23T23:30:00Z

Summary

Implements the execution engine adapter and query routing infrastructure for Project Mustang's unified query pipeline (#5247).

QueryPlanExecutor — @FunctionalInterface contract for analytics engine plan execution
AnalyticsExecutionEngine — Implements ExecutionEngine, bridges QueryPlanExecutor output to QueryResponse pipeline
RestUnifiedQueryAction — Orchestrates the analytics query path: routing, planning via UnifiedQueryPlanner, execution, response formatting on sql-worker thread pool
RestPPLQueryAction — Routing branch: parquet_ prefixed indices → analytics path; all others → existing Lucene path
StubQueryPlanExecutor — Canned data for parquet_logs and parquet_metrics tables for development/testing
AnalyticsPPLIT — Integration tests validating full pipeline end-to-end

Test plan

12 unit tests for AnalyticsExecutionEngine (type mapping, row conversion, query size limit, null handling, error propagation, explain)
10 unit tests for RestUnifiedQueryAction (index name extraction, routing detection)
6 integration tests (AnalyticsPPLIT) — full pipeline with stub executor against real test cluster
Full regression test suite

…ices (opensearch-project#5247) Implements the execution engine adapter and query routing infrastructure for Project Mustang's unified query pipeline (Option B). PPL queries targeting parquet_ prefixed indices are routed through UnifiedQueryPlanner and AnalyticsExecutionEngine instead of the existing Lucene path. Key changes: - QueryPlanExecutor interface: contract for analytics engine execution - AnalyticsExecutionEngine: converts QueryPlanExecutor results to QueryResponse - RestUnifiedQueryAction: orchestrates schema building, planning, execution - RestPPLQueryAction: routing branch for parquet_ indices - StubQueryPlanExecutor: canned data for development/testing Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>

github-actions · 2026-03-23T23:31:16Z

PR Reviewer Guide 🔍

(Review updated until commit `ac9ca64`)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
📝 TODO sections plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [129]: Replace stub schema with EngineContext.getSchema() when analytics engine is ready
🔀 Multiple PR themes Sub-PR theme: Core AnalyticsExecutionEngine and QueryPlanExecutor interface Relevant files: core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java core/src/main/java/org/opensearch/sql/executor/analytics/QueryPlanExecutor.java core/src/test/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngineTest.java Sub-PR theme: Plugin-level routing, stub executor, and integration tests Relevant files: plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java plugin/src/test/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryActionTest.java integ-test/src/test/java/org/opensearch/sql/ppl/AnalyticsPPLIT.java
⚡ Recommended focus areas for review JSON Injection The `reportError` method manually constructs a JSON string by embedding `e.getClass().getSimpleName()` directly without sanitization. A class name could theoretically contain characters that break JSON structure. More critically, the manual escaping of `reason` is incomplete — it does not escape tab characters (`\t`) or other control characters (Unicode control chars), which can produce invalid JSON. Use a proper JSON serializer (e.g., Jackson or Gson) instead of string concatenation. private static void reportError(RestChannel channel, Exception e) { RestStatus status = isClientError(e) ? RestStatus.BAD_REQUEST : RestStatus.INTERNAL_SERVER_ERROR; String reason = e.getMessage() != null ? e.getMessage() : "Unknown error"; // Escape characters that would break JSON reason = reason.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n").replace("\r", ""); channel.sendResponse( new BytesRestResponse( status, "application/json; charset=UTF-8", "{\"error\":{\"type\":\"" + e.getClass().getSimpleName() + "\",\"reason\":\"" + reason + "\"},\"status\":" + status.getStatus() + "}")); } NullPointerException as Client Error `isClientError` classifies `NullPointerException` as a client error (HTTP 400). NPEs are almost always programming bugs (server errors), not malformed user input. This will mislead users and hide server-side bugs by returning 400 instead of 500. private static boolean isClientError(Exception e) { return e instanceof SyntaxCheckException \|\| e instanceof SemanticCheckException \|\| e instanceof IllegalArgumentException \|\| e instanceof NullPointerException; } Stub in Production Code `StubQueryPlanExecutor` is placed in `plugin/src/main/java` (production source set) and wired directly into `RestPPLQueryAction` via `new StubQueryPlanExecutor()`. This means stub/canned data ships in the production plugin artifact. It should be in a test source set or behind a feature flag to prevent accidental exposure in production deployments. public class StubQueryPlanExecutor implements QueryPlanExecutor { @Override public Iterable<Object[]> execute(RelNode plan, Object context) { // Return canned rows matching the stub schema defined in RestUnifiedQueryAction. // The column order must match the schema: ts, status, message, ip_addr // (for parquet_logs table). For other tables, return empty results. String tableName = extractTableName(plan); if (tableName != null && tableName.contains("parquet_logs")) { return List.of( new Object[] { Instant.parse("2024-01-15T10:30:00Z"), 200, "Request completed", "192.168.1.1" }, new Object[] { Instant.parse("2024-01-15T10:31:00Z"), 200, "Health check OK", "192.168.1.2" }, new Object[] { Instant.parse("2024-01-15T10:32:00Z"), 500, "Internal server error", "192.168.1.3" }); } if (tableName != null && tableName.contains("parquet_metrics")) { return List.of( new Object[] {Instant.parse("2024-01-15T10:30:00Z"), 75.5, 8192.5, "host-1"}, new Object[] {Instant.parse("2024-01-15T10:31:00Z"), 82.3, 7680.5, "host-2"}); } return List.of(); } private String extractTableName(RelNode plan) { // Use RelOptUtil.toString to get the full plan tree including child nodes String planStr = RelOptUtil.toString(plan); if (planStr.contains("parquet_logs")) { return "parquet_logs"; } if (planStr.contains("parquet_metrics")) { return "parquet_metrics"; } return null; } Routing Bypass Risk `isUnifiedQueryPath` uses a simple regex on the raw query string to detect `parquet_` prefix. A query like `source = my_index \| where message = 'parquet_logs'` would not be misrouted (the regex matches the source clause), but a query like `source = parquet_logs_backup` would be routed to the analytics engine even if it's a regular Lucene index. The prefix check is applied to the full table name segment, so `parquet_logs_backup` would match. This could cause unexpected failures for any Lucene index whose name starts with `parquet_`. public static boolean isUnifiedQueryPath(String query) { if (query == null) { return false; } String indexName = extractIndexName(query); if (indexName == null) { return false; } // Handle qualified names like "catalog.parquet_logs" — check the last segment int lastDot = indexName.lastIndexOf('.'); String tableName = lastDot >= 0 ? indexName.substring(lastDot + 1) : indexName; return tableName.startsWith("parquet_"); } Null Context Passed In `execute(RelNode, CalcitePlanContext, ResponseListener)`, `planExecutor.execute(plan, null)` always passes `null` as the execution context. The `QueryPlanExecutor` interface documents the second parameter as "execution context (opaque to avoid server dependency)". Passing `null` unconditionally means the real analytics engine (when substituted) will receive no context, likely causing a NullPointerException or incorrect behavior at runtime. Iterable<Object[]> rows = planExecutor.execute(plan, null);

github-actions · 2026-03-23T23:31:53Z

PR Code Suggestions ✨

Latest suggestions up to ac9ca64

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Remove NullPointerException from client error classification `NullPointerException` is a programming error (server bug), not a client error. Classifying it as a client error will return HTTP 400 for internal null pointer bugs, masking server-side issues and making debugging harder. Remove `NullPointerException` from the client error classification. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [262-267] private static boolean isClientError(Exception e) { return e instanceof SyntaxCheckException \|\| e instanceof SemanticCheckException - \|\| e instanceof IllegalArgumentException - \|\| e instanceof NullPointerException; + \|\| e instanceof IllegalArgumentException; } Suggestion importance[1-10]: 7 __ Why: `NullPointerException` is a programming/server error, not a client error. Classifying it as a client error (HTTP 400) would mask server-side bugs and make debugging harder. This is a valid correctness concern.	Medium
Security	Fix incomplete JSON escaping in error responses Manual JSON string construction is fragile and prone to injection if the error message contains characters like `<`, `>`, or tab characters that are not escaped. The current escaping also strips `\r` entirely rather than escaping it, which is inconsistent. Use a proper JSON serialization approach or at minimum also escape tab characters (`\t`) and other control characters. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [284-296] -reason = - reason.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n").replace("\r", ""); +reason = reason + .replace("\\", "\\\\") + .replace("\"", "\\\"") + .replace("\n", "\\n") + .replace("\r", "\\r") + .replace("\t", "\\t"); channel.sendResponse( new BytesRestResponse( status, "application/json; charset=UTF-8", "{\"error\":{\"type\":\"" + e.getClass().getSimpleName() + "\",\"reason\":\"" + reason + "\"},\"status\":" + status.getStatus() + "}")); Suggestion importance[1-10]: 5 __ Why: The manual JSON construction has incomplete escaping (missing `\t` and `\r` is stripped rather than escaped), which could cause malformed JSON. However, the suggestion only adds `\t` escaping and fixes `\r`, which is a minor improvement; a proper JSON library would be a better fix.	Low
General	Fix substring matching order to avoid incorrect table routing The check for `"parquet_logs"` will also match `"parquet_logs_archive"` or any other table name that contains `"parquet_logs"` as a substring, returning incorrect stub data. Use exact matching or check for `"parquet_metrics"` before `"parquet_logs"` since `"parquet_metrics"` does not contain `"parquet_logs"`, but the reverse order is still fragile. plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java [49-59] private String extractTableName(RelNode plan) { // Use RelOptUtil.toString to get the full plan tree including child nodes String planStr = RelOptUtil.toString(plan); + if (planStr.contains("parquet_metrics")) { + return "parquet_metrics"; + } if (planStr.contains("parquet_logs")) { return "parquet_logs"; - } - if (planStr.contains("parquet_metrics")) { - return "parquet_metrics"; } return null; } Suggestion importance[1-10]: 4 __ Why: The current order checks `parquet_logs` before `parquet_metrics`, which is fine since `parquet_metrics` doesn't contain `parquet_logs` as a substring. However, reordering to check `parquet_metrics` first is a defensive improvement for a stub class, though the practical impact is low.	Low

Previous suggestions

Suggestions up to commit ccd1665

Category	Suggestion	Impact
Possible issue	Remove NullPointerException from client error classification `NullPointerException` should not be classified as a client error — it is a programming bug (server error), not a user-caused bad query. Including it here will cause server-side bugs to be silently reported as 400 Bad Request, masking real issues. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [262-267] private static boolean isClientError(Exception e) { return e instanceof SyntaxCheckException \|\| e instanceof SemanticCheckException - \|\| e instanceof IllegalArgumentException - \|\| e instanceof NullPointerException; + \|\| e instanceof IllegalArgumentException; } Suggestion importance[1-10]: 7 __ Why: `NullPointerException` is a programming bug (server error), not a user-caused bad query. Classifying it as a client error will mask real server-side bugs by returning 400 instead of 500, making debugging harder.	Medium
Possible issue	Guard thread pool scheduling against uncaught exceptions If `SQL_WORKER_THREAD_POOL_NAME` does not exist at the time of scheduling (e.g., during startup or in tests), `schedule` may throw an unchecked exception that is not caught, leaving the REST channel without a response and potentially hanging the client. Wrap the scheduling call in a try-catch to ensure the channel always receives a response. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [116-122] -client - .threadPool() - .schedule( - () -> doExecute(query, queryType, channel, isExplain), - new TimeValue(0), - SQL_WORKER_THREAD_POOL_NAME); +try { + client + .threadPool() + .schedule( + () -> doExecute(query, queryType, channel, isExplain), + new TimeValue(0), + SQL_WORKER_THREAD_POOL_NAME); +} catch (Exception e) { + recordFailureMetric(e); + reportError(channel, e); +} Suggestion importance[1-10]: 5 __ Why: If `schedule` throws (e.g., unknown thread pool name), the channel would hang without a response. Wrapping in try-catch is a reasonable defensive measure, though this scenario is unlikely in normal operation.	Low
Security	Fix incomplete JSON escaping in error response Manual JSON string construction via concatenation is fragile — it does not handle all special characters (e.g., tab `\t`, null bytes, Unicode control characters), which can produce invalid JSON and potentially expose injection vectors. Use a proper JSON serialization approach or at minimum add escaping for tab and other control characters. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [284-296] reason = - reason.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n").replace("\r", ""); + reason.replace("\\", "\\\\") + .replace("\"", "\\\"") + .replace("\n", "\\n") + .replace("\r", "\\r") + .replace("\t", "\\t"); channel.sendResponse( new BytesRestResponse( status, "application/json; charset=UTF-8", "{\"error\":{\"type\":\"" + e.getClass().getSimpleName() + "\",\"reason\":\"" + reason + "\"},\"status\":" + status.getStatus() + "}")); Suggestion importance[1-10]: 5 __ Why: The manual JSON construction misses tab and other control characters, which could produce invalid JSON. Adding `\t` escaping and ideally using a proper JSON library would be more robust, though the security risk is limited since the content comes from exception messages.	Low
General	Fix substring matching order to avoid wrong stub data `"parquet_logs"` is a substring of `"parquet_logs_archive"` or similar names, and `"parquet_metrics"` check order matters — but more critically, `"parquet_logs"` will also match any plan string that contains `"parquet_logs"` as part of a longer table name (e.g., `parquet_logs_v2`), returning wrong stub data. Use word-boundary or exact matching to avoid false positives. plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java [49-59] private String extractTableName(RelNode plan) { - // Use RelOptUtil.toString to get the full plan tree including child nodes String planStr = RelOptUtil.toString(plan); + if (planStr.contains("parquet_metrics")) { + return "parquet_metrics"; + } if (planStr.contains("parquet_logs")) { return "parquet_logs"; - } - if (planStr.contains("parquet_metrics")) { - return "parquet_metrics"; } return null; } Suggestion importance[1-10]: 4 __ Why: Checking `"parquet_metrics"` before `"parquet_logs"` avoids false positives since `"parquet_logs"` is not a substring of `"parquet_metrics"`, but the real concern about `"parquet_logs"` matching `"parquet_logs_v2"` is valid. However, this is a stub for development/testing only, so the impact is limited.	Low

Suggestions up to commit 10d08f2

Category	Suggestion	Impact
Possible issue	Remove NullPointerException from client error classification `NullPointerException` is a programming error (server bug), not a client error. Including it in `isClientError` will cause server-side null pointer bugs to be silently reported as 400 Bad Request, masking real issues and making debugging harder. Remove it from the client error classification. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [262-267] private static boolean isClientError(Exception e) { return e instanceof SyntaxCheckException \|\| e instanceof SemanticCheckException - \|\| e instanceof IllegalArgumentException - \|\| e instanceof NullPointerException; + \|\| e instanceof IllegalArgumentException; } Suggestion importance[1-10]: 7 __ Why: `NullPointerException` is a programming/server error, not a client error. Classifying it as a client error would return 400 Bad Request for server-side bugs, masking real issues. This is a valid correctness concern.	Medium
Possible issue	Fix table name matching order to avoid substring false positives The check for `"parquet_logs"` will also match `"parquet_logs_archive"` or any other table whose name contains `"parquet_logs"` as a substring, returning wrong stub data. Additionally, `"parquet_logs"` is checked before `"parquet_metrics"`, but since `"parquet_metrics"` does not contain `"parquet_logs"`, the ordering is coincidentally safe — however, the substring matching is still fragile. Use word-boundary or exact-match checks to avoid false positives. plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java [49-59] private String extractTableName(RelNode plan) { - // Use RelOptUtil.toString to get the full plan tree including child nodes String planStr = RelOptUtil.toString(plan); + if (planStr.contains("parquet_metrics")) { + return "parquet_metrics"; + } if (planStr.contains("parquet_logs")) { return "parquet_logs"; - } - if (planStr.contains("parquet_metrics")) { - return "parquet_metrics"; } return null; } Suggestion importance[1-10]: 3 __ Why: The suggestion reorders checks to put `parquet_metrics` before `parquet_logs`, but since `parquet_metrics` doesn't contain `parquet_logs` as a substring, the original ordering is already safe. The improvement is minimal and the stub is explicitly temporary code.	Low
General	Fix incomplete JSON escaping in error responses Manual JSON string construction via concatenation is fragile and prone to injection if the error message contains characters like `<`, `>`, or tab characters that are not escaped. Use a proper JSON serialization approach or at minimum add escaping for all control characters and special JSON characters to prevent malformed JSON responses. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [284-296] -reason = - reason.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n").replace("\r", ""); - channel.sendResponse( - new BytesRestResponse( - status, - "application/json; charset=UTF-8", - "{\"error\":{\"type\":\"" - + e.getClass().getSimpleName() - + "\",\"reason\":\"" - + reason - + "\"},\"status\":" - + status.getStatus() - + "}")); +reason = reason.replace("\\", "\\\\") + .replace("\"", "\\\"") + .replace("\n", "\\n") + .replace("\r", "\\r") + .replace("\t", "\\t"); +channel.sendResponse( + new BytesRestResponse( + status, + "application/json; charset=UTF-8", + "{\"error\":{\"type\":\"" + + e.getClass().getSimpleName() + + "\",\"reason\":\"" + + reason + + "\"},\"status\":" + + status.getStatus() + + "}")); Suggestion importance[1-10]: 4 __ Why: The suggestion adds `\t` escaping to the manual JSON construction, which is a minor improvement. However, the core issue of using manual JSON concatenation remains, and the improvement is marginal since tab characters in error messages are uncommon.	Low
General	Document intentional null context passed to plan executor Passing `null` as the execution context to `planExecutor.execute` is hardcoded and ignores any context information available in `CalcitePlanContext`. When the real analytics engine replaces the stub, this will silently pass no context, potentially causing incorrect behavior or NullPointerExceptions inside the executor. Consider passing a meaningful context object derived from `context`, or at minimum document why `null` is intentional here. core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [72] +// TODO: Pass a real execution context derived from CalcitePlanContext when analytics engine is ready Iterable<Object[]> rows = planExecutor.execute(plan, null); Suggestion importance[1-10]: 1 __ Why: This suggestion only adds a comment/TODO to existing code without changing behavior. The `improved_code` is functionally identical to the `existing_code`, so it scores very low per the guidelines.	Low

Suggestions up to commit 3b1ecb8

Category	Suggestion	Impact
Security	Fix unsafe manual JSON string construction The error message is manually escaped only for double quotes, but other special characters (backslashes, newlines, control characters) in the exception message can break the JSON structure or cause injection issues. Use a proper JSON serialization library (e.g., Jackson's `ObjectMapper` or `org.json.JSONObject`) to safely serialize the error message. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [263-273] private static void reportError(RestChannel channel, Exception e) { + String safeMessage = e.getMessage() != null ? e.getMessage() : "Unknown error"; + org.json.JSONObject errorJson = new org.json.JSONObject(); + org.json.JSONObject errorDetail = new org.json.JSONObject(); + errorDetail.put("type", e.getClass().getSimpleName()); + errorDetail.put("reason", safeMessage); + errorJson.put("error", errorDetail); + errorJson.put("status", 500); channel.sendResponse( new BytesRestResponse( org.opensearch.core.rest.RestStatus.INTERNAL_SERVER_ERROR, "application/json; charset=UTF-8", - "{\"error\":{\"type\":\"" - + e.getClass().getSimpleName() - + "\",\"reason\":\"" - + (e.getMessage() != null ? e.getMessage().replace("\"", "\\\"") : "Unknown error") - + "\"},\"status\":500}")); + errorJson.toString())); } Suggestion importance[1-10]: 7 __ Why: The manual JSON construction only escapes double quotes, leaving the response vulnerable to malformed JSON if the exception message contains backslashes, newlines, or other special characters. Using a proper JSON library would be safer and more robust.	Medium
Possible issue	Replace fragile string-matching with proper RelNode tree traversal Using `RelOptUtil.toString(plan)` to extract the table name by string-matching the plan's text representation is fragile and error-prone — it can produce false positives if a table name appears in a filter literal or column alias. The table name should be extracted by traversing the `RelNode` tree and inspecting `TableScan` nodes directly via `RelNode.getTable()`. plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java [49-59] private String extractTableName(RelNode plan) { - // Use RelOptUtil.toString to get the full plan tree including child nodes - String planStr = RelOptUtil.toString(plan); - if (planStr.contains("parquet_logs")) { - return "parquet_logs"; + // Walk the RelNode tree to find a TableScan and extract its table name + if (plan instanceof org.apache.calcite.rel.core.TableScan) { + java.util.List<String> qualifiedName = plan.getTable().getQualifiedName(); + if (!qualifiedName.isEmpty()) { + String tableName = qualifiedName.get(qualifiedName.size() - 1); + if (tableName.contains("parquet_logs")) return "parquet_logs"; + if (tableName.contains("parquet_metrics")) return "parquet_metrics"; + } } - if (planStr.contains("parquet_metrics")) { - return "parquet_metrics"; + for (RelNode input : plan.getInputs()) { + String found = extractTableName(input); + if (found != null) return found; } return null; } Suggestion importance[1-10]: 5 __ Why: Using `RelOptUtil.toString(plan)` for table name extraction is fragile and can produce false positives if table names appear in literals or aliases. The suggested tree traversal approach is more correct, though this is a stub class intended to be replaced.	Low
Possible issue	Pass execution context instead of null to plan executor Passing `null` as the execution context to `planExecutor.execute` is undocumented and may cause a `NullPointerException` in the executor implementation. The `CalcitePlanContext` (`context`) is already available in scope and should be passed (or a derived context object) to allow the executor to access query settings, credentials, or other runtime information. core/src/main/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngine.java [72] -Iterable<Object[]> rows = planExecutor.execute(plan, null); +Iterable<Object[]> rows = planExecutor.execute(plan, context); Suggestion importance[1-10]: 4 __ Why: Passing `null` as the context is intentional per the `QueryPlanExecutor` interface design (context is "opaque to avoid server dependency"), but passing the available `context` object could provide useful runtime information. However, the interface accepts `Object` and the current design explicitly uses `null`, so this change may not be appropriate without further interface design decisions.	Low
General	Extract hardcoded routing prefix into a named constant The routing check `tableName.startsWith("parquet_")` is a fragile, hardcoded convention that will silently misroute queries if the naming convention changes or if a legitimate non-Parquet index happens to start with "parquet_". This logic should be driven by index metadata (e.g., index settings) rather than a name prefix, and at minimum should be extracted into a named constant or configurable predicate to make the coupling explicit. plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java [76-88] +/** Prefix used to identify Parquet-backed indices. Replace with metadata lookup when available. */ +static final String PARQUET_INDEX_PREFIX = "parquet_"; + public static boolean isUnifiedQueryPath(String query) { if (query == null) { return false; } String indexName = extractIndexName(query); if (indexName == null) { return false; } // Handle qualified names like "catalog.parquet_logs" — check the last segment int lastDot = indexName.lastIndexOf('.'); String tableName = lastDot >= 0 ? indexName.substring(lastDot + 1) : indexName; - return tableName.startsWith("parquet_"); + return tableName.startsWith(PARQUET_INDEX_PREFIX); } Suggestion importance[1-10]: 3 __ Why: Extracting `"parquet_"` into a named constant improves maintainability, but the suggestion itself acknowledges this is a temporary stub mechanism. The improvement is minor given the temporary nature of the code.	Low

github-actions · 2026-03-23T23:43:11Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 10d08f2.

Path	Line	Severity	Description
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	271	medium	NullPointerException is classified as a client error (4xx) alongside SyntaxCheckException. NPEs are programming bugs, not client errors. This classification could suppress legitimate server-side errors, masking failures and potentially allowing information disclosure about internal state to be treated as benign client faults.
plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java	30	low	StubQueryPlanExecutor completely ignores the actual Calcite plan and returns hardcoded rows based solely on table name string matching. If this stub remains in production, any query to parquet_* tables bypasses actual query evaluation (filters, access controls embedded in the plan are ignored). The code is clearly marked as a stub, but its presence in the plugin's main source path (not test) warrants attention.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java	95	low	Queries matching parquet_* index prefix are routed to a new execution pipeline that bypasses the standard PPLQueryAction transport action path. If the standard path performs security checks (e.g., authorization, audit logging) that the new path does not replicate, this routing could inadvertently skip those controls. This appears intentional by design but warrants review of the security controls in both paths.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 1 | Low: 2

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-23T23:44:16Z

Persistent review updated to latest commit 10d08f2

- Classify client vs server errors in RestUnifiedQueryAction (syntax errors return 400, engine failures return 500) - Fix JSON escaping in error responses for special characters - Add integration tests: explain with filter/aggregation, syntax error handling, response format validation, regression checks - Increment correct metrics (PPL_FAILED_REQ_COUNT_CUS vs SYS) Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>

github-actions · 2026-03-23T23:54:03Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit ccd1665.

Path	Line	Severity	Description
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java	92	high	Queries targeting 'parquet_' prefixed indices are silently routed to RestUnifiedQueryAction, bypassing the normal PPLQueryAction pipeline. If the standard pipeline enforces authorization, row-level security, or audit logging, this routing logic creates a bypass for any index prefixed with 'parquet_'. The check is purely string-based on the query text, which an attacker controlling index naming could exploit.
plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java	1	medium	A stub/development executor is wired into the production code path in SQLPlugin.java and RestPPLQueryAction. This stub uses RelOptUtil.toString(plan) to make routing decisions and returns hardcoded data. Deploying stub code in production is anomalous and could mask the real executor with predictable, controllable behavior.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	262	medium	NullPointerException is classified as a client error, causing it to increment PPL_FAILED_REQ_COUNT_CUS instead of the server-side failure metric. This suppresses server-side error visibility and could allow server bugs to appear as client-caused failures, hampering detection of exploitation attempts or internal failures.
core/src/test/java/org/opensearch/sql/executor/analytics/AnalyticsExecutionEngineTest.java	61	low	Test code uses reflection with field.setAccessible(true) to set a final field on a mocked object. While a common testing pattern, this bypasses Java access controls and could be misused if test utilities are accidentally included in production artifacts.

The table above displays the top 10 most important findings.

Total: 4 | Critical: 0 | High: 1 | Medium: 2 | Low: 1

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-23T23:56:58Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit ccd1665.

Path	Line	Severity	Description
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java	92	medium	Queries targeting 'parquet_*' indices are silently routed away from the standard nodeClient.execute(PPLQueryAction.INSTANCE, ...) transport pipeline. Any authorization, auditing, or security controls implemented in that transport action are bypassed for this new code path. Legitimate for a new engine integration, but warrants review of what checks are skipped.
plugin/src/main/java/org/opensearch/sql/plugin/rest/StubQueryPlanExecutor.java	1	medium	StubQueryPlanExecutor is in the production source tree (plugin/src/main/java), not in test code. It returns hardcoded data for any 'parquet_*' query regardless of query predicates, filters, or access scope. Deploying this to production would serve fabricated data to users querying those indices and suppress real query errors. File is labeled as temporary, but it is wired into the live request path via RestPPLQueryAction.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	272	low	NullPointerException is classified as a client error in isClientError(), causing it to be logged at WARN level, counted as a user-caused failure metric, and returned as HTTP 400. NPEs are almost always server-side bugs; misclassifying them obscures real server errors and could make server-side failures harder to detect and alert on.

The table above displays the top 10 most important findings.

Total: 3 | Critical: 0 | High: 0 | Medium: 2 | Low: 1

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-23T23:57:51Z

Persistent review updated to latest commit ccd1665

Signed-off-by: Heng Qian <qianheng@amazon.com>

- Parse explain JSON and verify calcite.logical, calcite.physical (null), calcite.extended (null) structure - Verify exact plan operators (LogicalProject, LogicalFilter, LogicalSort, LogicalAggregate) and their contents (column names, filter conditions) - Add sort plan explain test Signed-off-by: Kai Huang <kaihuang@amazon.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>

github-actions · 2026-03-24T18:48:30Z

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit ac9ca64.

Path	Line	Severity	Description
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestUnifiedQueryAction.java	262	low	NullPointerException is classified as a client error, causing NPEs to be logged at WARN level and counted against customer error metrics rather than system error metrics. This masks server-side bugs from operators and monitoring systems, but does not appear intentionally malicious.
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLQueryAction.java	52	low	The ClusterService parameter is accepted in the constructor but never used — only NodeClient is consumed. Passing an unused privileged service object is an anomaly worth noting, though no exploit of it is present in this diff.

The table above displays the top 10 most important findings.

Total: 2 | Critical: 0 | High: 0 | Medium: 0 | Low: 2

Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.

⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions · 2026-03-24T18:49:32Z

Persistent review updated to latest commit ac9ca64

ahkcs added PPL Piped processing language enhancement New feature or request labels Mar 23, 2026

ahkcs marked this pull request as ready for review March 23, 2026 23:31

ahkcs marked this pull request as draft March 23, 2026 23:31

ahkcs force-pushed the feature/mustang-analytics-execution-engine branch from 10d08f2 to ccd1665 Compare March 23, 2026 23:53

qianheng-aws and others added 2 commits March 24, 2026 11:47

Init CLAUDE.md (opensearch-project#5259)

d8c1b6e

Signed-off-by: Heng Qian <qianheng@amazon.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add AnalyticsExecutionEngine and query routing for Parquet-backed indices#5258

Add AnalyticsExecutionEngine and query routing for Parquet-backed indices#5258
ahkcs wants to merge 4 commits intoopensearch-project:feature/mustang-ppl-integrationfrom
ahkcs:feature/mustang-analytics-execution-engine

ahkcs commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 23, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 24, 2026

Uh oh!

github-actions bot commented Mar 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ahkcs commented Mar 23, 2026

Summary

Test plan

Uh oh!

github-actions bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit ac9ca64)

Uh oh!

github-actions bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions bot commented Mar 23, 2026

PR Code Analyzer ❗

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 23, 2026

PR Code Analyzer ❗

Uh oh!

github-actions bot commented Mar 23, 2026

PR Code Analyzer ❗

Uh oh!

github-actions bot commented Mar 23, 2026

Uh oh!

github-actions bot commented Mar 24, 2026

PR Code Analyzer ❗

Uh oh!

github-actions bot commented Mar 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions bot commented Mar 23, 2026 •

edited

Loading

(Review updated until commit `ac9ca64`)

github-actions bot commented Mar 23, 2026 •

edited

Loading