feat(format): add type-aware JSONL output using spanvalue extension points #581
Add --format=jsonl for structured, machine-readable output where each row is a JSON object with column names as keys. JSONL is naturally safe for complex Spanner types like ARRAY<STRUCT<...>> that contain commas, making it ideal for downstream processing with jq, Go's json.Decoder, etc. The implementation follows the existing StreamingFormatter pattern with streaming enabled by default in AUTO mode. Fixes #554 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move the jsontext.Encoder to a struct field instead of creating a new one per WriteRow call. The encoder is initialized once in the constructor and reused for all rows, reducing allocations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use spanvalue's extension points (FormatComplexPlugins, FormatArray, FormatStruct) to produce valid JSON value strings directly, instead of converting to intermediate Go types. This approach:
- Produces proper JSON types: INT64→number, BOOL→boolean, NULL→null, ARRAY→JSON array, STRUCT→JSON object with field names
- Uses RawJSONCell marker (not JSONValueCell with Go values) to signal that cell text is valid JSON
- Adds JSONValues ValueFormatMode for the JSONL format pipeline
- Is structured for easy feedback to spanvalue package later

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve conflicts: keep type-aware JSONL (RawJSONCell, writeValue) over string-only version from squash-merged PR #580. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Leverage structpb.Value.MarshalJSON() for most types instead of handling each type individually. Only INT64 and JSON columns need special handling (StringValue used as number/raw JSON respectively). Also use ValueFmtMode instead of DisplayMode for withRawJSONMarker decision to be consistent with the prepareFormatConfig dispatch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
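The two special cases can be illustrated without the protobuf dependency. The sketch below is an assumption-laden stand-in, not the PR's code: `columnKind` and `formatStringValue` are hypothetical names, and the input string plays the role of the StringValue that carries INT64 and JSON column values on the wire.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// columnKind is a hypothetical stand-in for the Spanner column type code.
type columnKind int

const (
	kindOther columnKind = iota // e.g. STRING: keep as a quoted JSON string
	kindInt64                   // string-encoded integer: emit as a JSON number
	kindJSON                    // string-encoded JSON document: emit as-is
)

// formatStringValue converts a string-encoded value into JSON text.
func formatStringValue(kind columnKind, s string) (string, error) {
	switch kind {
	case kindInt64:
		n, err := strconv.ParseInt(s, 10, 64)
		if err != nil {
			return "", fmt.Errorf("invalid INT64 %q: %w", s, err)
		}
		// Re-format so inputs such as "+5" become valid JSON numbers.
		return strconv.FormatInt(n, 10), nil
	case kindJSON:
		if !json.Valid([]byte(s)) {
			return "", fmt.Errorf("invalid JSON column value %q", s)
		}
		return s, nil
	default:
		b, err := json.Marshal(s)
		if err != nil {
			return "", err
		}
		return string(b), nil
	}
}

func main() {
	for _, in := range []struct {
		kind columnKind
		s    string
	}{{kindInt64, "42"}, {kindJSON, `{"a":1}`}, {kindOther, "hi"}} {
		out, err := formatStringValue(in.kind, in.s)
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```

Validating before passing text through unquoted is a design choice of this sketch; it keeps a malformed value from corrupting the whole output line.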
Summary of Changes
This pull request enhances the JSONL output format to be type-aware, ensuring that Spanner values are represented with their correct JSON data types rather than as generic strings. This change significantly improves the utility of JSONL output for downstream processing by providing properly structured and typed data. It integrates seamlessly with existing formatting infrastructure by utilizing
Code Metrics Report

| | main (c217109) | #581 (3bfccb4) | +/- |
|---------------------|----------------|----------------|-------|
| Coverage | 70.9% | 70.9% | -0.1% |
| Files | 76 | 78 | +2 |
| Lines | 7053 | 7094 | +41 |
| Covered | 5003 | 5032 | +29 |
| Code to Test Ratio | 1:1.3 | 1:1.3 | +0.0 |
| Code | 16372 | 16470 | +98 |
| Test | 21314 | 21619 | +305 |
| Test Execution Time | 1m18s | 1m9s | -9s |

Code coverage of files in pull request scope (82.6% → 80.8%)
Reported by octocov
Code Review
This pull request introduces a new JSONValues formatting mode for the CLI, enabling Spanner data to be output as valid JSON. This includes a dedicated JSONFormatConfig for correct JSON serialization of various Spanner types, and a RawJSONCell type to signal pre-formatted JSON content. The JSONLFormatter is updated to handle these raw JSON cells appropriately. A review comment suggests an improvement in executeStreamingSQL to use qe.ValueFmtMode for consistency and maintainability when applying the RawJSONCell marker, aligning with the abstraction provided by ValueFormatMode.
The streaming path still used enums.DisplayModeJSONL directly instead of qe.ValueFmtMode == format.JSONValues, inconsistent with the buffered path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gemini-code-assist /request-summary
It looks like you tried to invoke a command, but
Document the JSONL output format added in #580/#581.
- system_variables.md: Add JSONL to CLI_FORMAT valid values, usage example, --format=jsonl flag note, and type-aware JSON description
- README.md: Add --format=jsonl as separate bullet (distinct from HTML/XML/CSV escaping story), update --format help text
Summary
Add type-aware JSONL output format (`--format=jsonl`) that produces proper JSON types for Spanner values. ARRAY and STRUCT are represented as JSON arrays and objects respectively, INT64 as numbers, BOOL as booleans, and NULL as null.

Built on top of #580 (basic JSONL with all-string values), this PR adds the type-aware value formatting layer.
Key Changes
- `JSONFormatConfig()` creates a `spanvalue.FormatConfig` using existing extension points (`FormatComplexPlugins`, `FormatArray`, `FormatStruct`) to produce valid JSON value strings. Uses `structpb.Value.MarshalJSON()` for most types; only INT64 and JSON columns need special handling.
- `RawJSONCell` lightweight marker type signals that cell text is valid JSON (no data carried, unlike the earlier `JSONValueCell` approach).
- `writeValue()` checks `IsRawJSON(cell)` to decide between `WriteValue` (raw JSON) and `WriteToken(String(...))` (quoted string fallback for client-side statements).
- `JSONValues` ValueFormatMode for JSONL pipeline dispatch.
- `prepareFormatConfig` returns `decoder.JSONFormatConfig()` for `JSONValues` mode.
- `withRawJSONMarker` applied when `ValueFmtMode == JSONValues`.
- `withRawJSONMarker` wraps cells with `RawJSONCell` (no GCV re-extraction, just type wrapping).

Development Insights
Discoveries
- `structpb.Value.MarshalJSON()` produces correct JSON for all Spanner types except INT64 (StringValue→quoted) and JSON columns (StringValue→double-quoted). This eliminates the need for per-type handling.
- `spanvalue.FormatComplexPlugins` can intercept ALL non-ARRAY/STRUCT types, not just PROTO/ENUM. This enables full JSON formatting via the existing extension point system.

CLAUDE.md Integration Candidates
Test Plan
- `make check` passes
- `TestJSONFormatConfig`: 21 test cases covering all Spanner types (NULL, BOOL, INT64, FLOAT64, STRING, ARRAY, STRUCT, JSON column, nested ARRAY, unnamed fields, NULL ARRAY, NaN/Infinity)
- `TestFormatJSONL`: RawJSONCell with typed values and null, plus plain string fallback
- `TestValueFormatModeFor`: JSONL returns JSONValues
- `TestJSONLFormatterLifecycle`: write before init, double init idempotency

Fixes #554