Skip to content

Add SQL and PL/pgSQL formatting library#1

Merged
gmr merged 8 commits intomainfrom
feature/initial-library
Mar 28, 2026
Merged

Add SQL and PL/pgSQL formatting library#1
gmr merged 8 commits intomainfrom
feature/initial-library

Conversation

@gmr
Copy link
Copy Markdown
Owner

@gmr gmr commented Mar 28, 2026

Summary

  • Implements a Rust library for formatting PostgreSQL SQL and PL/pgSQL using tree-sitter-postgres for parsing
  • Supports 7 formatting styles matching the Python pgfmt project: river, mozilla, aweber, dbt, gitlab, kickstarter, mattmc3
  • Formats SELECT (CTEs, JOINs, subqueries, UNION, DISTINCT, GROUP BY/HAVING/ORDER BY/LIMIT), INSERT, UPDATE, DELETE, CREATE TABLE/VIEW/FUNCTION/DOMAIN
  • PL/pgSQL support for blocks, declarations, IF/ELSIF/ELSE, FOR/WHILE/LOOP, CASE, RAISE, exception handling
  • 55 fixture tests (from pgfmt) + 6 smoke tests, all passing
  • Zero clippy warnings, cargo fmt clean

Architecture

src/
  lib.rs              — Public API: format(), format_plpgsql()
  error.rs            — FormatError type
  style.rs            — Style enum (7 variants)
  node_helpers.rs     — tree-sitter Node extension traits
  formatter/
    mod.rs            — Formatter struct, StyleConfig
    expr.rs           — Expression formatting
    select.rs         — SELECT formatting (river + left-aligned)
    stmt.rs           — INSERT, UPDATE, DELETE, CREATE TABLE/VIEW/FUNCTION
    plpgsql.rs        — PL/pgSQL blocks and control flow

Test plan

  • cargo test — 61 tests passing
  • cargo clippy — zero warnings
  • cargo fmt --check — clean
  • Manual testing with real-world SQL queries

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • PostgreSQL SQL + PL/pgSQL formatter library with public formatting APIs and seven built-in styles (river, mozilla, aweber, dbt, gitlab, kickstarter, mattmc3); broad statement support (SELECT/INSERT/UPDATE/DELETE/CREATE, joins, CTEs, subqueries, functions, domains, materialized views)
  • Tests

    • 100+ golden fixtures across styles and new smoke tests for core formatting behavior
  • Documentation

    • Added developer guide with build, test, lint, and format instructions
  • Chores

    • CI workflows, pre-commit hooks, task automation, and VCS ignore updates

Implements a Rust library for formatting PostgreSQL SQL and PL/pgSQL with
7 style variants (river, mozilla, aweber, dbt, gitlab, kickstarter, mattmc3)
matching the output of the Python pgfmt project.

Uses tree-sitter-postgres for parsing instead of pgparse/lib_pgquery, with
the LANGUAGE grammar for SQL and LANGUAGE_PLPGSQL for PL/pgSQL function bodies.

Supports SELECT (with CTEs, JOINs, subqueries, UNION, DISTINCT, GROUP BY,
HAVING, ORDER BY, LIMIT/OFFSET), INSERT, UPDATE, DELETE, CREATE TABLE,
CREATE VIEW, CREATE FUNCTION, CREATE DOMAIN, and PL/pgSQL blocks with
IF/ELSIF/ELSE, FOR/WHILE/LOOP, CASE, RAISE, and exception handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 28, 2026

Caution

Review failed

The head commit changed during the review from 0ecc1e5 to 1bbecde.

📝 Walkthrough

Walkthrough

Adds a new Rust crate libpgfmt implementing PostgreSQL SQL and PL/pgSQL formatting with tree-sitter parsing, a configurable Formatter, error types, node helpers, extensive formatting modules, many golden-file fixtures and tests, and CI/tooling for formatting, linting, testing, and publishing.

Changes

Cohort / File(s) Summary
Project metadata & tooling
\.gitignore, CLAUDE.md, Cargo.toml, Justfile, .pre-commit-config.yaml
Add /target ignore, repo guidance, crate manifest (tree-sitter deps), developer automation (Justfile), and pre-commit hooks.
CI workflows
.github/workflows/testing.yaml, .github/workflows/release.yaml
Add testing and release GitHub Actions (matrix build, fmt/clippy/tests, release tag/version check, cargo publish).
Public API & wiring
src/lib.rs
New crate entry exposing format and format_plpgsql, parser init, structural parse-error detection, and orchestration with Formatter.
Errors & styles
src/error.rs, src/style.rs
New FormatError enum with Display/Error impls; public Style enum with parsing and Display, plus Style::ALL.
Formatter core
src/formatter/mod.rs, src/node_helpers.rs
Formatter struct, StyleConfig, keyword/text helpers, root formatting entrypoints, and NodeExt/flatten_list helpers for tree-sitter nodes.
Formatting modules
src/formatter/expr.rs, src/formatter/select.rs, src/formatter/stmt.rs, src/formatter/plpgsql.rs
Large new modules implementing expression, SELECT, statement, and PL/pgSQL formatting, clause collection, river vs left-aligned rendering, JOIN/CTE handling, table/constraint alignment, and many helper routines — review for logic density and correctness.
Tests & fixtures
tests/fixtures/... (many), tests/fixtures_test.rs, tests/smoke_test.rs
Add ~100+ SQL/.expected fixture pairs across styles and test runners (golden tests and smoke tests).

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Caller
participant libpgfmt
participant TreeSitter
participant Formatter
Caller->>libpgfmt: format(sql, style)
libpgfmt->>TreeSitter: init parser & parse(sql)
TreeSitter-->>libpgfmt: AST root (or parse errors)
alt parse error detected
libpgfmt->>Caller: Err(FormatError::Parser/Syntax)
else parsed successfully
libpgfmt->>Formatter: Formatter::new(source, style) + root
Formatter->>Formatter: walk AST (stmt/select/expr/plpgsql)
Formatter-->>libpgfmt: formatted text
libpgfmt-->>Caller: Ok(formatted text)
end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

I hop through tokens, whitespace and trees,
I line up keywords with elegant ease,
From SELECT to PL/pgSQL's nested call,
I tidy your SQL — carrot-powered, that’s all! 🐇✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 55.12% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add SQL and PL/pgSQL formatting library' directly and clearly describes the main change in the changeset: introducing a new Rust library that formats PostgreSQL SQL and PL/pgSQL code using tree-sitter parsing with support for multiple formatting styles.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/initial-library

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

🧹 Nitpick comments (2)
tests/fixtures/river/select_subquery_in.expected (1)

5-5: Consider using numeric literal for date year comparison.

The comparison YEAR(championship_date) > '2008' compares an integer result to a string literal. While PostgreSQL handles implicit conversion, using a numeric literal is clearer and more portable.

📝 Suggested improvement
-                       WHERE YEAR(championship_date) > '2008'
+                       WHERE YEAR(championship_date) > 2008
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/fixtures/river/select_subquery_in.expected` at line 5, Replace the
string literal year with a numeric literal in the WHERE clause: find the
predicate "WHERE YEAR(championship_date) > '2008'" and change the right-hand
operand from the quoted string '2008' to the numeric literal 2008 (i.e.,
YEAR(championship_date) > 2008) to avoid implicit type conversion and make the
comparison explicit and clearer.
tests/fixtures/river/select_subquery_nested.sql (1)

1-1: Use PostgreSQL-native year extraction in this fixture and others sharing the same pattern.

This file uses YEAR(...) and compares against text '2008'. For PostgreSQL fixtures in this formatter project, prefer EXTRACT(YEAR FROM ...) and numeric 2008 for semantic correctness. The same pattern appears in tests/fixtures/river/select_subquery_in.sql and tests/fixtures/aweber/select_subquery.sql—all should be updated for consistency.

Proposed fixture update
-SELECT r.last_name, (SELECT MAX(YEAR(championship_date)) FROM champions AS c WHERE c.last_name = r.last_name AND c.confirmed = 'Y') AS last_championship_year FROM riders AS r WHERE r.last_name IN (SELECT c.last_name FROM champions AS c WHERE YEAR(championship_date) > '2008' AND c.confirmed = 'Y')
+SELECT r.last_name, (SELECT MAX(EXTRACT(YEAR FROM championship_date)) FROM champions AS c WHERE c.last_name = r.last_name AND c.confirmed = 'Y') AS last_championship_year FROM riders AS r WHERE r.last_name IN (SELECT c.last_name FROM champions AS c WHERE EXTRACT(YEAR FROM championship_date) > 2008 AND c.confirmed = 'Y')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/fixtures/river/select_subquery_nested.sql` at line 1, Replace the
non-Postgres YEAR(...) and string literal comparison with PostgreSQL-native
extraction and a numeric literal: change YEAR(championship_date) to EXTRACT(YEAR
FROM championship_date) and change comparisons against the string '2008' to the
number 2008; update both the scalar subquery MAX(YEAR(championship_date)) used
as last_championship_year and the IN-subquery condition YEAR(championship_date)
> '2008' (affecting the expressions referencing championship_date and last_name)
and apply the same edits in the other fixtures that use the same pattern
(select_subquery_in.sql and select_subquery.sql).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/formatter/expr.rs`:
- Around line 272-285: The current whitespace cleanup in the closure inside the
lines().map(...) block (the code that computes leading, prefix, cleaned) uses
split_whitespace() which collapses spaces inside quoted literals and identifiers
(e.g., in `'a  b'`), altering semantics; change the cleaning to collapse
consecutive spaces only when not inside single quotes, double quotes, or
backticks by iterating the suffix string character-by-character and tracking
quote state (toggle on encountering unescaped ' " or `), emitting a single space
for any run of spaces when not in a quote, but preserving all characters
verbatim when inside a quote; update the closure that produces cleaned so it
preserves leading prefix and only normalizes inter-token spacing outside quoted
segments.
- Around line 414-423: The code is incorrectly calling self.kw(&name) and
case-folding user-defined or quoted identifiers (e.g., upper_name) — change the
logic in the function that builds the function/type call strings so you only
apply self.kw(...) to known built-in keywords/aliases and leave user-provided
names (including quoted names like "MyFunc"/"MyType") untouched; specifically,
stop overwriting name with upper_name via self.kw in the block that computes
inner/result (variables: name, upper_name, inner, result; flags: has_star,
has_distinct, args) and add a small whitelist check (or an is_builtin_keyword
function) to call self.kw only for those built-ins; apply the same fix pattern
to the other affected blocks referenced around lines ~723-747, ~780-784, and
~826-841 so quoted/user-defined identifiers are preserved.

In `@src/formatter/plpgsql.rs`:
- Around line 342-364: The current format_stmt_raise drops punctuation and
secondary clauses because it iterates only named_children and trims/join them,
producing e.g. "RAISE NOTICE 'x %' foo;". Update format_stmt_raise to walk all
children (use node.children(...) or include non-named children from the
tree-sitter Node cursor) and append each child's raw text (without trimming) to
parts so separators (commas, percent signs, parentheses) and clauses like USING
are preserved; specifically keep the function name format_stmt_raise and replace
the named_children loop with a children loop that pushes
self.text(child).to_string() (or appropriately preserves the original slice) for
relevant children, and special-case skipping only whitespace/comments if needed,
then join parts with no extra trimming so the original punctuation and USING ...
= ... fragments remain intact.
- Around line 30-32: The formatter currently emits a bare "END" in the kw_end
match arm, producing invalid PL/pgSQL for typical BEGIN...END; blocks; update
the kw_end branch in format_plpgsql (the match arm handling "kw_end" that calls
self.kw("END")) to emit "END;" instead (i.e., include the trailing semicolon in
the formatted output) so the produced block terminates properly; ensure you
modify the lines.push(format!(... self.kw("END"))) usage to include the
semicolon while keeping indentation logic intact.

In `@src/formatter/select.rs`:
- Around line 540-561: format_from_river drops the original commas between
top-level from_list entries so "FROM a, b" becomes "FROM a\n    b"; fix by
preserving/re-emitting separators: when iterating tables from from_list in
format_from_river, detect whether the current table/node is followed by a comma
separator (or simply if i < tables.len() - 1) and append a trailing comma to the
rendered text before pushing the line; for table_ref entries call
format_table_ref and append "," when appropriate, and for joined-table entries
change format_joined_table_river (or add a trailing_comma boolean param) so it
can emit a trailing comma for non-final items; apply the same fix to the other
formatter path referenced (the block around the 1138-1169 region) so both code
paths re-emit commas consistently.
- Around line 1417-1424: The CTE branch currently special-cases only SelectStmt
(using self.format_select_stmt) and thus drops other valid PreparableStmt bodies
(e.g., DELETE ... RETURNING); change the logic that matches PreparableStmt to
call the general statement formatter for the PreparableStmt node instead of only
handling SelectStmt — i.e., when you find a "PreparableStmt" child, pass that
node into the statement formatter (e.g., self.format_stmt or the central
stmt-formatting method used elsewhere) and use its returned body; apply the same
fix for the other occurrence noted (lines similar to 1446-1451) so all
non-SELECT CTE bodies are rendered.
- Around line 817-968: The function split_top_level_conditions currently
iterates UTF-8 by casting bytes (bytes[i] as char), which corrupts multi-byte
chars; change the loop to iterate over full_text.char_indices() (or use chars
with tracked byte positions) instead of raw byte indexing so all character
checks (quotes, $, parentheses, spaces, keyword boundaries, and tag detection)
operate on proper Unicode scalar values and use byte indices only when slicing
valid UTF-8 ranges; update uses of bytes, len, i, tag_start/tag_end,
next_is_boundary checks, and substring operations (e.g.,
full_text[tag_start..=tag_end], full_text[i..]) to derive correct byte offsets
from char_indices so quoted strings, identifiers, and keywords preserve
non-ASCII text.

In `@src/formatter/stmt.rs`:
- Around line 803-819: The function format_relation_expr_opt_alias is
incorrectly adding an extra "AS" for nodes whose alias_clause/opt_alias_clause
already include "AS"; update the match so that "alias_clause" and
"opt_alias_clause" simply push the result of self.format_expr(child) (respecting
the empty check) without prepending self.kw("AS"), while keeping the
special-case that only bare "ColId" gets wrapped as format!("{} {alias}",
self.kw("AS")). Ensure you modify the match arms in
format_relation_expr_opt_alias accordingly so alias_clause/opt_alias_clause use
the formatted expression directly and ColId retains the added "AS".
- Around line 709-715: The function format_func_args currently runs
normalize_whitespace(self.text(node)) which will collapse whitespace inside
string literals, dollar-quoted bodies and quoted identifiers; replace that call
so the raw SQL fragment is returned unchanged (i.e. return self.text(node) or an
equivalent that only trims outer-space but does not split internal whitespace)
instead of using split_whitespace-based normalization; apply the same change to
the other formatter call sites flagged in this file that use
normalize_whitespace for argument lists/whole statements so quoted/dollar-quoted
content is preserved.
- Around line 77-123: The code is converting every SelectStmt in insert_rest
into a VALUES block under self.config.river, producing invalid "VALUES SELECT
..." SQL; change the branch to only apply river-specific VALUES alignment when
the original insert body is actually a VALUES clause (e.g., when the incoming
text or AST indicates a VALUES keyword) instead of for any SelectStmt.
Concretely, in the block that inspects select_text/SelectStmt before calling
river_line or applying VALUES padding, add a guard that checks the original
clause is VALUES (for example test if
select_text.trim_start().to_uppercase().starts_with("VALUES") or an AST flag on
the insert_rest node) and only then run the current river formatting (using
river_line, leading_commas logic, etc.); otherwise push select_text.to_string()
unchanged into parts so real "INSERT ... SELECT ..." remains intact.

In `@src/lib.rs`:
- Around line 21-26: The code currently appends ";" directly to trimmed which
puts the semicolon inside trailing `--` comments; change the else branch that
builds `input` so the semicolon is placed on its own new line (e.g. build the
string as trimmed + "\n;" or format!("{trimmed}\n;")), leaving the if branch
unchanged; update the `input` construction that references `trimmed` and `input`
so the semicolon is not appended inline and therefore won't become part of a
comment.

In `@tests/fixtures/mozilla/select_using_join.expected`:
- Around line 1-13: A test is missing for the select_using_join fixture pair;
add a new test function named mozilla_select_using_join in the fixtures test
module that uses the existing fixture test helper (the same pattern as other
mozilla_* tests) to load "select_using_join" from the mozilla fixtures and
assert the SQL output matches select_using_join.expected; alternatively, if you
prefer not to test it, remove the unused select_using_join.sql/.expected fixture
files so they don't remain unexercised.

In `@tests/fixtures/river/create_matview.expected`:
- Around line 26-27: The formatter currently strips parentheses around the WHERE
expression "(sp.cancelled_at IS NULL OR sp.cancelled_at > CURRENT_TIMESTAMP) AND
f.feature_name = 'notifications'::text", changing semantics due to AND/OR
precedence; update the formatter logic that removes parentheses to detect
operator precedence and preserve parentheses when their removal would change
evaluation order (specifically for expressions involving tokens like
sp.cancelled_at, CURRENT_TIMESTAMP and f.feature_name and the OR/AND operators).
Ensure the parentheses-preservation check runs in the node/type that handles
binary logical expressions so the output keeps the original grouping whenever
the inner operator has lower precedence than the surrounding operator.

In `@tests/fixtures/river/create_matview.sql`:
- Line 1: The formatted output is dropping necessary parentheses around the
WHERE clause in the CREATE MATERIALIZED VIEW statement, changing semantics;
update the SQL formatter/printer to preserve grouping for the expression
"(sp.cancelled_at IS NULL OR sp.cancelled_at > CURRENT_TIMESTAMP) AND
f.feature_name = 'notifications'" in the CREATE MATERIALIZED VIEW ... AS SELECT
... WHERE ... clause so the OR is evaluated as a single grouped term before the
AND. Locate the code that renders WHERE expressions in the SQL pretty-printer
(the logic that prints boolean binary expressions for the CREATE MATERIALIZED
VIEW statement) and change it to emit parentheses when the AST shows an OR node
nested inside an AND context (or otherwise respect original grouping), ensuring
the printed query for the materialized view retains the parentheses and original
precedence.

In `@tests/fixtures/river/insert_values.sql`:
- Line 1: The second INSERT row for the albums table contains an inconsistent
recording_date for title "The New Danger" (recording_date = '1990-01-01...' vs
release_date = '2008-01-01...'), which looks like a copy-paste error; update the
recording_date in the INSERT INTO albums (...) VALUES statement for "The New
Danger" to a plausible value (e.g., match the release_date or the intended
recording year) so recording_date and release_date are consistent for that row.

---

Nitpick comments:
In `@tests/fixtures/river/select_subquery_in.expected`:
- Line 5: Replace the string literal year with a numeric literal in the WHERE
clause: find the predicate "WHERE YEAR(championship_date) > '2008'" and change
the right-hand operand from the quoted string '2008' to the numeric literal 2008
(i.e., YEAR(championship_date) > 2008) to avoid implicit type conversion and
make the comparison explicit and clearer.

In `@tests/fixtures/river/select_subquery_nested.sql`:
- Line 1: Replace the non-Postgres YEAR(...) and string literal comparison with
PostgreSQL-native extraction and a numeric literal: change
YEAR(championship_date) to EXTRACT(YEAR FROM championship_date) and change
comparisons against the string '2008' to the number 2008; update both the scalar
subquery MAX(YEAR(championship_date)) used as last_championship_year and the
IN-subquery condition YEAR(championship_date) > '2008' (affecting the
expressions referencing championship_date and last_name) and apply the same
edits in the other fixtures that use the same pattern (select_subquery_in.sql
and select_subquery.sql).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f0725bb9-fde6-4a99-a402-d748b86da20e

📥 Commits

Reviewing files that changed from the base of the PR and between 08d4d80 and 359fb66.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (144)
  • .gitignore
  • CLAUDE.md
  • Cargo.toml
  • src/error.rs
  • src/formatter/expr.rs
  • src/formatter/mod.rs
  • src/formatter/plpgsql.rs
  • src/formatter/select.rs
  • src/formatter/stmt.rs
  • src/lib.rs
  • src/node_helpers.rs
  • src/style.rs
  • tests/fixtures/aweber/select_case_join.expected
  • tests/fixtures/aweber/select_case_join.sql
  • tests/fixtures/aweber/select_cte_nested.expected
  • tests/fixtures/aweber/select_cte_nested.sql
  • tests/fixtures/aweber/select_cte_union.expected
  • tests/fixtures/aweber/select_cte_union.sql
  • tests/fixtures/aweber/select_join.expected
  • tests/fixtures/aweber/select_join.sql
  • tests/fixtures/aweber/select_left_join.expected
  • tests/fixtures/aweber/select_left_join.sql
  • tests/fixtures/aweber/select_or.expected
  • tests/fixtures/aweber/select_or.sql
  • tests/fixtures/aweber/select_simple.expected
  • tests/fixtures/aweber/select_simple.sql
  • tests/fixtures/aweber/select_subquery.expected
  • tests/fixtures/aweber/select_subquery.sql
  • tests/fixtures/dbt/select_cte.expected
  • tests/fixtures/dbt/select_cte.sql
  • tests/fixtures/dbt/select_group_order.expected
  • tests/fixtures/dbt/select_group_order.sql
  • tests/fixtures/dbt/select_join.expected
  • tests/fixtures/dbt/select_join.sql
  • tests/fixtures/dbt/select_simple.expected
  • tests/fixtures/dbt/select_simple.sql
  • tests/fixtures/gitlab/select_cte.expected
  • tests/fixtures/gitlab/select_cte.sql
  • tests/fixtures/gitlab/select_group_order.expected
  • tests/fixtures/gitlab/select_group_order.sql
  • tests/fixtures/gitlab/select_join.expected
  • tests/fixtures/gitlab/select_join.sql
  • tests/fixtures/gitlab/select_simple.expected
  • tests/fixtures/gitlab/select_simple.sql
  • tests/fixtures/kickstarter/select_cte.expected
  • tests/fixtures/kickstarter/select_cte.sql
  • tests/fixtures/kickstarter/select_join.expected
  • tests/fixtures/kickstarter/select_join.sql
  • tests/fixtures/kickstarter/select_simple.expected
  • tests/fixtures/kickstarter/select_simple.sql
  • tests/fixtures/kickstarter/select_where.expected
  • tests/fixtures/kickstarter/select_where.sql
  • tests/fixtures/mattmc3/insert_values.expected
  • tests/fixtures/mattmc3/insert_values.sql
  • tests/fixtures/mattmc3/select_join.expected
  • tests/fixtures/mattmc3/select_join.sql
  • tests/fixtures/mattmc3/select_or.expected
  • tests/fixtures/mattmc3/select_or.sql
  • tests/fixtures/mattmc3/select_simple.expected
  • tests/fixtures/mattmc3/select_simple.sql
  • tests/fixtures/mattmc3/update_multi.expected
  • tests/fixtures/mattmc3/update_multi.sql
  • tests/fixtures/mozilla/create_table.expected
  • tests/fixtures/mozilla/create_table.sql
  • tests/fixtures/mozilla/delete_and.expected
  • tests/fixtures/mozilla/delete_and.sql
  • tests/fixtures/mozilla/insert_multi.expected
  • tests/fixtures/mozilla/insert_multi.sql
  • tests/fixtures/mozilla/select_cte.expected
  • tests/fixtures/mozilla/select_cte.sql
  • tests/fixtures/mozilla/select_group_order.expected
  • tests/fixtures/mozilla/select_group_order.sql
  • tests/fixtures/mozilla/select_join.expected
  • tests/fixtures/mozilla/select_join.sql
  • tests/fixtures/mozilla/select_simple.expected
  • tests/fixtures/mozilla/select_simple.sql
  • tests/fixtures/mozilla/select_single_col.expected
  • tests/fixtures/mozilla/select_single_col.sql
  • tests/fixtures/mozilla/select_subquery.expected
  • tests/fixtures/mozilla/select_subquery.sql
  • tests/fixtures/mozilla/select_union.expected
  • tests/fixtures/mozilla/select_union.sql
  • tests/fixtures/mozilla/select_using_join.expected
  • tests/fixtures/mozilla/select_using_join.sql
  • tests/fixtures/mozilla/update_multi.expected
  • tests/fixtures/mozilla/update_multi.sql
  • tests/fixtures/river/create_domain.expected
  • tests/fixtures/river/create_domain.sql
  • tests/fixtures/river/create_foreign_table.expected
  • tests/fixtures/river/create_foreign_table.sql
  • tests/fixtures/river/create_function.expected
  • tests/fixtures/river/create_function.sql
  • tests/fixtures/river/create_matview.expected
  • tests/fixtures/river/create_matview.sql
  • tests/fixtures/river/create_table.expected
  • tests/fixtures/river/create_table.sql
  • tests/fixtures/river/create_table_with.expected
  • tests/fixtures/river/create_table_with.sql
  • tests/fixtures/river/create_view_cte.expected
  • tests/fixtures/river/create_view_cte.sql
  • tests/fixtures/river/delete_simple.expected
  • tests/fixtures/river/delete_simple.sql
  • tests/fixtures/river/insert_values.expected
  • tests/fixtures/river/insert_values.sql
  • tests/fixtures/river/select_agg_functions.expected
  • tests/fixtures/river/select_agg_functions.sql
  • tests/fixtures/river/select_alias.expected
  • tests/fixtures/river/select_alias.sql
  • tests/fixtures/river/select_and.expected
  • tests/fixtures/river/select_and.sql
  • tests/fixtures/river/select_case.expected
  • tests/fixtures/river/select_case.sql
  • tests/fixtures/river/select_cte.expected
  • tests/fixtures/river/select_cte.sql
  • tests/fixtures/river/select_distinct.expected
  • tests/fixtures/river/select_distinct.sql
  • tests/fixtures/river/select_group_by.expected
  • tests/fixtures/river/select_group_by.sql
  • tests/fixtures/river/select_having.expected
  • tests/fixtures/river/select_having.sql
  • tests/fixtures/river/select_join.expected
  • tests/fixtures/river/select_join.sql
  • tests/fixtures/river/select_or.expected
  • tests/fixtures/river/select_or.sql
  • tests/fixtures/river/select_order_limit.expected
  • tests/fixtures/river/select_order_limit.sql
  • tests/fixtures/river/select_simple.expected
  • tests/fixtures/river/select_simple.sql
  • tests/fixtures/river/select_subquery_exists.expected
  • tests/fixtures/river/select_subquery_exists.sql
  • tests/fixtures/river/select_subquery_in.expected
  • tests/fixtures/river/select_subquery_in.sql
  • tests/fixtures/river/select_subquery_nested.expected
  • tests/fixtures/river/select_subquery_nested.sql
  • tests/fixtures/river/select_subquery_scalar.expected
  • tests/fixtures/river/select_subquery_scalar.sql
  • tests/fixtures/river/select_union.expected
  • tests/fixtures/river/select_union.sql
  • tests/fixtures/river/update_multi_set.expected
  • tests/fixtures/river/update_multi_set.sql
  • tests/fixtures/river/update_simple.expected
  • tests/fixtures/river/update_simple.sql
  • tests/fixtures_test.rs
  • tests/smoke_test.rs

@@ -0,0 +1 @@
CREATE MATERIALIZED VIEW report.service_subscription_info AS SELECT sp.account_id, sp.subscription_id, sp.plan_id, bt.term, p.plan_name, pf.units_included, pf.units_increment, pf.price_increment, sp.cancelled_at FROM public.subscriptions sp JOIN public.plans p USING (plan_id) JOIN public.plan_details pd USING (plan_id) JOIN public.pricing_fees pf USING (plan_detail_id) JOIN public.features f USING (feature_id) JOIN public.billing_terms bt USING (billing_term_id) WHERE (sp.cancelled_at IS NULL OR sp.cancelled_at > CURRENT_TIMESTAMP) AND f.feature_name = 'notifications'::text ORDER BY sp.cancelled_at DESC, sp.started_at WITH NO DATA
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Input is correct; confirms expected output bug.

This input correctly uses parentheses around the OR clause: (sp.cancelled_at IS NULL OR sp.cancelled_at > CURRENT_TIMESTAMP) AND f.feature_name = 'notifications'. However, as noted in the review of create_matview.expected, these parentheses are being dropped in the formatted output, which changes the query semantics.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/fixtures/river/create_matview.sql` at line 1, The formatted output is
dropping necessary parentheses around the WHERE clause in the CREATE
MATERIALIZED VIEW statement, changing semantics; update the SQL
formatter/printer to preserve grouping for the expression "(sp.cancelled_at IS
NULL OR sp.cancelled_at > CURRENT_TIMESTAMP) AND f.feature_name =
'notifications'" in the CREATE MATERIALIZED VIEW ... AS SELECT ... WHERE ...
clause so the OR is evaluated as a single grouped term before the AND. Locate
the code that renders WHERE expressions in the SQL pretty-printer (the logic
that prints boolean binary expressions for the CREATE MATERIALIZED VIEW
statement) and change it to emit parentheses when the AST shows an OR node
nested inside an AND context (or otherwise respect original grouping), ensuring
the printed query for the materialized view retains the parentheses and original
precedence.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above — this is the reference behavior from Python pgfmt. Preserving user parentheses is a future improvement that tree-sitter-postgres makes possible.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gmr, understood — dropping user-supplied parentheses in WHERE (and other) expressions matches the Python pgfmt reference behavior, and preserving original grouping parentheses via the tree-sitter-postgres AST is a planned future improvement. I'll note this for future reviews.

Would you like me to open a GitHub issue to track the parentheses-preservation enhancement?


✏️ Learnings added
Learnt from: gmr
Repo: gmr/libpgfmt PR: 1
File: tests/fixtures/river/create_matview.sql:1-1
Timestamp: 2026-03-28T23:34:50.592Z
Learning: In the `libpgfmt` Rust crate (repository: gmr/libpgfmt), dropping user-supplied parentheses in WHERE clauses (and other boolean expressions) is intentional — it matches the reference behavior of the Python pgfmt project. Preserving original grouping parentheses is a planned future improvement that tree-sitter-postgres makes possible. Do not flag parentheses being dropped in formatted SQL output as a bug.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh issue create \
  --repo gmr/libpgfmt \
  --title "Preserve user-supplied grouping parentheses in boolean expressions" \
  --body "## Summary

When formatting SQL, \`libpgfmt\` currently drops user-supplied grouping parentheses in \`WHERE\` clauses and other boolean expressions. For example, an expression like:

\`\`\`sql
WHERE (sp.cancelled_at IS NULL OR sp.cancelled_at > CURRENT_TIMESTAMP) AND f.feature_name = 'notifications'
\`\`\`

is formatted as:

\`\`\`sql
WHERE sp.cancelled_at IS NULL OR sp.cancelled_at > CURRENT_TIMESTAMP AND f.feature_name = 'notifications'
\`\`\`

This matches the reference behavior of the Python pgfmt project (which uses pgparse/libpg_query and normalizes away such parentheses at the AST level). However, since \`libpgfmt\` uses \`tree-sitter-postgres\` (a concrete syntax tree), it is technically feasible to detect and preserve these user-supplied grouping parentheses.

## Background

Discussed in PR `#1` (https://github.com/gmr/libpgfmt/pull/1) at comment https://github.com/gmr/libpgfmt/pull/1#discussion_r3005377443.

Raised by `@gmr` — dropping parentheses is current intentional behavior to match Python pgfmt, with preservation as a planned future improvement.

## Goal

Update the expression formatter so that when an \`OR\` node (or any lower-precedence boolean operator) is nested inside a higher-precedence boolean context (e.g., \`AND\`), the parentheses originally present in the source are emitted in the formatted output to preserve the intended evaluation order and semantics.

## Notes

- The \`tree-sitter-postgres\` CST retains the parenthesized grouping nodes, making this feasible without external heuristics.
- Care should be taken to only restore parentheses that were present in the original source, not to add new ones based on operator precedence alone.
- Requested by `@gmr` in PR `#1`.
"

Length of output: 40


@gmr, I've opened the tracking issue: #2 — "Preserve user-supplied grouping parentheses in boolean expressions". It captures the background, the current behavior (matching Python pgfmt), the feasibility via tree-sitter-postgres's CST, and the goal for a future iteration.


🧠 Learnings used
Learnt from: gmr
Repo: gmr/libpgfmt PR: 1
File: tests/fixtures/river/create_matview.sql:1-1
Timestamp: 2026-03-28T23:34:50.592Z
Learning: When reviewing formatted SQL output produced by the `libpgfmt` Rust crate (gmr/libpgfmt), treat the dropping of user-supplied grouping parentheses in `WHERE` clauses and other boolean expressions as intentional. This matches the Python pgfmt reference behavior, and preserving original grouping parentheses is not currently required (planned future improvement). Do not flag missing parentheses in the formatter output as a bug or correctness regression; only flag actual semantic changes.

Learnt from: gmr
Repo: gmr/libpgfmt PR: 1
File: tests/fixtures/river/create_matview.expected:26-27
Timestamp: 2026-03-28T23:34:50.298Z
Learning: In the libpgfmt Rust library (gmr/libpgfmt), the formatter intentionally drops parentheses that alter AND/OR operator precedence in WHERE clauses (e.g., `(a IS NULL OR a > X) AND b = Y` becomes `a IS NULL OR a > X AND b = Y`). This is by design to match the Python pgfmt reference implementation, which uses pgparse/libpg_query and normalizes away such parentheses at the AST level before formatting. Since libpgfmt uses tree-sitter-postgres (concrete syntax tree), preserving these parentheses is technically feasible and is tracked as a known limitation for a future iteration.

Learnt from: gmr
Repo: gmr/libpgfmt PR: 1
File: tests/fixtures/river/insert_values.sql:1-1
Timestamp: 2026-03-28T23:34:52.059Z
Learning: In the libpgfmt Rust crate test fixtures under tests/fixtures/ (including tests/fixtures/**/*.sql), treat fixture data values (e.g., album titles, dates, other placeholder literals) as intentionally-copied test data from the upstream pgfmt/Python SQL Style Guide examples. Do not flag these values for semantic inconsistencies or correctness issues when reviewing fixtures; focus review on fixture structure/formatting and deterministic test behavior instead.

gmr and others added 3 commits March 28, 2026 18:06
Justfile provides recipes for build, test, lint, fmt, set-version, release,
and publish workflows. The set-version recipe updates Cargo.toml and verifies
the result compiles. The release recipe sets the version, commits, tags, and
pushes in one step.

GitHub Actions:
- testing.yaml: runs fmt, clippy, and tests on stable and nightly Rust for
  all pushes and PRs
- release.yaml: publishes to crates.io on GitHub release creation, with a
  version/tag consistency check

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Runs cargo fmt --check and cargo clippy on pre-commit, and cargo test
on pre-push.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Critical fixes:
- INSERT SELECT no longer misformatted as VALUES (check for VALUES keyword
  before applying VALUES-specific river alignment)
- FROM comma-separated tables now preserve commas between non-join entries
- Semicolons appended after line comments go on a new line to avoid becoming
  part of the comment
- Add missing mozilla_select_using_join test; fix expected output for
  non-explicit JOIN style

Correctness fixes:
- RAISE statements preserve commas between arguments
- PL/pgSQL outer block ends with END; (semicolon) for valid body text
- CTE bodies support INSERT/UPDATE/DELETE RETURNING, not just SELECT
- UTF-8 safe: split_top_level_conditions iterates chars not bytes
- Whitespace normalization in format_c_expr preserves spaces inside quotes
- UPDATE/DELETE alias formatting no longer doubles the AS keyword

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
Justfile (1)

44-49: Release recipe hardcodes branch name main.

The git push origin main --tags command assumes the default branch is main. This is a minor issue for this repository but could cause confusion if the branch is renamed or when adapting this Justfile elsewhere.

Optional: Use current branch dynamically
 # Tag a release (sets version, commits, tags, pushes)
 release version: (set-version version)
     git add Cargo.toml Cargo.lock
     git commit -m "Release v{{version}}"
     git tag -a "v{{version}}" -m "v{{version}}"
-    git push origin main --tags
+    git push origin HEAD --tags
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Justfile` around lines 44 - 49, The release recipe hardcodes the branch name
in the push step (git push origin main --tags); update the release recipe (the
"release version: (set-version version)" recipe) to determine the current branch
dynamically (e.g., via git rev-parse --abbrev-ref HEAD or equivalent in the
Justfile shell) and use that variable in the git push command so it pushes the
current branch with tags instead of always pushing "main".
.github/workflows/release.yaml (1)

36-43: Version verification assumes v prefix but doesn't validate tag format.

If a release is created with a tag that doesn't follow the v<version> convention (e.g., 1.0.0 instead of v1.0.0), the parameter expansion ${GITHUB_REF#refs/tags/v} won't strip anything meaningful, leading to a confusing mismatch error.

Consider adding a guard to fail early with a clear message if the tag doesn't start with v:

Proposed fix to validate tag format
       - name: Verify version matches tag
         run: |
-          TAG="${GITHUB_REF#refs/tags/v}"
+          RAW_TAG="${GITHUB_REF#refs/tags/}"
+          if [[ ! "$RAW_TAG" =~ ^v ]]; then
+            echo "Error: Release tag '$RAW_TAG' must start with 'v' (e.g., v1.0.0)"
+            exit 1
+          fi
+          TAG="${RAW_TAG#v}"
           CARGO_VERSION=$(grep '^version' Cargo.toml | head -1 | sed 's/.*"\(.*\)"/\1/')
           if [ "$TAG" != "$CARGO_VERSION" ]; then
             echo "Tag v$TAG does not match Cargo.toml version $CARGO_VERSION"
             exit 1
           fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yaml around lines 36 - 43, The current Verify
version matches tag step assumes tags are prefixed with "v" and uses
TAG="${GITHUB_REF#refs/tags/v}" which will produce misleading errors for tags
like "1.0.0"; update the verification to first validate the tag format by
checking GITHUB_REF starts with "refs/tags/v" (or that TAG begins with "v") and
fail early with a clear error message if it doesn't, then proceed to extract TAG
and compare against CARGO_VERSION (refer to the existing TAG, GITHUB_REF, and
CARGO_VERSION variables and the Verify version matches tag block).
src/formatter/plpgsql.rs (1)

298-303: FOREACH statement is passed through without formatting.

This function simply returns the raw source text with indentation. While acceptable as a fallback for complex syntax, it means FOREACH statements won't benefit from keyword casing or style-specific formatting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/formatter/plpgsql.rs` around lines 298 - 303, format_stmt_foreach
currently just emits raw text, so FOREACH statements bypass keyword casing and
structured formatting; update format_stmt_foreach to walk the node's children
and compose the statement using existing helpers (e.g., call self.format_node
for child nodes, apply keyword casing for "FOREACH", "SLICE", "IN", "ARRAY",
"LOOP", "END LOOP"), build the line with the proper indent and spacing, and push
the formatted string into lines instead of the raw self.text(node) to ensure
consistent formatting with other statements.
src/formatter/stmt.rs (1)

661-670: Fragile text search for WITH NO DATA.

Line 663 uses self.text(node).contains("WITH NO DATA") which is case-sensitive. If the input uses lowercase with no data, this check fails. The node.has_child("kw_no") check provides a fallback, but consider making the text search case-insensitive.

♻️ Suggested improvement
-        if node.has_child("kw_no") || self.text(node).contains("WITH NO DATA") {
+        let node_text_upper = self.text(node).to_uppercase();
+        if node.has_child("kw_no") || node_text_upper.contains("WITH NO DATA") {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/formatter/stmt.rs` around lines 661 - 670, The current check for "WITH NO
DATA" is case-sensitive because it uses self.text(node).contains("WITH NO
DATA"); update the condition in the block that sets suffix (the code using
node.has_child("kw_no"), self.text(node), suffix, and self.kw(...)) to perform a
case-insensitive match instead (for example by normalizing the text via
to_lowercase()/to_uppercase() before calling contains or using a
case-insensitive regex) so inputs like "with no data" also match; keep the
existing node.has_child("kw_no") fallback and only change the text-search
branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Justfile`:
- Around line 40-41: The Justfile recipe uses macOS-only `sed -i ''` and hides
errors from `cargo check`; change the sed invocation in that recipe to a
portable form (e.g., use an in-place edit that creates a temporary backup then
removes it, or use a cross-platform tool like perl -pi) to replace the version
line in Cargo.toml, and stop suppressing stderr from `cargo check` (remove the
`2>/dev/null` redirection) so real compilation errors are visible; if the recipe
must remain macOS-only, instead add a clear comment stating that assumption.

In `@src/formatter/plpgsql.rs`:
- Around line 245-249: The codebase uses the let_chains feature (the `if let ...
&& let` pattern seen in src/formatter/plpgsql.rs around the
`node.find_child("loop_body")` / `body.find_child("proc_sect")` usage and the
`format_proc_sect` call), which requires Rust 1.88.0; to prevent CI/devs on
older stable toolchains from failing, add an explicit minimum toolchain by
adding rust-version = "1.88" to the Cargo.toml under the [package] section so
the compiler enforces MSRV 1.88.0.

---

Nitpick comments:
In @.github/workflows/release.yaml:
- Around line 36-43: The current Verify version matches tag step assumes tags
are prefixed with "v" and uses TAG="${GITHUB_REF#refs/tags/v}" which will
produce misleading errors for tags like "1.0.0"; update the verification to
first validate the tag format by checking GITHUB_REF starts with "refs/tags/v"
(or that TAG begins with "v") and fail early with a clear error message if it
doesn't, then proceed to extract TAG and compare against CARGO_VERSION (refer to
the existing TAG, GITHUB_REF, and CARGO_VERSION variables and the Verify version
matches tag block).

In `@Justfile`:
- Around line 44-49: The release recipe hardcodes the branch name in the push
step (git push origin main --tags); update the release recipe (the "release
version: (set-version version)" recipe) to determine the current branch
dynamically (e.g., via git rev-parse --abbrev-ref HEAD or equivalent in the
Justfile shell) and use that variable in the git push command so it pushes the
current branch with tags instead of always pushing "main".

In `@src/formatter/plpgsql.rs`:
- Around line 298-303: format_stmt_foreach currently just emits raw text, so
FOREACH statements bypass keyword casing and structured formatting; update
format_stmt_foreach to walk the node's children and compose the statement using
existing helpers (e.g., call self.format_node for child nodes, apply keyword
casing for "FOREACH", "SLICE", "IN", "ARRAY", "LOOP", "END LOOP"), build the
line with the proper indent and spacing, and push the formatted string into
lines instead of the raw self.text(node) to ensure consistent formatting with
other statements.

In `@src/formatter/stmt.rs`:
- Around line 661-670: The current check for "WITH NO DATA" is case-sensitive
because it uses self.text(node).contains("WITH NO DATA"); update the condition
in the block that sets suffix (the code using node.has_child("kw_no"),
self.text(node), suffix, and self.kw(...)) to perform a case-insensitive match
instead (for example by normalizing the text via to_lowercase()/to_uppercase()
before calling contains or using a case-insensitive regex) so inputs like "with
no data" also match; keep the existing node.has_child("kw_no") fallback and only
change the text-search branch.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f531c2ec-a95f-489f-b298-fb0eacd7aa94

📥 Commits

Reviewing files that changed from the base of the PR and between 359fb66 and 2c47197.

📒 Files selected for processing (11)
  • .github/workflows/release.yaml
  • .github/workflows/testing.yaml
  • .pre-commit-config.yaml
  • Justfile
  • src/formatter/expr.rs
  • src/formatter/mod.rs
  • src/formatter/plpgsql.rs
  • src/formatter/select.rs
  • src/formatter/stmt.rs
  • tests/fixtures/mozilla/select_using_join.expected
  • tests/fixtures_test.rs
✅ Files skipped from review due to trivial changes (5)
  • .pre-commit-config.yaml
  • tests/fixtures/mozilla/select_using_join.expected
  • tests/fixtures_test.rs
  • .github/workflows/testing.yaml
  • src/formatter/expr.rs

- Fix semicolon synthesis for inputs ending with line comments (-- ...)
  by appending on a new line
- Make normalize_whitespace quote-aware: preserves spaces inside
  single-quoted strings, double-quoted identifiers, and dollar-quoted strings
- Preserve user-defined function name casing; only apply keyword casing to
  known SQL built-in functions (COUNT, SUM, MAX, NOW, etc.)
- Fix Justfile sed portability: use temp file instead of BSD-specific
  sed -i '' syntax, and don't suppress cargo check errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/lib.rs (1)

76-97: Consider documenting the magic number.

The threshold of size > 4 bytes at line 85 is a heuristic. While the comment explains the use case (decimal fractions like ".00"), consider extracting this to a named constant for clarity:

const MAX_TOLERABLE_ERROR_BYTES: usize = 4;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib.rs` around lines 76 - 97, The magic number 4 in has_structural_error
is a heuristic and should be extracted to a named constant for clarity; add a
top-level constant (e.g., MAX_TOLERABLE_ERROR_BYTES: usize = 4) and replace the
literal in the size comparison (size > 4) with that constant, update the
existing comment to reference the constant name to preserve context, and ensure
the constant name is exported/visible in the module where has_structural_error
resides.
src/formatter/expr.rs (1)

984-987: Defensive cleanup for double dots.

The replace("..", ".") on line 986 works but indicates the join logic might be producing duplicate dots. The root cause appears to be that attr_name (line 972) prepends a dot while unnamed dot children (line 979) also add dots.

Consider addressing the root cause by not prepending the dot in attr_name when the previous part already ends with a dot, or by filtering unnamed dot children when a named attr_name follows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/formatter/expr.rs` around lines 984 - 987, The join cleanup at the end
masks the root cause: both attr_name and unnamed dot children add dots,
producing duplicates; modify the logic that builds parts so it avoids double-dot
emission — specifically, when pushing attr_name check the previous part in parts
and do not prepend a '.' if parts.last().ends_with('.') (or drop unnamed dot
children immediately before a named attr_name); update the code that handles
unnamed dot children to skip adding "." when the next appended part is a named
attr_name, ensuring parts no longer contain adjacent dots so the final
result.replace("..", ".") becomes unnecessary.
src/formatter/stmt.rs (1)

946-956: Performance: O(n²) string allocation in dollar-quote search.

Line 948 allocates a new String from the remaining characters on each iteration while searching for the closing dollar-quote tag. For large dollar-quoted bodies (common in function definitions), this becomes O(n²).

Consider using a slice comparison or starts_with on a &str slice instead:

♻️ Suggested optimization
                 while i < len {
-                    let remaining: String = chars[i..].iter().collect();
-                    if remaining.starts_with(&tag) {
+                    // Check if we're at the closing tag
+                    let tag_chars: Vec<char> = tag.chars().collect();
+                    if i + tag_chars.len() <= len 
+                        && chars[i..i + tag_chars.len()] == tag_chars[..] 
+                    {
                         result.push_str(&tag);
                         i += tag.len();
                         break;
                     }

Or pre-collect the tag as Vec<char> once before the loop.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/formatter/stmt.rs` around lines 946 - 956, The loop that searches for the
closing dollar-quote allocates a new String each iteration by building
`remaining`, causing O(n²) behaviour; change it to compare slices without
allocation by either (a) precomputing the tag as a slice/Vec<char> and using
`chars[i..].starts_with(&tag_chars)` or (b) using the original input &str and
calling `&input[i..].starts_with(&tag)` so you avoid constructing `remaining`
each iteration; update the loop that references `tag_end`, `i`, `len`, `chars`,
`tag`, and `result` to use the slice comparison and keep pushing `chars[i]` (or
the corresponding byte/char) until the tag matches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/formatter/expr.rs`:
- Around line 1044-1109: collapse_whitespace_outside_quotes currently only
tracks single and double quotes and will incorrectly collapse whitespace inside
dollar-quoted literals; modify collapse_whitespace_outside_quotes to detect and
skip dollar-quoted string blocks (patterns like $tag$...$tag$) similarly to
normalize_whitespace in stmt.rs: when you encounter a '$' start scanning ahead
to capture the tag (allowing empty tag), then copy characters unchanged until
you find the matching closing $tag$ sequence, keeping prev_was_space logic reset
when entering/exiting the dollar quote; update the function's state handling
(e.g., add a current_dollar_tag or in_dollar_quote flag) so dollar-quoted
contents are preserved.

---

Nitpick comments:
In `@src/formatter/expr.rs`:
- Around line 984-987: The join cleanup at the end masks the root cause: both
attr_name and unnamed dot children add dots, producing duplicates; modify the
logic that builds parts so it avoids double-dot emission — specifically, when
pushing attr_name check the previous part in parts and do not prepend a '.' if
parts.last().ends_with('.') (or drop unnamed dot children immediately before a
named attr_name); update the code that handles unnamed dot children to skip
adding "." when the next appended part is a named attr_name, ensuring parts no
longer contain adjacent dots so the final result.replace("..", ".") becomes
unnecessary.

In `@src/formatter/stmt.rs`:
- Around line 946-956: The loop that searches for the closing dollar-quote
allocates a new String each iteration by building `remaining`, causing O(n²)
behaviour; change it to compare slices without allocation by either (a)
precomputing the tag as a slice/Vec<char> and using
`chars[i..].starts_with(&tag_chars)` or (b) using the original input &str and
calling `&input[i..].starts_with(&tag)` so you avoid constructing `remaining`
each iteration; update the loop that references `tag_end`, `i`, `len`, `chars`,
`tag`, and `result` to use the slice comparison and keep pushing `chars[i]` (or
the corresponding byte/char) until the tag matches.

In `@src/lib.rs`:
- Around line 76-97: The magic number 4 in has_structural_error is a heuristic
and should be extracted to a named constant for clarity; add a top-level
constant (e.g., MAX_TOLERABLE_ERROR_BYTES: usize = 4) and replace the literal in
the size comparison (size > 4) with that constant, update the existing comment
to reference the constant name to preserve context, and ensure the constant name
is exported/visible in the module where has_structural_error resides.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9899ea27-aaa0-4d83-951d-94586af5c06f

📥 Commits

Reviewing files that changed from the base of the PR and between 2c47197 and 91aad63.

📒 Files selected for processing (4)
  • Justfile
  • src/formatter/expr.rs
  • src/formatter/stmt.rs
  • src/lib.rs
✅ Files skipped from review due to trivial changes (1)
  • Justfile

- Set rust-version = "1.87.0" in Cargo.toml (required for let-chains,
  edition 2024 requires 1.85.0)
- Add dollar-quoted string handling to collapse_whitespace_outside_quotes
  in expr.rs to preserve internal whitespace in $tag$...$tag$ literals

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
gmr and others added 2 commits March 28, 2026 19:44
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gmr gmr merged commit 3617865 into main Mar 28, 2026
3 checks passed
@gmr gmr deleted the feature/initial-library branch March 28, 2026 23:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant