Merged
1 change: 1 addition & 0 deletions packages/parse/.gitignore
@@ -0,0 +1 @@
dist/
82 changes: 82 additions & 0 deletions packages/parse/README.md
@@ -0,0 +1,82 @@
# pgsql-parse

<p align="center" width="100%">
<img height="250" src="https://raw.githubusercontent.com/constructive-io/constructive/refs/heads/main/assets/outline-logo.svg" />
</p>

A comment- and whitespace-preserving PostgreSQL parser. A drop-in enhancement for `pgsql-parser` that preserves SQL comments (`--` line and `/* */` block) and vertical whitespace (blank lines) through parse-deparse round trips.

## Installation

```sh
npm install pgsql-parse
```

## Features

* **Comment Preservation** -- Retains `--` line comments and `/* */` block comments through parse-deparse cycles
* **Vertical Whitespace** -- Preserves blank lines between statements for readable output
* **Idempotent Round-Trips** -- `parse -> deparse -> parse -> deparse` produces identical output
* **Drop-in API** -- Re-exports `parse`, `parseSync`, `deparse`, `deparseSync`, `loadModule` from `pgsql-parser`
* **Synthetic AST Nodes** -- `RawComment` and `RawWhitespace` nodes interleaved into the `stmts` array by byte position
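
To make "interleaved by byte position" concrete, here is a minimal sketch of a position-based merge. The `Entry` shape, its `kind` and `start` fields, and the `interleave` and `renderEntry` helpers are hypothetical and only illustrate the idea; they are not the package's actual types. Treating `lines` as a count of blank lines is also an assumption.

```typescript
// Hypothetical, simplified node shape (not the package's real types).
type Entry =
  | { kind: 'stmt'; start: number; sql: string }
  | { kind: 'comment'; start: number; text: string }
  | { kind: 'whitespace'; start: number; lines: number };

// Merge real statements and synthetic nodes into one list ordered by
// source position. Array.prototype.sort is stable, so entries with equal
// positions keep their relative input order.
function interleave(stmts: Entry[], synthetic: Entry[]): Entry[] {
  return [...stmts, ...synthetic].sort((a, b) => a.start - b.start);
}

// Dispatch on node kind: statements emit their SQL, comments emit their
// text, and whitespace emits enough newlines to yield `lines` blank lines
// once the entries are joined with '\n'.
function renderEntry(e: Entry): string {
  if (e.kind === 'stmt') return e.sql;
  if (e.kind === 'comment') return e.text;
  return '\n'.repeat(Math.max(e.lines - 1, 0));
}

const merged = interleave(
  [
    { kind: 'stmt', start: 10, sql: 'SELECT 1;' },
    { kind: 'stmt', start: 40, sql: 'SELECT 2;' },
  ],
  [
    { kind: 'comment', start: 0, text: '-- first' },
    { kind: 'whitespace', start: 20, lines: 1 },
    { kind: 'comment', start: 30, text: '-- second' },
  ],
);

console.log(merged.map(renderEntry).join('\n'));
```

The sketch does not cover comments that fall inside a statement; the `mid-statement-comments` fixture in this PR states that those are hoisted above their enclosing statement.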

## How It Works

1. A pure TypeScript scanner extracts comment and whitespace tokens with byte positions from the raw SQL text
2. Enhanced `parse`/`parseSync` call the standard `libpg-query` parser, then interleave synthetic `RawComment` and `RawWhitespace` nodes into the `stmts` array based on byte position
3. `deparseEnhanced()` dispatches on node type -- real `RawStmt` entries go through the standard deparser, while synthetic nodes emit their comment text or blank lines directly
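
Step 1 can be sketched as a small hand-written scanner. This is an illustration under stated assumptions, not the package's implementation: `CommentToken`, `scanComments`, and the helper names are hypothetical, whitespace tokens are left out, positions are JavaScript string indices (which equal byte offsets only for ASCII input), and `E'...'` backslash escapes are not handled.

```typescript
interface CommentToken {
  kind: 'line' | 'block';
  text: string;  // comment text including its delimiters
  start: number; // string index; equals the byte offset only for ASCII input
}

// Returns the index just past a quoted run that opens at `open`.
// A doubled quote character ('' or "") counts as an escaped quote.
function skipQuoted(sql: string, open: number): number {
  const quote = sql[open];
  let j = open + 1;
  while (j < sql.length) {
    if (sql[j] === quote) {
      if (sql[j + 1] === quote) {
        j += 2;
        continue;
      }
      return j + 1;
    }
    j++;
  }
  return sql.length; // unterminated: consume the rest
}

// Returns the index just past a block comment that opens at `open`.
// PostgreSQL block comments nest, so depth is tracked.
function skipBlockComment(sql: string, open: number): number {
  let depth = 1;
  let j = open + 2;
  while (j < sql.length && depth > 0) {
    const pair = sql.slice(j, j + 2);
    if (pair === '/*') { depth++; j += 2; }
    else if (pair === '*/') { depth--; j += 2; }
    else { j++; }
  }
  return j;
}

function scanComments(sql: string): CommentToken[] {
  const tokens: CommentToken[] = [];
  let i = 0;
  while (i < sql.length) {
    const ch = sql[i];
    const pair = sql.slice(i, i + 2);
    if (pair === '--') {
      const nl = sql.indexOf('\n', i);
      const end = nl === -1 ? sql.length : nl;
      tokens.push({ kind: 'line', text: sql.slice(i, end), start: i });
      i = end;
    } else if (pair === '/*') {
      const end = skipBlockComment(sql, i);
      tokens.push({ kind: 'block', text: sql.slice(i, end), start: i });
      i = end;
    } else if (ch === "'" || ch === '"') {
      // String literals and quoted identifiers may contain "--" or "/*".
      i = skipQuoted(sql, i);
    } else if (ch === '$') {
      // Dollar-quoted bodies ($$...$$ or $tag$...$tag$) are skipped whole.
      const tag = /^\$([A-Za-z_][A-Za-z_0-9]*)?\$/.exec(sql.slice(i));
      if (tag) {
        const close = sql.indexOf(tag[0], i + tag[0].length);
        i = close === -1 ? sql.length : close + tag[0].length;
      } else {
        i++;
      }
    } else {
      i++;
    }
  }
  return tokens;
}

console.log(scanComments("SELECT '-- not a comment' AS val; -- trailing note"));
```

Skipping quoted strings and dollar-quoted bodies is what keeps text such as `'-- not a comment'` or a comment inside a PL/pgSQL function body from being extracted, which is the behavior the `edge-cases` fixture in this PR exercises.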

## API

### Enhanced Parse

```typescript
import { parse, deparseEnhanced } from 'pgsql-parse';

// Async (handles initialization automatically)
const result = await parse(`
-- Create users table
CREATE TABLE users (id serial PRIMARY KEY);

-- Create posts table
CREATE TABLE posts (id serial PRIMARY KEY);
`);

// result.stmts contains RawComment, RawWhitespace, and RawStmt nodes
const sql = deparseEnhanced(result);
// Output preserves comments and blank lines
```
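
One way to check the idempotent round-trip property on your own SQL is a small helper like the one below. It is a sketch: `checkIdempotent` and `RoundTrip` are hypothetical names, and the round-trip function is passed in so the helper does not depend on any particular parser signature. The `trimTrailing` function is only a stand-in for demonstration.

```typescript
// A round trip takes SQL text and returns re-emitted SQL text.
type RoundTrip = (sql: string) => string;

// Runs the round trip twice and compares the first output to the second.
// Idempotence means the second pass changes nothing.
function checkIdempotent(
  roundTrip: RoundTrip,
  sql: string,
): { ok: boolean; first: string; second: string } {
  const first = roundTrip(sql);
  const second = roundTrip(first);
  return { ok: first === second, first, second };
}

// Stand-in round trip for demonstration only: strips trailing spaces and
// tabs from every line.
const trimTrailing: RoundTrip = (sql) =>
  sql
    .split('\n')
    .map((line) => line.replace(/[ \t]+$/, ''))
    .join('\n');

console.log(checkIdempotent(trimTrailing, 'SELECT 1;   \n-- note  ').ok); // true
```

With this package, and assuming the signatures shown in the examples here, the injected function would be `(sql) => deparseEnhanced(parseSync(sql))`, called after `await loadModule()`.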

### Sync Methods

```typescript
import { parseSync, deparseEnhanced, loadModule } from 'pgsql-parse';

// Initialize the parser once before calling the sync methods
await loadModule();

const result = parseSync('-- comment\nSELECT 1;');
const sql = deparseEnhanced(result);
```

### Type Guards

```typescript
import { parse, isRawComment, isRawWhitespace, isRawStmt } from 'pgsql-parse';

const result = await parse('-- comment\n\nSELECT 1;');

for (const stmt of result.stmts) {
  if (isRawComment(stmt)) {
    console.log('Comment:', stmt.RawComment.text);
  } else if (isRawWhitespace(stmt)) {
    console.log('Blank lines:', stmt.RawWhitespace.lines);
  } else if (isRawStmt(stmt)) {
    console.log('Statement:', stmt);
  }
}
```
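
For readers curious how such guards can narrow a mixed `stmts` array, here is a self-contained sketch. The `CommentNode` and `WhitespaceNode` shapes follow the `RawComment.text` and `RawWhitespace.lines` fields used above; the guard names `isCommentNode` and `isWhitespaceNode` and the `summarize` helper are hypothetical local stand-ins, not the package's exports, and any node that is neither is simply counted as a statement.

```typescript
// Shapes inferred from the fields used in the example above.
type CommentNode = { RawComment: { text: string } };
type WhitespaceNode = { RawWhitespace: { lines: number } };

// Local stand-in guards: each checks for the wrapper key of its node type.
function isCommentNode(node: object): node is CommentNode {
  return 'RawComment' in node;
}

function isWhitespaceNode(node: object): node is WhitespaceNode {
  return 'RawWhitespace' in node;
}

// Tallies a mixed list: comment nodes, the sum of preserved blank lines,
// and everything else (treated as real statements).
function summarize(stmts: object[]): {
  comments: number;
  blankLines: number;
  statements: number;
} {
  let comments = 0;
  let blankLines = 0;
  let statements = 0;
  for (const node of stmts) {
    if (isCommentNode(node)) {
      comments++;
    } else if (isWhitespaceNode(node)) {
      blankLines += node.RawWhitespace.lines;
    } else {
      statements++;
    }
  }
  return { comments, blankLines, statements };
}

console.log(
  summarize([
    { RawComment: { text: '-- comment' } },
    { RawWhitespace: { lines: 2 } },
    { stmt: {} },
  ]),
);
```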

## Credits

Built on the excellent work of several contributors:

* **[Dan Lynch](https://github.com/pyramation)** -- official maintainer since 2018 and architect of the current implementation
* **[Lukas Fittl](https://github.com/lfittl)** for [libpg_query](https://github.com/pganalyze/libpg_query) -- the core PostgreSQL parser that powers this project
223 changes: 223 additions & 0 deletions packages/parse/__tests__/__snapshots__/roundtrip.test.ts.snap
@@ -0,0 +1,223 @@
// Jest Snapshot v1, https://jestjs.io/docs/snapshot-testing

exports[`fixture round-trip (CST) alter-and-drop deparsed output matches snapshot 1`] = `
"-- Add columns to existing table
ALTER TABLE app.users
ADD COLUMN bio text;
ALTER TABLE app.users
ADD COLUMN avatar_url text;

-- Rename a column
ALTER TABLE app.users RENAME COLUMN username TO display_name;

-- Drop unused objects
DROP INDEX IF EXISTS app.idx_old_index;

-- Recreate with new definition
CREATE INDEX idx_users_display_name ON app.users (display_name);"
`;

exports[`fixture round-trip (CST) edge-cases deparsed output matches snapshot 1`] = `
"-- Comments with special characters: don't break "parsing"
SELECT 1;

-- Inline comment after statement
SELECT 2; -- trailing note

-- Adjacent comments with no blank line
-- first line
-- second line
SELECT 3;

-- Dollar-quoted body with internal comments (should NOT be extracted)
CREATE FUNCTION app.noop() RETURNS void AS $$
BEGIN
-- this comment is inside the function body
NULL;
END;
$$ LANGUAGE plpgsql;

-- String that looks like a comment
SELECT '-- not a comment' AS val;

-- Empty statement list edge
SELECT 4;"
`;

exports[`fixture round-trip (CST) grants-and-policies deparsed output matches snapshot 1`] = `
"-- RLS policies for the users table
ALTER TABLE app.users
ENABLE ROW LEVEL SECURITY;

-- Admins can see all rows
CREATE POLICY admin_all
ON app.users
AS PERMISSIVE
FOR ALL
TO admin_role
USING (
true
);

-- Users can only see their own row
CREATE POLICY own_row
ON app.users
AS PERMISSIVE
FOR SELECT
TO authenticated
USING (
id = (current_setting('app.current_user_id'))::int
);

-- Grant basic access
GRANT USAGE ON SCHEMA app TO authenticated;
GRANT SELECT ON app.users TO authenticated;
GRANT ALL ON app.users TO admin_role;"
`;

exports[`fixture round-trip (CST) mid-statement-comments deparsed output matches snapshot 1`] = `
"-- Mid-statement comments are hoisted above their enclosing statement.
-- The deparser cannot inject comments back into the middle of a
-- statement, so they are preserved as standalone lines above it.

-- Simple mid-statement comment
-- the primary key
SELECT
id,
name
FROM users;

-- Multiple mid-statement comments in one query
-- user ID
-- display name
-- role from join
SELECT
u.id,
u.name,
r.role_name
FROM users AS u
JOIN roles AS r ON r.id = u.role_id;

-- Mid-statement comment in INSERT values
-- log level
-- log body
INSERT INTO logs (
level,
message
) VALUES
('info', 'hello');

-- Comment between clauses
-- filter active only
SELECT id
FROM users
WHERE
active = true;"
`;

exports[`fixture round-trip (CST) multi-statement deparsed output matches snapshot 1`] = `
"-- Schema setup
CREATE SCHEMA IF NOT EXISTS app;

-- Users table
CREATE TABLE app.users (
id serial PRIMARY KEY,
username text NOT NULL,
created_at timestamptz DEFAULT now()
);

-- Roles table
CREATE TABLE app.roles (
id serial PRIMARY KEY,
name text UNIQUE NOT NULL
);

-- Junction table
CREATE TABLE app.user_roles (
user_id int REFERENCES app.users (id),
role_id int REFERENCES app.roles (id),
PRIMARY KEY (user_id, role_id)
);

-- Seed default roles
INSERT INTO app.roles (
name
) VALUES
('admin'),
('viewer');"
`;

exports[`fixture round-trip (CST) pgpm-header deparsed output matches snapshot 1`] = `
"-- Deploy schemas/my-app/tables/users to pg
-- requires: schemas/my-app/schema

BEGIN;

-- Create the main users table
CREATE TABLE my_app.users (
id serial PRIMARY KEY,
name text NOT NULL,
email text UNIQUE
);

-- Add an index for fast lookups
CREATE INDEX idx_users_email ON my_app.users (email);

COMMIT;"
`;

exports[`fixture round-trip (CST) plpgsql-function deparsed output matches snapshot 1`] = `
"-- Deploy schemas/app/functions/get_user to pg
-- requires: schemas/app/tables/users

BEGIN;

-- Function to get a user by ID
CREATE FUNCTION app.get_user(
p_id int
) RETURNS TABLE (
id int,
username text,
created_at timestamptz
) AS $$
BEGIN
-- Return the matching user
RETURN QUERY
SELECT u.id, u.username, u.created_at
FROM app.users u
WHERE u.id = p_id;
END;
$$ LANGUAGE plpgsql STABLE;

-- Grant execute to authenticated users
GRANT EXECUTE ON FUNCTION app.get_user(int) TO authenticated;

COMMIT;"
`;

exports[`fixture round-trip (CST) views-and-triggers deparsed output matches snapshot 1`] = `
"-- Active users view
CREATE VIEW app.active_users AS SELECT
id,
username,
created_at
FROM app.users
WHERE
created_at > (now() - '90 days'::interval);

-- Audit trigger function
CREATE FUNCTION app.audit_trigger() RETURNS trigger AS $$
BEGIN
INSERT INTO app.audit_log (table_name, action, row_id)
VALUES (TG_TABLE_NAME, TG_OP, NEW.id);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Attach trigger to users table
CREATE TRIGGER users_audit
AFTER INSERT OR UPDATE
ON app.users
FOR EACH ROW
EXECUTE PROCEDURE app.audit_trigger();"
`;
12 changes: 12 additions & 0 deletions packages/parse/__tests__/fixtures/alter-and-drop.sql
@@ -0,0 +1,12 @@
-- Add columns to existing table
ALTER TABLE app.users ADD COLUMN bio text;
ALTER TABLE app.users ADD COLUMN avatar_url text;

-- Rename a column
ALTER TABLE app.users RENAME COLUMN username TO display_name;

-- Drop unused objects
DROP INDEX IF EXISTS app.idx_old_index;

-- Recreate with new definition
CREATE INDEX idx_users_display_name ON app.users (display_name);
24 changes: 24 additions & 0 deletions packages/parse/__tests__/fixtures/edge-cases.sql
@@ -0,0 +1,24 @@
-- Comments with special characters: don't break "parsing"
SELECT 1;

-- Inline comment after statement
SELECT 2; -- trailing note

-- Adjacent comments with no blank line
-- first line
-- second line
SELECT 3;

-- Dollar-quoted body with internal comments (should NOT be extracted)
CREATE FUNCTION app.noop() RETURNS void AS $$
BEGIN
-- this comment is inside the function body
NULL;
END;
$$ LANGUAGE plpgsql;

-- String that looks like a comment
SELECT '-- not a comment' AS val;

-- Empty statement list edge
SELECT 4;
19 changes: 19 additions & 0 deletions packages/parse/__tests__/fixtures/grants-and-policies.sql
@@ -0,0 +1,19 @@
-- RLS policies for the users table
ALTER TABLE app.users ENABLE ROW LEVEL SECURITY;

-- Admins can see all rows
CREATE POLICY admin_all ON app.users
FOR ALL
TO admin_role
USING (true);

-- Users can only see their own row
CREATE POLICY own_row ON app.users
FOR SELECT
TO authenticated
USING (id = current_setting('app.current_user_id')::integer);

-- Grant basic access
GRANT USAGE ON SCHEMA app TO authenticated;
GRANT SELECT ON app.users TO authenticated;
GRANT ALL ON app.users TO admin_role;
30 changes: 30 additions & 0 deletions packages/parse/__tests__/fixtures/mid-statement-comments.sql
@@ -0,0 +1,30 @@
-- Mid-statement comments are hoisted above their enclosing statement.
-- The deparser cannot inject comments back into the middle of a
-- statement, so they are preserved as standalone lines above it.

-- Simple mid-statement comment
SELECT
id, -- the primary key
name
FROM users;

-- Multiple mid-statement comments in one query
SELECT
u.id, -- user ID
u.name, -- display name
r.role_name -- role from join
FROM users u
JOIN roles r ON r.id = u.role_id;

-- Mid-statement comment in INSERT values
INSERT INTO logs (level, message)
VALUES (
'info', -- log level
'hello' -- log body
);

-- Comment between clauses
SELECT id
FROM users
-- filter active only
WHERE active = true;