Conversation
- Hourly upstream sync from postgres/postgres (24x daily)
- AI-powered PR reviews using AWS Bedrock Claude Sonnet 4.5
- Multi-platform CI via existing Cirrus CI configuration
- Cost tracking and comprehensive documentation

Features:
- Automatic issue creation on sync conflicts
- PostgreSQL-specific code review prompts (C, SQL, docs, build)
- Cost limits: $15/PR, $200/month
- Inline PR comments with security/performance labels
- Skip draft PRs to save costs

Documentation:
- .github/SETUP_SUMMARY.md - Quick setup overview
- .github/QUICKSTART.md - 15-minute setup guide
- .github/PRE_COMMIT_CHECKLIST.md - Verification checklist
- .github/docs/ - Detailed guides for sync, AI review, Bedrock

See .github/README.md for complete overview

Complete Phase 3: Windows builds + fix sync for CI/CD commits

Phase 3: Windows Dependency Build System
- Implement full build workflow (OpenSSL, zlib, libxml2)
- Smart caching by version hash (80% cost reduction)
- Dependency bundling with manifest generation
- Weekly auto-refresh + manual triggers
- PowerShell download helper script
- Comprehensive usage documentation

Sync Workflow Fix:
- Allow .github/ commits (CI/CD config) on master
- Detect and reject code commits outside .github/
- Merge upstream while preserving .github/ changes
- Create issues only for actual pristine violations

Documentation:
- Complete Windows build usage guide
- Update all status docs to 100% complete
- Phase 3 completion summary

All three CI/CD phases complete (100%):
✅ Hourly upstream sync with .github/ preservation
✅ AI-powered PR reviews via Bedrock Claude 4.5
✅ Windows dependency builds with smart caching

Cost: $40-60/month total

See .github/PHASE3_COMPLETE.md for details

Fix sync to allow 'dev setup' commits on master

The sync workflow was failing because the 'dev setup v19' commit modifies files outside .github/. Updated workflows to recognize commits with messages starting with 'dev setup' as allowed on master.
Changes:
- Detect 'dev setup' commits by message pattern (case-insensitive)
- Allow merge if commits are .github/ OR dev setup OR both
- Update merge messages to reflect preserved changes
- Document pristine master policy with examples

This allows personal development environment commits (IDE configs, debugging tools, shell aliases, Nix configs, etc.) on master without violating the pristine mirror policy. Future dev environment updates should start with 'dev setup' in the commit message to be automatically recognized and preserved.

See .github/docs/pristine-master-policy.md for complete policy
See .github/DEV_SETUP_FIX.md for fix summary

Optimize CI/CD costs by skipping builds for pristine commits

Add cost optimization to Windows dependency builds to avoid expensive builds when only pristine commits are pushed (dev setup commits or .github/ configuration changes).

Changes:
- Add check-changes job to detect pristine-only pushes
- Skip Windows builds when all commits are dev setup or .github/ only
- Add comprehensive cost optimization documentation
- Update README with cost savings (~40% reduction)

Expected savings: ~$3-5/month on Windows builds, ~$40-47/month total through combined optimizations. Manual dispatch and scheduled builds always run regardless.
This commit adds the core UNDO logging system for PostgreSQL, implementing ZHeap-inspired physical UNDO with Compensation Log Records (CLRs) for crash-safe transaction rollback and standby replication support.

Key features:
- Physical UNDO application using memcpy() for direct page modification
- CLR (Compensation Log Record) generation during transaction rollback
- Shared buffer integration (UNDO pages use standard buffer pool)
- UndoRecordSet architecture with chunk-based organization
- UNDO worker for automatic cleanup of old records
- Per-persistence-level record sets (permanent/unlogged/temp)

Architecture:
- UNDO logs stored in $PGDATA/base/undo/ with 64-bit UndoRecPtr
- 40-bit offset (1TB per log) + 24-bit log number (16M logs)
- Integrated with PostgreSQL's shared_buffers (no separate cache)
- WAL-logged CLRs ensure crash safety and standby replay
Extends UNDO by adding a per-relation model that can record logical operations for the purposes of recovery or in support of MVCC visibility tracking. Unlike cluster-wide UNDO (which stores complete tuple data globally), per-relation UNDO stores logical operation metadata in a relation-specific UNDO fork.

Architecture:
- Separate UNDO fork per relation (relfilenode.undo)
- Metapage (block 0) tracks head/tail/free chain pointers
- Data pages contain UNDO records with operation metadata
- WAL resource manager (RM_RELUNDO_ID) for crash recovery
- Two-phase protocol: RelUndoReserve() / RelUndoFinish() / RelUndoCancel()

Record types:
- RELUNDO_INSERT: Tracks inserted TID range
- RELUNDO_DELETE: Tracks deleted TID
- RELUNDO_UPDATE: Tracks old/new TID pair
- RELUNDO_TUPLE_LOCK: Tracks tuple lock acquisition
- RELUNDO_DELTA_INSERT: Tracks columnar delta insertion

Table AM integration:
- relation_init_undo: Create UNDO fork during CREATE TABLE
- tuple_satisfies_snapshot_undo: MVCC visibility via UNDO chain
- relation_vacuum_undo: Discard old UNDO records during VACUUM

This complements cluster-wide UNDO by providing table-AM-specific UNDO management without global coordination overhead.
Implements a minimal table access method that exercises the per-relation UNDO subsystem. Validates end-to-end functionality: UNDO fork creation, record insertion, chain walking, and crash recovery.

Implemented operations:
- INSERT: Full implementation with UNDO record creation
- Sequential scan: Forward-only table scan
- CREATE/DROP TABLE: UNDO fork lifecycle management
- VACUUM: UNDO record discard

This test AM stores tuples in simple heap-like pages using a custom TestUndoTamTupleHeader (t_len, t_xmin, t_self) followed by MinimalTuple data. Pages use standard PageHeaderData and PageAddItem().

Two-phase UNDO protocol demonstration:
1. Insert tuple onto data page (PageAddItem)
2. Reserve UNDO space (RelUndoReserve)
3. Build UNDO record (header + payload)
4. Commit UNDO record (RelUndoFinish)
5. Register for rollback (RegisterPerRelUndo)

Introspection:
- test_undo_tam_dump_chain(regclass): Walk UNDO fork, return all records

Testing:
- sql/undo_tam.sql: Basic INSERT/scan operations
- t/058_undo_tam_crash.pl: Crash recovery validation

This test module is NOT suitable for production use. It serves only to validate the per-relation UNDO infrastructure and demonstrate table AM integration patterns.
Extends per-relation UNDO from metadata-only (MVCC visibility) to supporting transaction rollback. When a transaction aborts, per-relation UNDO chains are applied asynchronously by background workers.

Architecture:
- Async-only rollback via background worker pool
- Work queue protected by RelUndoWorkQueueLock
- Catalog access safe in worker (proper transaction state)
- Test helper (RelUndoProcessPendingSync) for deterministic testing

Extended data structures:
- RelUndoRecordHeader gains info_flags and tuple_len
- RELUNDO_INFO_HAS_TUPLE flag indicates tuple data present
- RELUNDO_INFO_HAS_CLR / CLR_APPLIED for crash safety

Rollback operations:
- RELUNDO_INSERT: Mark inserted tuples as LP_UNUSED
- RELUNDO_DELETE: Restore deleted tuple via memcpy (stored in UNDO)
- RELUNDO_UPDATE: Restore old tuple version (stored in UNDO)
- RELUNDO_TUPLE_LOCK: Remove lock marker
- RELUNDO_DELTA_INSERT: Restore original column data

Transaction integration:
- RegisterPerRelUndo: Track relation UNDO chains per transaction
- GetPerRelUndoPtr: Chain UNDO records within relation
- ApplyPerRelUndo: Queue work for background workers on abort
- StartRelUndoWorker: Spawn worker if none running

Async rationale: Per-relation UNDO cannot apply synchronously during ROLLBACK because catalog access (relation_open) is not allowed during TRANS_ABORT state. Background workers execute in proper transaction context, avoiding the constraint. This matches the ZHeap architecture, where UNDO application is deferred to background processes.

WAL:
- XLOG_RELUNDO_APPLY: Compensation log records (CLRs) for applied UNDO
- Prevents double-application after crash recovery

Testing:
- sql/undo_tam_rollback.sql: Validates INSERT rollback
- test_undo_tam_process_pending(): Drain work queue synchronously
Implements production-ready WAL features for the per-relation UNDO
resource manager: async I/O, consistency checking, parallel redo,
and compression validation.
Async I/O optimization:
When INSERT records reference both data page (block 0) and metapage
(block 1), issue prefetch for block 1 before reading block 0. This
allows both I/Os to proceed in parallel, reducing crash recovery stall
time. Uses pgaio batch mode when io_method is worker or io_uring.
Pattern:

    if (has_metapage && io_method != IOMETHOD_SYNC)
        pgaio_enter_batchmode();
    relundo_prefetch_block(record, 1);   /* start async read */
    process_block_0();                   /* overlaps with metapage I/O */
    process_block_1();                   /* should already be in cache */
    if (has_metapage && io_method != IOMETHOD_SYNC)
        pgaio_exit_batchmode();
Consistency checking:
All redo functions validate WAL record fields before application:
- Bounds checks: offsets < BLCKSZ, counters within range
- Monotonicity: counters advance, pd_lower increases
- Cross-field validation: record fits within page
- Type validation: record types in valid range
- Post-condition checks: updated values are reasonable
Parallel redo support:
Implements startup/cleanup/mask callbacks required for multi-core
crash recovery:
- relundo_startup: Initialize per-backend state
- relundo_cleanup: Release per-backend resources
- relundo_mask: Mask LSN, checksum, free space for page comparison
Page dependency rules:
- Different pages replay in parallel (no ordering constraints)
- Same page: INIT precedes INSERT (enforced by page LSN)
- Metapage updates are sequential (buffer lock serialization)
Compression validation:
WAL compression (wal_compression GUC) automatically compresses full
page images via XLogCompressBackupBlock(). Test validates 40-46%
reduction for RELUNDO FPIs with lz4, pglz, and zstd algorithms.
Test: t/059_relundo_wal_compression.pl measures WAL volume with/without
compression for identical workloads.
This commit adds the FILEOPS subsystem, providing transactional file operations with WAL logging and crash recovery support. FILEOPS is independent of the UNDO logging system and can be used standalone.

Key features:
- Transactional file operations (create, delete, rename, truncate)
- WAL logging for crash recovery and standby replication
- Automatic cleanup of failed operations
- Integration with PostgreSQL's resource manager system

File operations:
- FileOpsCreate(path): Create file transactionally
- FileOpsDelete(path): Delete file transactionally
- FileOpsRename(oldpath, newpath): Rename file transactionally
- FileOpsTruncate(path, size): Truncate file transactionally

All operations are WAL-logged with XLOG_FILEOPS_* record types and replayed correctly during recovery and on standby servers.

Use cases:
- Transactional log file management
- UNDO log file operations
- Any subsystem needing crash-safe file operations
Adds opt-in UNDO support to the standard heap table access method.
When enabled, heap operations write UNDO records to enable physical
rollback without scanning the heap, and support UNDO-based MVCC
visibility determination.
How heap uses UNDO:
INSERT operations:
- Before inserting tuple, call PrepareXactUndoData() to reserve UNDO space
- Write UNDO record with: transaction ID, tuple TID, old tuple data (null for INSERT)
- On abort: UndoReplay() marks tuple as LP_UNUSED without heap scan
UPDATE operations:
- Write UNDO record with complete old tuple version before update
- On abort: UndoReplay() restores old tuple version from UNDO
DELETE operations:
- Write UNDO record with complete deleted tuple data
- On abort: UndoReplay() resurrects tuple from UNDO record
MVCC visibility:
- Tuples reference UNDO chain via xmin/xmax
- HeapTupleSatisfiesSnapshot() can walk UNDO chain for older versions
- Enables reconstructing tuple state as of any snapshot
Configuration:
CREATE TABLE t (...) WITH (enable_undo=on);
The enable_undo storage parameter is per-table and defaults to off for
backward compatibility. When disabled, heap behaves exactly as before.
Value proposition:
1. Faster rollback: No heap scan required, UNDO chains are sequential
- Traditional abort: Full heap scan to mark tuples invalid (O(n) random I/O)
- UNDO abort: Sequential UNDO log scan (O(n) sequential I/O, better cache locality)
2. Cleaner abort handling: UNDO records are self-contained
- No need to track which heap pages were modified
- Works across crashes (UNDO is WAL-logged)
3. Foundation for future features:
- Multi-version concurrency control without bloat
- Faster VACUUM (can discard entire UNDO segments)
- Point-in-time recovery improvements
Trade-offs:
Costs:
- Additional writes: Every DML writes both heap + UNDO (roughly 2x write amplification)
- UNDO log space: Requires space for UNDO records until no longer visible
- Complexity: New GUCs (undo_retention, max_undo_workers), monitoring needed
Benefits:
- Primarily valuable for workloads with:
- Frequent aborts (e.g., speculative execution, deadlocks)
- Long-running transactions needing old snapshots
- Hot UPDATE workloads benefiting from cleaner rollback
Not recommended for:
- Bulk load workloads (COPY: 2x write amplification without abort benefit)
- Append-only tables (rare aborts mean cost without benefit)
- Space-constrained systems (UNDO retention increases storage)
When beneficial:
- OLTP with high abort rates (>5%)
- Systems with aggressive pruning needs (frequent VACUUM)
- Workloads requiring historical visibility (audit, time-travel queries)
Integration points:
- heap_insert/update/delete call PrepareXactUndoData/InsertXactUndoData
- Heap pruning respects undo_retention to avoid discarding needed UNDO
- pg_upgrade compatibility: UNDO disabled for upgraded tables
Background workers:
- Cluster-wide UNDO has async workers for cleanup/discard of old UNDO records
- Rollback itself is synchronous (via UndoReplay() during transaction abort)
- Workers periodically trim UNDO logs based on undo_retention and snapshot visibility
This demonstrates cluster-wide UNDO in production use. Note that this
differs from per-relation logical UNDO (added in subsequent patches),
which uses per-table UNDO forks and async rollback via background
workers.
This commit provides examples and architectural documentation for the UNDO subsystems. It is intended for reviewers and committers to understand the design decisions and usage patterns.

Contents:
- 01-basic-undo-setup.sql: Cluster-wide UNDO basics
- 02-undo-rollback.sql: Rollback demonstrations
- 03-undo-subtransactions.sql: Subtransaction handling
- 04-transactional-fileops.sql: FILEOPS usage
- 05-undo-monitoring.sql: Monitoring and statistics
- 06-per-relation-undo.sql: Per-relation UNDO with test_undo_tam
- DESIGN_NOTES.md: Comprehensive architecture documentation
- README.md: Examples overview

This commit should NOT be merged. It exists only to provide context and documentation for the patch series.