Add document assembly mode, configuration save/load, and unit tests#2

Draft
bencarver wants to merge 21 commits into jamietso:main from bencarver:main

@bencarver

Summary

This PR adds three major features to Signature Packet IDE:

  1. Document Assembly Mode — Match executed/signed PDFs to blank signature pages and assemble final documents
  2. Configuration Save/Load — Save entire sessions (extracted pages, edits, matches) with bundled PDFs for instant restore
  3. Unit Tests — 13 Vitest tests covering the matching algorithm

Features

Document Assembly

  • Executed Page Matching: Upload signed PDFs; AI identifies which signature pages they correspond to
  • Auto-Match: Matches executed pages to blank pages by document name, party, and signatory, scored with Jaccard similarity
  • Assembly Progress Grid: Visual checklist organized by signatory (columns) × document (rows)
  • Manual Override: Click any cell to manually assign or reassign pages
  • Assemble & Download: Generate final PDFs with blank pages swapped for executed counterparts
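
The Jaccard-based scoring mentioned under Auto-Match can be illustrated with a small token-set comparison. This is a minimal sketch, not the code in matchingService.ts: the `tokenize` and `jaccard` names and the tokenization rule (lowercase, split on non-alphanumerics) are assumptions chosen to show why case and whitespace differences do not affect the score.

```typescript
// Hypothetical helper: normalize a label into a set of lowercase tokens.
function tokenize(text: string): Set<string> {
  const tokens = text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
  return new Set(tokens);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B| over the two token sets.
// Two empty inputs are treated as "no evidence" and score 0 (an assumption).
function jaccard(a: string, b: string): number {
  const ta = tokenize(a);
  const tb = tokenize(b);
  let intersection = 0;
  ta.forEach((t) => {
    if (tb.has(t)) intersection++;
  });
  const union = ta.size + tb.size - intersection;
  return union === 0 ? 0 : intersection / union;
}
```

With this sketch, `jaccard("Credit Agreement", "credit   agreement")` is 1, while `jaccard("Credit Agreement", "Security Agreement")` is 1/3 because only one of three distinct tokens is shared.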

Configuration Save/Load

  • Save Session: Export entire session (extracted pages, user edits, assembly matches) as .json
  • Bundled PDFs: Original PDF files embedded as base64 — no re-upload needed on restore
  • Instant Restore: Load a saved config and continue working immediately
  • Backward Compatible: Old configs without bundled PDFs still load (with "restored" status, prompting re-upload if needed)
  • Privacy-First: All processing done in-browser
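
As a rough illustration of bundled PDFs and the backward-compatible load path, the sketch below round-trips PDF bytes through base64 and falls back to a "restored" status when a saved document has no embedded file. The `SavedDocument` and `SavedConfig` shapes, the `pdfBase64` field, and the "ready" status are hypothetical; only the "restored" status comes from the description above. It assumes the `btoa`/`atob` globals available in browsers and recent Node versions.

```typescript
// Hypothetical saved shapes; the real SavedConfiguration type lives in types.ts.
interface SavedDocument {
  id: string;
  name: string;
  pdfBase64?: string; // absent in older configs saved without bundled PDFs
}
interface SavedConfig {
  version: number;
  documents: SavedDocument[];
}
interface RestoredDocument {
  id: string;
  name: string;
  status: "ready" | "restored"; // "ready" is an assumed name
  bytes: Uint8Array | null;
}

// Encode raw bytes as base64 so they can be embedded in a JSON file.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

// Decode base64 back into the original bytes.
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Documents with an embedded PDF are usable immediately; the rest are
// marked "restored" so the UI can prompt for a re-upload.
function restoreDocuments(config: SavedConfig): RestoredDocument[] {
  return config.documents.map((doc) => ({
    id: doc.id,
    name: doc.name,
    status: doc.pdfBase64 ? ("ready" as const) : ("restored" as const),
    bytes: doc.pdfBase64 ? base64ToBytes(doc.pdfBase64) : null,
  }));
}
```

Base64 inflates the payload by roughly a third, so a design like this trades file size for a single self-contained .json file.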

Testing

  • Vitest Integration: Zero-config test runner compatible with Vite
  • 13 Unit Tests: Cover the matching algorithm:
    • Exact matches, fuzzy matching (case/whitespace), containment matching
    • Threshold filtering, greedy uniqueness (each page matched once)
    • User-confirmed/user-overridden match preservation
    • Empty input handling

Changes

  • Added Assembly tab in toolbar for assembly mode
  • Added Save Config / Load Config buttons in header
  • New services: matchingService.ts (with matchingService.test.ts)
  • Updated App.tsx with assembly handlers, config save/load logic
  • Updated types.ts with assembly-related types (ExecutedUpload, AssemblyMatch, SavedConfiguration)
  • New components: CompletionChecklist.tsx, ExecutedPageCard.tsx, MatchPickerModal.tsx
  • Updated README.md with Assembly Mode and Save & Restore documentation
  • Added .env to .gitignore to prevent API key leaks
  • Fixed environment variable references (API_KEY → GEMINI_API_KEY)
  • Fixed npm run command in docs (npm start → npm run dev)

Testing

All 13 unit tests pass.

Tested locally with:

  • Multiple agreement PDFs
  • Executed/signed PDFs with various signature block formats
  • Save/load with bundled PDFs
  • Assembly and document generation

Notes for Reviewers

  • The matching algorithm uses a weighted score (50% doc name, 35% party, 15% signatory) with a 0.4 threshold
  • User-confirmed matches are preserved across re-scans
  • The save feature includes full assembly state (matches, executed uploads) for complete session restore
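
For reviewers, the weighting and greedy assignment could be sketched as follows. The 50/35/15 weights and the 0.4 threshold come from the note above; everything else (the `PageInfo` shape, the function names, the token-based similarity, and treating the threshold as inclusive) is an assumption for illustration and may differ from matchingService.ts.

```typescript
// Hypothetical page shape; the real types are defined in types.ts.
interface PageInfo {
  id: string;
  documentName: string;
  partyName: string;
  signatoryName: string;
}
interface CandidateMatch {
  blankId: string;
  executedId: string;
  score: number;
}

const WEIGHTS = { documentName: 0.5, partyName: 0.35, signatoryName: 0.15 };
const THRESHOLD = 0.4; // assumed inclusive

// Token-set Jaccard similarity on lowercase alphanumeric tokens.
function tokenJaccard(a: string, b: string): number {
  const split = (s: string) =>
    new Set(s.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0));
  const ta = split(a);
  const tb = split(b);
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Weighted sum of the three field similarities.
function scorePair(blank: PageInfo, executed: PageInfo): number {
  return (
    WEIGHTS.documentName * tokenJaccard(blank.documentName, executed.documentName) +
    WEIGHTS.partyName * tokenJaccard(blank.partyName, executed.partyName) +
    WEIGHTS.signatoryName * tokenJaccard(blank.signatoryName, executed.signatoryName)
  );
}

// Greedy assignment: take the best-scoring pairs first, and use each
// blank page and each executed page at most once.
function greedyMatch(blanks: PageInfo[], executed: PageInfo[]): CandidateMatch[] {
  const candidates: CandidateMatch[] = [];
  for (const b of blanks) {
    for (const e of executed) {
      const score = scorePair(b, e);
      if (score >= THRESHOLD) candidates.push({ blankId: b.id, executedId: e.id, score });
    }
  }
  candidates.sort((x, y) => y.score - x.score);
  const usedBlanks = new Set<string>();
  const usedExecuted = new Set<string>();
  const matches: CandidateMatch[] = [];
  for (const c of candidates) {
    if (usedBlanks.has(c.blankId) || usedExecuted.has(c.executedId)) continue;
    usedBlanks.add(c.blankId);
    usedExecuted.add(c.executedId);
    matches.push(c);
  }
  return matches;
}
```

A greedy pass like this is simple and predictable but not globally optimal; an executed page claimed by a high-scoring pair is unavailable to a lower-scoring blank page even when that leaves the latter unmatched, which is where the manual override helps.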

Ben Carver and others added 21 commits February 20, 2026 08:17
- Add SavedConfiguration interface to serialize extraction state as JSON
- Implement handleSaveConfiguration to export signature pages, edits, and grouping mode
- Implement handleLoadConfiguration to restore previously saved configs
- Add smart rescan+merge: when restored PDFs are re-uploaded, run fresh AI extraction while preserving user edits (party name, signatory, capacity, copies)
- Use wasRestored flag and useEffect to route restored docs through merge-rescan flow
- Add "Needs file" UI badge for restored documents awaiting re-upload
- Improve Gemini prompt to correctly extract individual signatories from role-labeled blocks
- Add Save/Load Config buttons to header
- Fix duplicate file upload in React StrictMode by moving state computation outside setter

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
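
The rescan-and-merge behavior described in this commit might look roughly like the sketch below: fresh AI extraction results are kept, but the user-editable fields (party name, signatory, capacity, copies) are carried over from the saved pages. The `ExtractedPage` shape, the `mergeRescan` name, and pairing pages by page index are assumptions; the commit does not say how saved and fresh pages are paired.

```typescript
// Hypothetical page shape limited to the fields named in the commit message.
interface ExtractedPage {
  pageIndex: number;
  partyName: string;
  signatoryName: string;
  capacity: string;
  copies: number;
  thumbnail?: string; // regenerated on rescan
}

// Keep everything from the fresh scan, then overlay the user's saved edits
// for any page that also existed in the saved configuration.
function mergeRescan(fresh: ExtractedPage[], saved: ExtractedPage[]): ExtractedPage[] {
  const savedByIndex = new Map<number, ExtractedPage>();
  for (const page of saved) savedByIndex.set(page.pageIndex, page);

  return fresh.map((page) => {
    const prior = savedByIndex.get(page.pageIndex);
    if (!prior) return page; // newly detected page: nothing to preserve
    return {
      ...page,
      partyName: prior.partyName,
      signatoryName: prior.signatoryName,
      capacity: prior.capacity,
      copies: prior.copies,
    };
  });
}
```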
- Add Assembly mode: upload executed/signed pages, AI analysis per page
- Auto-match executed pages to blank signature pages via weighted similarity
- Completion checklist matrix (documents x parties) with manual override
- PDF assembly: replace blank sig pages with executed versions at exact page positions
- Improve Gemini prompts for LP/fund/trust multi-level entity chains (LP -> GP -> Individual)
- Add retry logic when partyName or signatoryName is empty after first AI pass
- Add debug console logging for extraction diagnosis

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
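
Replacing blank signature pages "at exact page positions" can be thought of as building a page plan before any PDF bytes are copied. The sketch below covers only that planning step and is independent of any PDF library; the `Replacement` and `PlannedPage` shapes and the `buildPagePlan` name are hypothetical.

```typescript
// Hypothetical description of one confirmed match for a document.
interface Replacement {
  blankPageIndex: number; // zero-based position of the blank page in the original
  executedUploadId: string;
  executedPageIndex: number; // zero-based page within the executed upload
}

// Each output page comes either from the original or from an executed upload.
type PlannedPage =
  | { source: "original"; pageIndex: number }
  | { source: "executed"; uploadId: string; pageIndex: number };

// Walk the original page order and substitute executed pages in place,
// so the assembled document keeps the same page count and ordering.
function buildPagePlan(pageCount: number, replacements: Replacement[]): PlannedPage[] {
  const byPosition = new Map<number, Replacement>();
  for (const r of replacements) byPosition.set(r.blankPageIndex, r);

  const plan: PlannedPage[] = [];
  for (let i = 0; i < pageCount; i++) {
    const r = byPosition.get(i);
    if (r) {
      plan.push({ source: "executed", uploadId: r.executedUploadId, pageIndex: r.executedPageIndex });
    } else {
      plan.push({ source: "original", pageIndex: i });
    }
  }
  return plan;
}
```

A PDF library would then copy pages according to the plan; keeping the plan separate makes the ordering logic easy to unit test without loading any files.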
- CompletionChecklist: one column per signatory (not per party), with
  party names shown as sub-labels in the header; draggable columns for
  reordering; multiple buttons per cell when a signatory signs for
  multiple parties on the same document
- InstructionsModal: group signing instructions by signatory with party
  shown as a column, matching the grid layout
- App.tsx: add min-w-0 / overflow-auto fixes to allow horizontal scroll
  in the assembly grid

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…gress view

- Save/load configuration and smart rescan for signature extraction
- Document assembly mode: upload executed pages, AI analysis, auto-matching
- Completion checklist matrix with manual override and PDF assembly/download
- Assembly progress view reorganized by signatory with draggable columns

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ly state

- Replace full AI re-extraction on config restore with thumbnail-only rendering
  from saved pages, eliminating unnecessary API calls and delays
- Embed PDF files as base64 in saved config so re-uploading isn't needed
- Persist assembly state (executed uploads, matches) in saved config
- Fix stale closure issues by replacing useEffect-based restore trigger with
  direct call from handleFileUpload using restoringIds ref guard
- Fix TypeScript error in CompletionChecklist partiesBySignatory Set generic

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
13 tests covering:
- Exact and fuzzy matching (case, whitespace, containment)
- isConfirmedExecuted guard
- Score threshold filtering
- Greedy assignment (each page matched at most once)
- Preserved user-confirmed / user-overridden matches
- Auto-matched entries are re-computed (not preserved)
- createManualMatch field mapping
- Empty input edge cases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Support DOCX uploads by converting through a server-side Microsoft Graph service, and surface detailed conversion errors in the UI to speed up auth and permission troubleshooting.

Made-with: Cursor
Allow users to manually replace a document with a differently named updated file while preserving prior extracted pages and appending new detections from the latest version.

Made-with: Cursor
Constrain assembly layout to a local horizontal scroller with explicit controls and make document labels wrap so long filenames remain readable in both document lists and checklist rows.

Made-with: Cursor
When replacing a document version, re-analyze known page indices whose thumbnails changed and refresh extracted metadata. Add inline rename for documents and executed uploads with propagation to extracted pages and assembly matches.

Made-with: Cursor
- Dockerfile multi-stage (Vite + compiled Express), .dockerignore
- Backend: M365 DOCX→PDF, static SPA, Gemini via /api/gemini/* (GEMINI_API_KEY server-only)
- geminiAnalyze.ts; client geminiService uses fetch; vite no longer injects API key
- docs/DEPLOY_CLOUD_RUN.md: IAP, Secret Manager, dedicated SA script, placeholders
- scripts/deploy-cloud-run.sh, setup-cloud-run-runtime-sa.sh; .env.deploy.example
- Favicon + header logo; index.html link tags
- Quieter Gemini logs unless DEBUG_GEMINI=1; remove stray console.log in App

Made-with: Cursor
…lient

- SignatureCard: show full document name (wrap, no truncate)
- ExecutedPageCard: click thumbnail for full-page PDF preview
- MatchPickerModal: preview blank/current/candidates; fix nested buttons;
  preselect current match; explicit Select; PdfPreviewModal z-index above modal
- geminiAnalyze: GoogleGenAI({ apiKey, vertexai: false }) so GOOGLE_GENAI_USE_VERTEXAI
  does not send API key to Vertex (401)
- DEPLOY_CLOUD_RUN: troubleshoot note for Vertex vs API key conflict

Made-with: Cursor
- Assembly: download ZIP of unmatched blank sig pages, optional signatory filter
- CompletionChecklist: sync signatory columns in useEffect (avoid setState-in-render loop)
- index: RootErrorBoundary for clearer failures than blank screen
- App: fix useEffect deps typo (missingSignatoryOptions); co-locate missing-pack state

Made-with: Cursor
… match picker

- When keyword scan finds no candidate pages, analyze all pages with vision
  (initial ingest and version-merge) with status and console warn for large PDFs
- Escape closes PDF preview first, then Reassign/Match dialog
- Match picker: wrap long document and source names instead of truncating

Made-with: Cursor
Adds direct match actions from awaiting executed pages, prevents rematching after auto-match removals, preserves multi-party page behavior, and makes Adobe Sign-style PDFs resilient via preview and assembly raster fallbacks with better diagnostics/perf.

Made-with: Cursor