Releases: EricRollei/PDF-Tools

PDF Tools v1.0.0 - Initial Release

20 Nov 06:20

Release Notes

v1.0.0 - Initial Release (November 19, 2025)

Overview

First public release of PDF Tools for ComfyUI, a suite of nodes for PDF processing, OCR, and AI vision analysis.

Features

PDF Extraction

  • PDF Extractor v08/v09 - Advanced image extraction with quality assessment
    • Automatic spread detection for scanned books
    • Image quality scoring (sharpness, contrast, brightness)
    • Duplicate detection
    • Organize output by quality
    • JSON metadata export
  • Simple PDF Extractor - Basic PDF image extraction
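
The quality scoring listed above names three metrics: sharpness, contrast, and brightness. The release notes do not say how the extractor computes them, so the following is only a minimal sketch using common proxies: mean intensity for brightness, standard deviation for contrast, and variance of a 4-neighbour Laplacian for sharpness. The function name `quality_metrics` and the list-of-rows grayscale input are assumptions made for illustration, not part of the package.

```python
from statistics import fmean, pstdev, pvariance


def quality_metrics(gray):
    """Return brightness, contrast and sharpness for a grayscale image.

    `gray` is a list of rows of 0-255 integers, at least 3x3 pixels.
    These formulas are illustrative assumptions, not the extractor's own.
    """
    pixels = [p for row in gray for p in row]
    brightness = fmean(pixels)   # mean intensity
    contrast = pstdev(pixels)    # spread of intensities

    # 4-neighbour Laplacian on interior pixels; its variance is a
    # widely used proxy for how sharp the image is.
    laplacian = []
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            laplacian.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    sharpness = pvariance(laplacian)

    return {
        "brightness": brightness,
        "contrast": contrast,
        "sharpness": sharpness,
    }


# A flat grey image has no contrast or edges; a checkerboard has both.
flat = [[128] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]
flat_scores = quality_metrics(flat)
checker_scores = quality_metrics(checker)
```

In a real workflow the pixel rows would come from an image library such as Pillow, and scores like these could drive the "organize output by quality" step by sorting or bucketing images on a chosen metric.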

OCR Nodes

  • Surya OCR Layout Node - Multilingual OCR with 90+ languages
    • Advanced layout detection
    • Reading order analysis
    • Table detection
    • Multiple output formats (text, JSON, markdown)
  • PaddleOCR VL Remote - Visual-Language OCR
    • Requires a separate virtual environment (see PaddleOCR_VL_SETUP.md)
    • CUDA 12.6 support

AI Vision & Layout Analysis

  • Florence-2 Cropper - AI-powered image cropping and region detection
  • LayoutLMv3 Node - Microsoft's document understanding model
  • Enhanced Layout Parser v06 - Advanced document layout analysis
  • Rectangle Detector - Geometric shape detection

Technical Details

Dependencies

  • Python 3.11.6+
  • PyMuPDF (AGPL v3)
  • Surya OCR (GPL v3)
  • Florence-2 (MIT)
  • transformers
  • torch
  • pillow
  • numpy
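
Because the minimum interpreter version is given down to the patch level (3.11.6), a quick check before installing can save a failed setup. The following is a small sketch of such a check. The helper name `meets_minimum` is hypothetical, and the bundled check_install.ps1 may verify this differently.

```python
import sys

# Minimum version taken from the dependency list above.
MIN_PYTHON = (3, 11, 6)


def meets_minimum(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (default: the running interpreter) is at least `minimum`.

    Versions are compared as (major, minor, micro) tuples.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:3]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {found} meets the 3.11.6+ requirement")
    else:
        print(f"Python {found} is older than the required 3.11.6")
```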

Installation

cd ComfyUI/custom_nodes/PDF_tools
.\install.ps1

Verification

.\check_install.ps1

Documentation

Complete documentation included:

  • README.md - Main documentation
  • INSTALLATION_GUIDE.md - Detailed installation instructions
  • QUICKSTART_SURYA.md - Quick start for Surya OCR
  • SURYA_OCR_NODE_GUIDE.md - Complete Surya guide
  • PaddleOCR_VL_SETUP.md - Separate venv setup for PaddleOCR
  • BATCH_PROCESSING_GUIDE.md - Batch processing workflows
  • PDF_LAYER_DETECTION_GUIDE.md - Layer detection details
  • LAYER_DETECTION_QUICKREF.md - Quick reference
  • CODE_OVERVIEW.md - Code structure
  • CONTRIBUTING.md - Contribution guidelines
  • CREDITS.md - Dependencies and licenses

License

Dual License:

  • Non-Commercial: Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
  • Commercial: Requires a separate commercial license

Package Split Notice

Download functionality (gallery-dl, yt-dlp) has been moved into a standalone package called "download-tools". PDF Tools now focuses exclusively on PDF processing, OCR, and vision analysis.

Repository

Installation from GitHub

cd ComfyUI/custom_nodes
git clone https://github.com/EricRollei/PDF-Tools.git PDF_tools
cd PDF_tools
.\install.ps1

Known Issues

  • PaddleOCR VL requires a separate virtual environment because of a CUDA version conflict (12.6 vs 12.8)
  • See PaddleOCR_VL_SETUP.md for detailed setup instructions
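
One general way to use a library that needs a conflicting CUDA build is to run it under the other environment's interpreter and exchange data over stdin and stdout. The sketch below illustrates that pattern only. It is not taken from the package, and the actual PaddleOCR VL Remote node may communicate differently. `run_in_env` and `ECHO_WORKER` are hypothetical names, and the worker is a stand-in that echoes its input rather than running OCR.

```python
import json
import subprocess
import sys


def run_in_env(python_exe, script, payload, timeout=60.0):
    """Run `script` with another interpreter, sending JSON in and reading JSON out.

    `python_exe` would normally be the python binary inside the
    separate virtual environment.
    """
    completed = subprocess.run(
        [python_exe, "-c", script],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if the worker fails
    )
    return json.loads(completed.stdout)


# Hypothetical worker: a real one would import the OCR library here,
# inside the environment that has the matching CUDA build.
ECHO_WORKER = (
    "import json, sys\n"
    "request = json.load(sys.stdin)\n"
    "json.dump({'image': request['image'], 'text': ''}, sys.stdout)\n"
)

# For demonstration the current interpreter stands in for the other venv.
result = run_in_env(sys.executable, ECHO_WORKER, {"image": "page_001.png"})
```

Keeping the exchange to plain JSON means the two environments never need to share installed packages, which is the point of isolating them.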

Contributors

Created and maintained by Eric Rollei

Statistics

  • 111 files
  • 69,115 lines of code
  • 15+ processing nodes
  • 30+ documented dependencies
  • Complete license headers on all Python files