Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Updated Feb 28, 2026 - Python
📝 Manage your projects and notes locally with Ironpad, a file-based system that keeps your data safe in Markdown format without cloud reliance.