Open Source OCR Engine
Awesome multilingual OCR toolkits based on PaddlePaddle
Contexts Optical Compression
Open source semantic search and text analytics for large document sets
Crowdsourcing platform for full text transcription and tagging
OCR software, free and offline
A framework to enable multimodal models to operate a computer
Cross-platform software for text translation and recognition
OCRmyPDF adds an OCR text layer to scanned PDF files
Enhances Tesseract OCR output using LLMs (local or API)
Accurate × Fast × Comprehensive
A simple tool for reading in poorly redacted documents
Visual Causal Flow
OCR expert VLM powered by Hunyuan's native multimodal architecture
An on-premises, OCR-free unstructured data extraction
The media player for language learning, with dual subtitles
Assist in organizing your piles of documents
JavaScript OCR and text extraction for images and PDFs
A ranked list of awesome machine learning Python libraries
A Python application to add watermarks (text or image) to PDF files
Scan Tailor Experimental is an interactive post-processing tool
ITTT is a free tool designed to scan and extract text from images.
Command-line toolset for extracting text from files
Img2Txt - Extract Text From Images using AI