Open Source OCR Engine
Awesome multilingual OCR toolkits based on PaddlePaddle
Contexts Optical Compression
OCR software, free and offline
Open source semantic search and text analytics for large document sets
Crowdsourcing platform for full text transcription and tagging
A framework to enable multimodal models to operate a computer
Enhances Tesseract OCR output using LLMs (local or API)
Audio foundation model excelling in audio understanding
OCRmyPDF adds an OCR text layer to scanned PDF files
A cross-platform software for text translation and recognition
Accurate × Fast × Comprehensive
A library for audio and music analysis, feature extraction
OCR expert VLM powered by Hunyuan's native multimodal architecture
Visual Causal Flow
A simple tool for reading in poorly redacted documents
The media player for language learning, with dual subtitles
Python Audio Analysis Library: Feature Extraction, Classification
An on-premises, OCR-free unstructured data extraction
A ranked list of awesome machine learning Python libraries
Assist in organizing your piles of documents
JavaScript OCR and text extraction for images and PDFs
A Web UI for easy subtitle generation using the Whisper model
Scan Tailor Experimental is an interactive post-processing tool