Contexts Optical Compression
Accurate × Fast × Comprehensive
PDF to Markdown with vision models
Visual Causal Flow
Convert AI papers to GUI
Awesome multilingual OCR toolkits based on PaddlePaddle
Enhances Tesseract OCR output using LLMs (local or API)
A framework to enable multimodal models to operate a computer
In-depth tutorials on LLMs, RAGs and real-world AI agent applications
Use LLMs and LLM Vision (OCR) to handle paperless-ngx
Screenshots, word marking, OCR, AI, translation software
PDF scientific paper translation with preserved formats
OCR expert VLM powered by Hunyuan's native multimodal architecture
PDF Parser for AI-ready data. Automate PDF accessibility
Get your documents ready for gen AI
Self-hosted AI accounting app. LLM analyzer for receipts
Readest is a modern, feature-rich ebook reader
OpenRecall is a fully open-source, privacy-first alternative
Qwen3-VL, the multimodal large language model series by Alibaba Cloud
A Repo For Document AI
A simple tool for reading in poorly redacted documents
Document content and metadata extraction microservice
Doctor Dok is an AI-based medical data framework
A self-hostable bookmark-everything app
Deep Learning API and Server in C++14 with support for Caffe, PyTorch