dude uncomplicated data extraction: A simple framework
ExtractThinker is a Document Intelligence library for LLMs
CLI tool to extract (meta)data from PDF and manipulate PDF files
Did you say you like data?
Structured data extraction and instruction calling with ML, LLM
Turn entire websites into LLM-ready markdown or structured data
No-code LLM Platform to launch APIs and ETL Pipelines
Clean network diagrams, one-time setup, zero upkeep
A high-quality tool for converting PDF to Markdown and JSON
MD/.JSON Document OCR and structured data extraction API
Flexible Node.js AI-assisted crawler library
Fast and efficient unstructured data extraction
Model Context Protocol server that integrates AgentQL's data
Unreal Engine Archives Explorer
Make websites accessible for AI agents
ContextGem: Effortless LLM extraction from documents
Automatic extraction of relevant features from time series
AI-ready web crawler that extracts and structures website content
Extract and convert data from any document, images, PDFs, word doc
BlockArrays for Julia
AI-first Ruby framework for building fast, flexible web scraping spide
Open source web scraping system for automated data collection tasks
A library for audio and music analysis, feature extraction
Synthetic data curation for post-training and data extraction
Eases DOM navigation for HTML and XML documents