Open source libraries and APIs to build custom preprocessing pipelines
Instill Core is a full-stack AI infrastructure tool for data
Superlinked is a Python framework for AI Engineers
AI-Powered Data Processing: Use LOTUS to process all of your datasets
Extract schema, statistics and entities from datasets
Central interface to connect your LLMs with external data
Parse files for optimal RAG
Autonomous LLM agent for end-to-end data science workflows
A fast, helpful, and open-source document parser
Context database designed specifically for AI Agents
The open source mesh processing system
Vector database for scalable similarity search and AI applications
No-code LLM Platform to launch APIs and ETL Pipelines
Clean network diagrams, one-time setup, zero upkeep
Python module for parsing semi-structured text into Python tables
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine
A system for agentic LLM-powered data processing and ETL
AI-data warehouse to enrich, transform and analyze unstructured data
A modular graph-based Retrieval-Augmented Generation (RAG) system
Airweave lets agents search any app
CrateDB is a distributed and scalable SQL database
Fluentd: Unified Logging Layer (project under CNCF)
Open source web scraping system for automated data collection tasks
Web framework designed for speed, security, and SEO
Fast and efficient unstructured data extraction