Showing 1486 open source projects for "document search engine"

  • 1
    Search-Index

    A persistent, network resilient, full text search library

    Search-Index is a lightweight and fast JavaScript-based search engine that enables full-text search indexing and retrieval for web applications.
    Downloads: 0 This Week
  • 2
    RATS Search

    BitTorrent P2P multi-platform search engine for Desktop

    Rats Search is a cross-platform search tool for torrent indexing across multiple BitTorrent DHT networks. It provides a GUI for searching decentralized torrent metadata in real time without relying on centralized indexes. Built with Electron and Vue.js, Rats Search emphasizes decentralization and anonymity, allowing users to explore content from distributed sources such as the BitTorrent Mainline DHT and WebTorrent. It supports filtering, magnet link generation, and acts as a...
    Downloads: 104 This Week
  • 3
    Text Search Engine

    A text search engine that supports mixed Chinese and English search

    Text-Search-Engine is a JavaScript-based lightweight search engine that enables full-text search functionality. It allows developers to implement fast search indexing and retrieval in web applications.
    Downloads: 0 This Week
  • 4
    Search with Lepton

    Lightweight demo to build a conversational AI search engine quickly

    Search with Lepton is an open source demonstration project that shows how to build a conversational search engine using the Lepton AI framework. It combines traditional web search with large language models to provide natural language answers to user queries. It retrieves information from supported search engines and uses that context to generate responses through a retrieval-augmented generation approach.
    Downloads: 3 This Week
  • 5
    Open Semantic Search

    Open source semantic search and text analytics for large document sets

    Open Semantic Search is an open source research and analytics platform designed for searching, analyzing, and exploring large collections of documents using semantic search technologies. It provides an integrated search server combined with a document processing pipeline that supports crawling, text extraction, and automated analysis of content from many different sources.
    Downloads: 3 This Week
  • 6
    Whoogle Search

    A self-hosted, ad-free, privacy-respecting metasearch engine

    Get Google search results, but without any ads, javascript, AMP links, cookies, or IP address tracking. Easily deployable in one click as a Docker app, and customizable with a single config file. Quick and simple to implement as a primary search engine replacement on both desktop and mobile. Autocomplete/search suggestions. POST request search and suggestion queries (when possible).
    Downloads: 10 This Week
  • 7
    GitHub search with Manticore Search

    Demo: GitHub search with Manticore Search

    The Manticore GitHub Issue Search tool lets users search GitHub issues using Manticore Search, a full-text search engine designed for large datasets and real-time processing. It integrates Manticore's capabilities with GitHub to offer fast searches within repositories.
    Downloads: 0 This Week
  • 8
    Wicked Engine

    3D engine with modern graphics

    ...There are other example projects that you can build as well within the solution. If you want to develop a C++ application that uses Wicked Engine, you can build the WickedEngine static library project for the appropriate platform, such as WickedEngine_Windows, and link against it. Including the "WickedEngine.h" header will attempt to link the binaries for the appropriate platform, but search directories should be set up beforehand.
    Downloads: 6 This Week
  • 9
    SemTools

    Semantic search and document parsing tools for the command line

    SemTools is an open-source command-line toolkit designed for document parsing, semantic indexing, and semantic search workflows. The project focuses on enabling developers and AI agents to process large document collections and extract meaningful semantic representations that can be searched efficiently. Built with Rust for performance and reliability, the toolchain provides fast processing of text and structured documents while maintaining low system overhead. ...
    Downloads: 0 This Week
  • 10
    marqo

    Tensor search for humans

    A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images.
    Downloads: 1 This Week
  • 11
    SAG

    SQL-Driven RAG Engine

    ...These vectors allow the system to identify relationships between concepts and construct a graph representation of knowledge at runtime. The architecture also includes a three-stage retrieval pipeline consisting of recall, expansion, and reranking steps to improve search accuracy. The engine integrates semantic vector similarity with traditional full-text search to improve both recall and precision. Because the knowledge graph is generated dynamically, the system can adapt to new information without requiring manual graph maintenance.
    Downloads: 0 This Week
  • 12
    theHarvester

    E-mails, subdomains and names

    theHarvester is a simple yet effective tool designed for the early stages of a penetration test or red team engagement. Use it for open source intelligence (OSINT) gathering to help determine a company's external threat landscape on the internet. The tool gathers emails, names, subdomains, IPs and URLs using multiple public data sources.
    Downloads: 49 This Week
  • 13
    MongoDB

    The MongoDB Database

    MongoDB refers to the core MongoDB server, a modern, document-oriented NoSQL database offering flexible schema, rich queries, horizontal scalability, and integrated support for transactions and search. Packages are created dynamically by the buildscripts/packager.py script. This will generate RPM and Debian packages. Client drivers for most programming languages are available. You can install compass using the install_compass script packaged with MongoDB.
    Downloads: 85 This Week
  • 14
    Marten

    .NET Transactional Document DB and Event Store on PostgreSQL

    The Marten library provides .NET developers with the ability to use the proven PostgreSQL database engine and its fantastic JSON support as a fully-fledged document database. The Marten team believes that a document database has far-reaching benefits for developer productivity over relational databases with or without an ORM tool. Marten also provides .NET developers with an ACID-compliant event store with user-defined projections against event streams.
    Downloads: 2 This Week
  • 15
    HelixDB

    Graph-vector database for building unified AI backends fast

    ...HelixDB includes built-in capabilities for embeddings, vector search, keyword search, and graph traversal, which are particularly useful for retrieval-augmented generation and agent-based systems.
    Downloads: 4 This Week
  • 16
    Paperless-AI

    AI-powered document analysis and tagging for Paperless-ngx

    ...A key capability is its use of retrieval-augmented generation, which enables semantic search and natural language interaction across an entire document archive. Users can ask contextual questions about their files and receive precise answers based on full document understanding rather than simple keyword matching. Paperless-AI also includes a web interface for manual review and tagging, allowing greater control when handling sensitive or complex documents.
    Downloads: 2 This Week
  • 17
    WeKnora

    LLM framework for document understanding and semantic retrieval

    ...This approach enables the system to provide more reliable answers by grounding model reasoning in the content of uploaded documents. WeKnora is designed with a modular architecture that separates components for document processing, search strategies, and model inference, allowing developers to customize or extend different parts of the pipeline. It supports knowledge base management and conversational question answering built on top of structured and unstructured documents.
    Downloads: 3 This Week
  • 18
    Semantra

    Multi-tool for semantic search

    Semantra is an open-source semantic search tool designed to help users explore large collections of documents by meaning rather than simple keyword matching. The software analyzes text and PDF documents stored locally and creates embeddings that allow queries to retrieve results based on conceptual similarity. It is primarily intended for individuals who need to extract insights from large document collections, including researchers, journalists, students, and historians. ...
    Downloads: 1 This Week
  • 19
    Anna’s Archive

    Comprehensive search engine for books, papers, comics, magazines

    Anna’s Archive is a large-scale open-source search engine and data aggregation platform designed to index and provide access to a vast collection of books, academic papers, comics, magazines, and other digital texts through a unified interface. The project includes all the infrastructure required to run a full instance locally or in production, combining web servers, databases, and search indexing systems into a scalable architecture.
    Downloads: 35 This Week
  • 20
    paperless-gpt

    Use LLMs and LLM Vision (OCR) to handle paperless-ngx

    paperless-gpt is an AI-powered extension for document management systems that enhances the capabilities of paperless-ngx by integrating large language models and vision-based OCR to automate document processing and organization. It is designed to transform scanned or uploaded documents into structured, searchable, and intelligently categorized data without requiring manual tagging or sorting.
    Downloads: 1 This Week
  • 21
    OWL

    Optimized Workforce Learning for General Multi-Agent Assistance

    ...Unlike single-agent systems, it treats task completion as a collaborative workforce where agents take on specialized roles (planning, execution, analysis) and coordinate via a modular multi-agent architecture that supports flexible teamwork across domains. OWL delivers state-of-the-art performance on benchmarks like GAIA and emphasizes real-time decision-making, web automation, rich search integration, document parsing, and multi-tool workflows, making it suitable for tasks ranging from information retrieval to interactive automation.
    Downloads: 0 This Week
  • 22
    Notepad++

    Free, open-source text editor

    Notepad++ is a free source code editor available in various languages. It is written in C++ and based on the Scintilla editing component. Notepad++ offers a wide range of features, such as autosaving, line bookmarking, simultaneous editing, and a tabbed document interface. Over 140 plugins are also available to use in the default program. Notepad++ takes advantage of higher execution speed and smaller program size by...
    Downloads: 2,622 This Week
  • 23
    qBittorrent RuTracker plugin

    qBittorrent search engine plugin for rutracker

    qBittorrent RuTracker plugin is a lightweight search engine extension designed to integrate the RuTracker torrent index directly into the qBittorrent client, allowing users to search for torrents without leaving the application interface. The plugin follows qBittorrent’s official search plugin architecture and is implemented as a Python script that communicates with the RuTracker website to retrieve and display search results.
    Downloads: 24 This Week
  • 24
    SiteDorks

    Automate search engine dorking across hundreds of websites

    SiteDorks is a command line tool designed to automate advanced search queries across multiple search engines and websites. It allows users to perform search engine “dork” queries against a large set of predefined domains, making it easier to discover publicly available information across different platforms. SiteDorks supports several major search engines including Google, Bing, Brave, Ecosia, DuckDuckGo, Yahoo, and Yandex.
    Downloads: 5 This Week
  • 25
    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of keeping piles of paper documents all over your desk, office or drawers, you can quickly scan them and configure your scanner to upload directly to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats.
    Downloads: 15 This Week