Showing 283 open source projects for "information retrieval"

  • 1
    BEIR

    A Heterogeneous Benchmark for Information Retrieval

    BEIR is a benchmark framework for evaluating information retrieval models across various datasets and tasks, including document ranking and question answering.
    Downloads: 0 This Week
  • 2
    RAPTOR

    The official implementation of RAPTOR

    RAPTOR is a retrieval architecture designed to improve retrieval-augmented generation systems by organizing documents into hierarchical structures that enable more effective context retrieval. Traditional RAG systems typically retrieve small text chunks independently, which can limit a model’s ability to understand broader document context. RAPTOR addresses this limitation by recursively embedding, clustering, and summarizing documents to create a tree-structured hierarchy of information. ...
    Downloads: 0 This Week
  • 3
    FastRAG

    Efficient Retrieval Augmentation and Generation Framework

    fastRAG is a research framework for efficient and optimized retrieval augmented generative pipelines, incorporating state-of-the-art LLMs and Information Retrieval. fastRAG is designed to empower researchers and developers with a comprehensive tool set for advancing retrieval augmented generation.
    Downloads: 0 This Week
  • 4
    LightRAG

    "LightRAG: Simple and Fast Retrieval-Augmented Generation"

    LightRAG is a lightweight Retrieval-Augmented Generation (RAG) framework designed for efficient document retrieval and response generation. It is optimized for speed and lower resource consumption, making it ideal for real-time applications.
    Downloads: 0 This Week
  • 5
    Youtu-GraphRAG

    Vertically Unified Agents for Graph Retrieval-Augmented Reasoning

    Youtu-GraphRAG is a research framework developed by Tencent for performing complex reasoning using graph-based retrieval-augmented generation. The system combines knowledge graphs, retrieval mechanisms, and agent-based reasoning into a unified architecture designed to handle knowledge-intensive tasks. Instead of relying solely on text retrieval, the framework organizes information into structured graph schemas that represent entities, relationships, and attributes. ...
    Downloads: 0 This Week
  • 6
    WeKnora

    LLM framework for document understanding and semantic retrieval

    WeKnora is an open source framework developed for deep document understanding and semantic information retrieval using large language models. It focuses on analyzing complex and heterogeneous documents by combining multiple processing stages such as multimodal document parsing, vector indexing, and intelligent retrieval. It follows the Retrieval-Augmented Generation (RAG) paradigm, where relevant document segments are retrieved and used by language models to generate accurate, context-aware responses. ...
    Downloads: 1 This Week
  • 7
    FlagEmbedding

    Retrieval and Retrieval-augmented LLMs

    FlagEmbedding is an open-source toolkit for building and deploying high-performance text embedding models used in information retrieval and retrieval-augmented generation systems. The project is part of the BAAI FlagOpen ecosystem and focuses on creating embedding models that transform text into dense vector representations suitable for semantic search and large language model pipelines. FlagEmbedding includes a family of models known as BGE (BAAI General Embedding), which are designed to achieve strong performance across multilingual and cross-lingual retrieval benchmarks. ...
    Downloads: 0 This Week
  • 8
    Agentic RAG for Dummies

    A modular Agentic RAG built with LangGraph

    Agentic RAG for Dummies is an educational repository that demonstrates how to build retrieval-augmented generation systems combined with autonomous AI agents. The project explains the principles behind agentic retrieval pipelines where language models can dynamically decide when to retrieve information, analyze results, and plan further actions. Instead of relying on static retrieval pipelines, the system shows how agents can orchestrate retrieval, reasoning, and tool usage in a more flexible decision loop. ...
    Downloads: 0 This Week
  • 9
    Dynamiq

    An orchestration framework for agentic AI and LLM applications

    Dynamiq is an open-source orchestration framework designed to streamline the development of generative AI applications that rely on large language models and autonomous agents. The framework focuses on simplifying the creation of complex AI workflows that involve multiple agents, retrieval systems, and reasoning steps. Instead of building each component manually, developers can use Dynamiq’s structured APIs and modular architecture to connect language models, vector databases, and external tools into cohesive pipelines. The framework supports the creation of multi-agent systems where different AI agents collaborate to solve tasks such as information retrieval, document analysis, or automated decision making. ...
    Downloads: 3 This Week
  • 10
    SAG

    SQL-Driven RAG Engine

    ...These vectors allow the system to identify relationships between concepts and construct a graph representation of knowledge at runtime. The architecture also includes a three-stage retrieval pipeline consisting of recall, expansion, and reranking steps to improve search accuracy. The engine integrates semantic vector similarity with traditional full-text search to improve both recall and precision. Because the knowledge graph is generated dynamically, the system can adapt to new information without requiring manual graph maintenance.
    Downloads: 0 This Week
  • 11
    Erethos Downloader

    Save Erothots content including leaked videos and premium collections

    Erethos Downloader is an automation tool designed to download media content from supported adult content platforms, enabling users to archive videos and images locally for offline access. The project focuses on simplifying the retrieval process by allowing users to input URLs or account information and automatically fetch associated content in bulk. It includes mechanisms for handling authentication and session data, ensuring access to content that may require login credentials. The downloader is designed to manage large batches efficiently, reducing manual effort when saving multiple posts or galleries. ...
    Downloads: 37 This Week
  • 12
    QuivrHQ

    Opinionated RAG for integrating GenAI in your apps

    Quivr is an open-source platform that leverages Retrieval-Augmented Generation (RAG) to integrate Generative AI into applications. It serves as a "second brain," enabling users to build powerful AI-driven assistants that can process and retrieve information efficiently. Quivr supports various large language models and vector stores, providing flexibility and customization for developers.
    Downloads: 0 This Week
  • 13
    Qwen3-VL-Embedding

    Multimodal embedding and reranking models built on Qwen3-VL

    ...The core embedding model maps such inputs into semantically rich vectors in a unified representation space, enabling similarity search, clustering, and cross-modal retrieval. The reranking model then precisely scores relevance between a given query and candidate documents, enhancing retrieval accuracy in complex multimodal tasks. Together, they support advanced information retrieval workflows such as image-text search, visual question answering (VQA), and video-text matching, while providing out-of-the-box support for more than 30 languages.
    Downloads: 0 This Week
  • 14
    MCP Server RAG Web Browser

    An MCP Server for the RAG Web Browser Actor

    The MCP Server for the RAG Web Browser Actor allows AI assistants and LLMs to perform web searches and extract information from web pages. It facilitates interaction with the web, enabling up-to-date context retrieval for AI applications.
    Downloads: 0 This Week
  • 15
    Supermemory

    Memory engine and app that is extremely fast, scalable

    Supermemory is an ambitious and extensible AI-powered personal knowledge management system that aims to help users capture, organize, retrieve, and reason over information in a manner that mimics human memory structures. The platform allows individuals to ingest text, documents, and other content forms, then uses advanced retrieval and embedding techniques to index and relate information intelligently so that users can recall relevant knowledge in context rather than just by keyword match. It often incorporates clustering, semantic search, and summarization modules to reduce cognitive load and surface key ideas, which makes it useful for research, study, writing, and long-term project tracking. ...
    Downloads: 2 This Week
  • 16
    WebGLM

    An Efficient Web-enhanced Question Answering System

    WebGLM is a web-enhanced question-answering system that combines a large language model with web search and retrieval capabilities to produce more accurate answers. The system is based on the General Language Model architecture and was designed to enable language models to interact directly with web information during the question-answering process. Instead of relying solely on knowledge stored in the model’s training data, the system retrieves relevant web content and integrates it into the reasoning process. ...
    Downloads: 0 This Week
  • 17
    Kernel Memory

    Research project. A Memory solution for users, teams, and applications

    Kernel Memory is an open-source reference architecture developed by Microsoft to help developers build memory systems for AI applications powered by large language models. The project focuses on enabling applications to store, index, and retrieve information so that AI systems can incorporate external knowledge when generating responses. It supports scenarios such as document ingestion, semantic search, and retrieval-augmented generation, allowing language models to answer questions using contextual information from private or enterprise datasets. Kernel Memory can ingest documents in multiple formats, process them into embeddings, and store them in searchable indexes. ...
    Downloads: 0 This Week
  • 18
    OpenMemory

    Local long-term memory engine for AI apps with persistent storage

    OpenMemory is a self-hosted memory engine designed to provide long-term, persistent storage for AI and LLM-powered applications. It enables developers to give otherwise stateless models a structured memory layer that can store, retrieve, and manage contextual information over time. OpenMemory is built around a hierarchical memory architecture that organizes data into semantic sectors and connects them through a graph-based structure for efficient retrieval. It supports multiple embedding strategies, including synthetic and semantic embeddings, allowing developers to balance speed and accuracy depending on their use case. ...
    Downloads: 3 This Week
  • 19
    Scira

    AI-powered search engine that helps you find information

    Scira is an open source AI-powered search and research assistant designed to provide fast, conversational answers grounded in web and knowledge sources. The project combines a modern web interface with retrieval-augmented generation techniques to deliver responses that are both natural language friendly and evidence oriented. It is built for developers who want to deploy their own Perplexity-style or AI search experience without relying on proprietary hosted services. Scira emphasizes speed,...
    Downloads: 4 This Week
  • 20
    gensim

    Topic Modelling for Humans

    Gensim is a Python library for topic modeling, document indexing, and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.
    Downloads: 0 This Week
  • 21
    MemoryOS

    MemoryOS is designed to provide a memory operating system

    MemoryOS is an open-source framework designed to provide a structured memory management system for AI agents and large language model applications. The project addresses one of the major limitations of modern language models: their inability to maintain long-term context beyond the limits of their prompt window. MemoryOS introduces a hierarchical memory architecture inspired by operating system memory management principles, allowing agents to store, update, retrieve, and generate information...
    Downloads: 0 This Week
  • 22
    MCP ZoomEye

    A Model Context Protocol server that provides network asset info

    The ZoomEye MCP Server is a Model Context Protocol server that provides network asset information based on query conditions, allowing Large Language Models to obtain data by querying ZoomEye using dorks and other search parameters.
    Downloads: 0 This Week
  • 23
    ChatWiki

    ChatWiki WeChat official account's AI knowledge base workflow agent

    ChatWiki is an open-source AI knowledge base and workflow automation platform designed to help organizations build intelligent question-answering systems using large language models and retrieval-augmented generation techniques. The system enables companies to transform internal documents and data into searchable knowledge bases that can power AI assistants capable of answering domain-specific questions. It provides a complete pipeline for ingesting documents, preprocessing and segmenting content, generating vector embeddings, and retrieving relevant information during conversations. ...
    Downloads: 0 This Week
  • 24
    Google Research: Language

    Shared repository for open-sourced projects from the Google AI Lang

    ...Many of the projects included in the repository correspond to research papers released by Google researchers and provide implementations of new NLP algorithms or experimental frameworks. These implementations often explore advanced techniques such as language modeling, semantic understanding, information retrieval, and multilingual text processing. The repository functions as a collaborative hub where different research initiatives can publish their code, enabling the broader community to reproduce experiments and build upon published work.
    Downloads: 0 This Week
  • 25
    pyAudioAnalysis

    Python Audio Analysis Library: Feature Extraction, Classification

    pyAudioAnalysis is an open-source Python library designed for audio signal analysis, machine learning, and music information retrieval tasks. The project provides a collection of tools that allow developers to extract meaningful features from audio files and use those features for classification, segmentation, and analysis. The library supports multiple audio processing workflows, including feature extraction from raw audio signals, training of machine learning models, and automatic audio segmentation. ...
    Downloads: 0 This Week