Browse free open source Java Search Engines and projects below. Use the toggles on the left to filter open source Java Search Engines by OS, license, language, programming language, and project status.

  • 1
    Hibernate

    An object-relational mapping (ORM) library for Java

    Hibernate is an object/relational mapping (ORM) tool that is widely used in Java applications and implements the Java Persistence API. Hibernate ORM makes it easier to write applications whose data outlives the application process. As an ORM framework, Hibernate is concerned with data persistence as it applies to relational databases (via JDBC).
    Downloads: 594 This Week
  • 2
    Greenstone

    Digital Library Software

    Greenstone is a complete digital library creation, management and distribution package created and distributed by the New Zealand Digital Library Project. There are two major versions of the software. Greenstone 3 is under active development, and is recommended for download. We also provide maintenance releases for its forerunner, Greenstone 2. Featured download not what you're looking for? Click "Browse all files" to access binaries and source releases of both versions.
    Downloads: 386 This Week
  • 3
    Elasticsearch

    A Distributed RESTful Search Engine

    Elasticsearch is a distributed, RESTful search and analytics engine that lets you store, search and analyze with ease at scale. It lets you perform and combine many types of searches; it scales seamlessly, and offers answers incredibly fast with search results you can rank based on a variety of factors. Elasticsearch can be used for a wide variety of use cases, from maps and metrics to site search and workplace search, and with all data types.
    Downloads: 13 This Week
  • 4
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 43 This Week
  • 5
    NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces.
    Downloads: 21 This Week
  • 6
    Regain is a Java search engine based on Jakarta Lucene. It indexes and searches files in many formats (HTML, XML, doc(x), xls(x), ppt(x), oo, PDF, RTF, mp3, mp4, Java). A tag library makes it easier to integrate search results into your JSP-based web page.
    Downloads: 15 This Week
  • 7
    The Lemur Project

    Search engine and data mining applications and ClueWeb datasets.

    The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software. These include the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning-to-rank library, the ClueWeb09 and ClueWeb12 datasets, and the Sifaka data mining application.
    Downloads: 11 This Week
  • 8
    ResCarta

    Archive your personal history

    ResCarta Toolkit offers an open source solution for creating, storing, viewing, and searching digital collections. Applications in the toolkit let users create and edit metadata, convert data to the open standard ResCarta format, and index and host collections.
    Downloads: 11 This Week
  • 9
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows and Linux/Unix/BSD.
    Downloads: 7 This Week
  • 10
    A torrent search engine plugin for the Azureus/Vuze bittorrent platform.
    Downloads: 27 This Week
  • 11
    OpenSearch

    Open source distributed and RESTful search engine

    OpenSearch is a distributed search and analytics engine based on Apache Lucene. After adding your data to OpenSearch, you can perform full-text searches on it with all of the features you might expect: search by field, search multiple indices, boost fields, rank results by score, sort results by field, and aggregate results. Unsurprisingly, people often use search engines like OpenSearch as the backend for a search application, such as Wikipedia or an online store. It offers excellent performance and can scale up and down as the needs of the application grow or shrink. Its distributed design means that you interact with OpenSearch clusters. Each cluster is a collection of one or more nodes, which are servers that store your data and process search requests. You can run OpenSearch locally on a laptop, since its system requirements are minimal, but you can also scale a single cluster to hundreds of powerful machines in a data center.
    Downloads: 1 This Week
  • 12
    Geoportal Server
    Geoportal Server is a standards-based, open source product that enables discovery and use of geospatial resources including data and services.
    Downloads: 3 This Week
  • 13
    PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.
    Downloads: 3 This Week
  • 14
    PHPDynaSite is a free Content Management System written in PHP/MySQL/Java applets. It provides many features, such as image resizing, spreadsheet and rich-text editors, logs, ...
    Downloads: 13 This Week
  • 15
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage using different techniques. A command-line executable is included that lets you sort documents by codepage.
    Downloads: 10 This Week
  • 16
    Aperture is a Java framework for extracting and querying full-text content and metadata from various information systems (file systems, web sites, mail boxes, ...) and the file formats (documents, images, ...) occurring in these systems.
    Downloads: 2 This Week
  • 17
    Hyper Estraier is a full-text search system. It works like Google, but is based on a peer-to-peer architecture. Using Hyper Estraier, we can build a large-scale search engine from inexpensive computers.
    Downloads: 9 This Week
  • 18
    This is an ***old archive*** of tools developed for facilitating the use of Creative Commons licenses and metadata. --- For the most up to date representation of any of the projects listed here, please see: http://creativecommons.org/project/Developer.
    Downloads: 8 This Week
  • 19
    webExtractor is a Java application for extracting specific content from web-based HTML, XML, CSV, and free-form text. The extracted data can be used for data gathering and mining purposes.
    Downloads: 8 This Week
  • 20
    Contineo is a Web-based Document Management System (DMS). Features: folder organization, document versioning, bulk import, import from mailbox. NOTE: this project has been DISCONTINUED in favor of LogicalDOC http://sourceforge.net/projects/logicaldoc
    Downloads: 4 This Week
  • 21
    A command-line application written in Java for automating downloads and filtering the contents of downloaded files. jDownloader uses a simple script file to configure the downloading and filtering processes.
    Downloads: 6 This Week
  • 22
    A universal platform for resource discovery and description that shares XML meta-data over existing peer-to-peer (P2P) networks such as Gnutella and JXTA.
    Downloads: 4 This Week
  • 23
    YaCy Peer-to-Peer Search Engine

    Decentralized Web Search Engine

    YaCy is a free search engine that anyone can use to search the internet (www and ftp) or to create a search portal for others (internet or intranet). The scale of YaCy is limited only by the number of users, and it can index billions of web pages. In p2p mode it is fully decentralized: all users of the search engine network are equal, and it is not possible for anyone to censor the content of the distributed index.
    Downloads: 1 This Week
  • 24

    Smart Cache Loader

    Very configurable web downloader

    Smart Cache Loader is a very configurable pure Java web grabber with special support for integration with Smart Cache proxy server. It can perform different loading operations based on URL mask, content-type, ...
    Downloads: 3 This Week
  • 25
    Carrot2
    Project moved to GitHub! https://github.com/carrot2/carrot2 Carrot2 is an Open Source Search Results Clustering Engine. It can automatically organize small collections of documents, e.g. search results, into thematic categories. Carrot2 integrates very well with both Open Source and proprietary search engines.
    Downloads: 2 This Week