185 Integrations with Hadoop
View a list of Hadoop integrations and software that integrates with Hadoop below. Compare the best Hadoop integrations as well as features, ratings, user reviews, and pricing of software that integrates with Hadoop. Here are the current Hadoop integrations in 2026:
-
1
ActiveBatch Workload Automation
ActiveBatch by Redwood
ActiveBatch by Redwood makes setting up and launching automation easy with no custom scripting required. With a low-code Super REST API adapter, over 100 pre-built job steps and a user-friendly drag-and-drop workflow designer, you can integrate across any system, application and data source, on-prem, in the cloud or in hybrid environments. Maintain complete control and visibility and meet SLAs with monitoring of all automation from a single pane of glass and get custom alerts via emails or SMS. Managed Smart Queues dynamically scale resources for high-volume workloads, reducing process times while the self-service portal enables business users to run and monitor workflows independently. ActiveBatch meets security and compliance standards, with ISO 27001 and SOC 2, Type II certifications, encrypted connections and regular third-party tests, always keeping security at the forefront. Along with ongoing product advancements, get the added benefit of 24x7 support and on-site training. -
2
AnalyticsCreator
AnalyticsCreator
AnalyticsCreator is a metadata-driven data warehouse automation solution built specifically for teams working within the Microsoft data ecosystem. It helps organizations speed up the delivery of production-ready data products by automating the entire data engineering lifecycle—from ELT pipeline generation and dimensional modeling to historization and semantic model creation for platforms like Microsoft SQL Server, Azure Synapse Analytics, and Microsoft Fabric. By eliminating repetitive manual coding and reducing the need for multiple disconnected tools, AnalyticsCreator helps data teams reduce tool sprawl and enforce consistent modeling standards across projects. The solution includes built-in support for automated documentation, lineage tracking, schema evolution, and CI/CD integration with Azure DevOps and GitHub. Whether you’re working on data marts, data products, or full-scale enterprise data warehouses, AnalyticsCreator allows you to build faster, govern better, and deliver -
3
Pandora FMS
Pandora FMS
With more than 50,000 customer installations across five continents, Pandora FMS is a truly all-in-one monitoring solution, covering all traditional silos for specific monitoring: servers, networks, applications, logs, synthetic/transactional, remote control, inventory, etc. Pandora FMS gives you the agility to find and solve problems quickly, and scales to cover any source, whether on-premises, multi-cloud, or a mix of both. Now you have that capability across your entire IT stack and analytics to find any problem, even the ones that are hard to find. Thanks to more than 500 plugins available, you can control and manage any application and technology, from SAP, Oracle, Lotus, Citrix or JBoss to VMware, AWS, SQL Server, Red Hat, WebSphere, etc. Starting Price: €90/month -
4
Composable DataOps Platform
Composable Analytics
Composable is an enterprise-grade DataOps platform built for business users that want to architect data intelligence solutions and deliver operational data-driven products leveraging disparate data sources, live feeds, and event data regardless of the format or structure of the data. With a modern, intuitive dataflow visual designer, built-in services to facilitate data engineering, and a composable architecture that enables abstraction and integration of any software or analytical approach, Composable is the leading integrated development environment to discover, manage, transform and analyze enterprise data. Starting Price: $8/hr - pay-as-you-go -
5
Peekdata
Peekdata
Consume data from any database, organize it into consistent metrics, and use it with every app. Build your Data and Reporting APIs faster with automated SQL generation, query optimization, access control, consistent metrics definitions, and API design. It takes only days to wrap any data source with a single reference Data API and simplify access to reporting and analytics data across your teams. Make it easy for data engineers and application developers to access the data from any source in a streamlined manner. - The single schema-less Data API endpoint - Review and configure metrics and dimensions in one place via UI - Data model visualization to make faster decisions - Data Export management scheduling API. The ready-to-use Report Builder and JavaScript components for charting libraries (Highcharts, BizCharts, Chart.js, etc.) make it easy to embed data-rich functionality into your products. And you will not have to make custom report queries anymore! Starting Price: $349 per month -
6
Zuar Runner
Zuar, Inc.
Utilizing the data that's spread across your organization shouldn't be so difficult! With Zuar Runner you can automate the flow of data from hundreds of potential sources into a single destination. Collect, transform, model, warehouse, report, monitor and distribute: it's all managed by Zuar Runner. Pull data from Amazon/AWS products, Google products, Microsoft products, Avionte, Backblaze, BioTrackTHC, Box, Centro, Citrix, Coupa, DigitalOcean, Dropbox, CSV, Eventbrite, Facebook Ads, FTP, Firebase, Fullstory, GitHub, Hadoop, Hubic, Hubspot, IMAP, Jenzabar, Jira, JSON, Koofr, LeafLogix, Mailchimp, MariaDB, Marketo, MEGA, Metrc, OneDrive, MongoDB, MySQL, Netsuite, OpenDrive, Oracle, Paycom, pCloud, Pipedrive, PostgreSQL, put.io, Quickbooks, RingCentral, Salesforce, Seafile, Shopify, Skybox, Snowflake, Sugar CRM, SugarSync, Tableau, Tamarac, Tardigrade, Treez, Wurk, XML Tables, Yandex Disk, Zendesk, Zoho, and more! -
7
Kyvos Semantic Layer
Kyvos Insights
Kyvos is a semantic layer for AI and BI. It gives enterprises a single, consistent, business-friendly view of their data for trusted AI and BI — eliminating metric drift across BI tools, and grounding AI in governed semantic context for higher accuracy. Kyvos delivers lightning-fast analytics at massive scale and high concurrency, including richer multidimensional analytics on the cloud, while helping organizations control costs without performance trade-offs. * One unified semantic foundation * Zero metric drift, highest AI accuracy * 1000x faster analytics at scale * 50% cloud cost savings Kyvos unifies fragmented enterprise data into one consistent, trusted view and standardizes how it is defined, interpreted, and used — across dashboards, chatbots, and AI agents. -
8
Netdata
Netdata, Inc.
The open-source observability platform everyone needs! Netdata collects metrics per second and presents them in beautiful low-latency dashboards. It is designed to run on all of your physical and virtual servers, cloud deployments, Kubernetes clusters, and edge/IoT devices, to monitor your systems, containers, and applications. It scales nicely from just a single server to thousands of servers, even in complex multi/mixed/hybrid cloud environments, and given enough disk space it can keep your metrics for years. KEY FEATURES: 💥 Collects metrics from 800+ integrations 💪 Real-Time, Low-Latency, High-Resolution 😶🌫️ Unsupervised Anomaly Detection 🔥 Powerful Visualization 🔔 Out of box Alerts 📖 systemd Journal Logs Explorer 😎 Low Maintenance ⭐ Open and Extensible Try Netdata today and feel the pulse of your infrastructure, with high-resolution metrics, journal logs and real-time visualizations. Starting Price: Free -
9
MongoDB
MongoDB
MongoDB is a general purpose, document-based, distributed database built for modern application developers and for the cloud era. No database is more productive to use. Ship and iterate 3–5x faster with our flexible document data model and a unified query interface for any use case. Whether it’s your first customer or 20 million users around the world, meet your performance SLAs in any environment. Easily ensure high availability, protect data integrity, and meet the security and compliance standards for your mission-critical workloads. An integrated suite of cloud database services that allow you to address a wide variety of use cases, from transactional to analytical, from search to data visualizations. Launch secure mobile apps with native, edge-to-cloud sync and automatic conflict resolution. Run MongoDB anywhere, from your laptop to your data center. Starting Price: Free -
10
Flex83
IoT83
Re-imagine IoT innovation with the Flex83 Application Enablement Platform! Build compelling & powerful IoT solutions up to 80% faster & at a fraction of the cost. - Use no-code workflows to build professional-grade connect/monitor/analyze/manage solutions fast. - Use low-code tools to connect to virtually anything, add custom business logic, build analytics, custom dashboards, and launch multiple applications. - Use the hassle-free SaaS model to build & prove your solution, and then scale using a "pay as you grow" model! You can create sophisticated IoT applications, literally, in a day with tools & workflows that give you the agility to build what your business or customers need without worrying about long development cycles, underlying complexity, or huge budgets. Iteratively enhance your solution to broaden your capabilities and drive more customer value. And, proven to 65M devices, you know the Flex83 platform is reliable! Give Flex83 a try today! Starting Price: $200 per month -
11
Jupyter Notebook
Project Jupyter
The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. -
12
Pentaho
Hitachi Vantara
With an integrated product suite providing data integration, analytics, cataloging, optimization and quality, Pentaho+ enables seamless data management, driving innovation and informed decision-making. Pentaho+ has helped customers achieve a 3x increase in improved data trust, a 7x increase in impactful business results and most importantly, a 70% increase in productivity. -
13
Apache Cassandra
Apache Software Foundation
The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages. -
14
SingleStore
SingleStore
SingleStore (formerly MemSQL) is a distributed, highly-scalable SQL database that can run anywhere. We deliver maximum performance for transactional and analytical workloads with familiar relational models. SingleStore is a scalable SQL database that ingests data continuously to perform operational analytics for the front lines of your business. Ingest millions of events per second with ACID transactions while simultaneously analyzing billions of rows of data in relational SQL, JSON, geospatial, and full-text search formats. SingleStore delivers ultimate data ingestion performance at scale and supports built-in batch loading and real-time data pipelines. SingleStore lets you achieve ultra-fast query response across both live and historical data using familiar ANSI SQL. Perform ad hoc analysis with business intelligence tools, run machine learning algorithms for real-time scoring, and perform geoanalytic queries in real time. Starting Price: $0.69 per hour -
15
Cleo Integration Cloud (CIC) is award-winning EDI software that enables the best B2B integration, visibility and control. CIC accelerates EDI automation, expedites partner onboarding, and easily tackles EDI issue resolution. It brings end-to-end integration visibility across EDI, non-EDI, and API integrations, enabling you to grow your revenue-generating business processes better and faster. CIC is optimizing thousands of supply chains for logistics providers, manufacturers, and wholesalers. Encompassing seamless ERP integration, WMS integration, TMS integration and more, our cloud-based B2B integration platform transforms costly, complicated processes into truly efficient, agile, and scalable operations. Our ecosystem integration approach offers the best B2B capabilities so you can automate EDI and API transactions, rapidly onboard partners, and gain competitive control.
-
16
Continuous delivery of any application to any environment. IBM DevOps Deploy (formerly IBM UrbanCode Deploy) is an application-release solution that combines continuous delivery and deployment automation with robust visibility, traceability and auditing capabilities. Increase frequency of software delivery through automated, repeatable deployment processes across development, testing and production. Simplify the deployment of multichannel applications to all environments, whether on premises or in the cloud, with consistency and repeatability. Use a single centralized server to manage tens of thousands of endpoints to any number of clouds, data centers or mainframes. Make processes more robust and easier to design by using tested integrations with dozens of tools and technologies, including Jira, Jenkins, Kubernetes, Microsoft, ServiceNow and WebSphere.
-
17
Qlik Cloud Analytics
Qlik
The modern analytics era truly began with the launch of QlikView, our first analytics solution, and the game-changing associative engine it is built on. It revolutionized the way organizations use data with intuitive visual discovery that put business intelligence in the hands of more people than ever. And we continue to lead the way with Qlik Cloud® Analytics for a cloud-based SaaS deployment and Qlik Sense® for an on-premises solution. Both options augment and enhance human intuition with AI-powered insights, and help your team move from passive to active analytics for real-time collaboration and action. Take advantage of analytics in the cloud and on-premises. You get maximum choice and deployment flexibility when deciding where to store, transform, and analyze your data. -
18
ER/Studio Enterprise Edition
ER/Studio
ER/Studio is an enterprise data modeling and architecture platform that enables organizations to design, manage, and govern data assets across complex, distributed environments, including data warehouses, lakehouses, data mesh frameworks, and data vault architectures. It connects business requirements to technical implementation through conceptual, logical, and physical models, providing clarity from strategy through deployment. By establishing a consistent modeling foundation, ER/Studio creates a reliable, shared view of enterprise data that supports analytics, AI initiatives, modernization, compliance, and operational systems. Design data models and keep teams aligned with ER/Studio’s multi-user shared repository and web-based collaboration portal, Team Server. The repository supports version control, role-based access, parallel development, and change tracking so modelers can work simultaneously without conflict, preserving integrity and full history. Starting Price: $2,687 per user -
19
StarTree
StarTree
StarTree, powered by Apache Pinot™, is a fully managed real-time analytics platform built for customer-facing applications that demand instant insights on the freshest data. Unlike traditional data warehouses or OLTP databases, which are optimized for back-office reporting or transactions, StarTree is engineered for real-time OLAP at true scale, meaning: - Data Volume: query performance sustained at petabyte scale - Ingest Rates: millions of events per second, continuously indexed for freshness - Concurrency: thousands to millions of simultaneous users served with sub-second latency With StarTree, businesses deliver always-fresh insights at interactive speed, enabling applications that personalize, monitor, and act in real time. Starting Price: Free -
20
SCIKIQ
DAAS Labs
An AI-powered data management platform that enables true data democratization. Integrates & centralizes all data sources, facilitates collaboration, and empowers organizations for innovation, driven by insights. SCIKIQ is a holistic business data platform that hides data complexities from business users through a no-code, drag-and-drop user interface, which allows businesses to focus on driving value from data, thereby enabling them to grow and make faster and smarter decisions with confidence. Use box integration, connect any data source, and ingest any structured and unstructured data. Built for business users: ease of use, a simple no-code platform, and drag and drop to manage your data. Self-learning platform. Cloud agnostic, environment agnostic. Build on top of any data environment. SCIKIQ architecture is designed specifically to address the challenges facing the complex hybrid data landscape. Starting Price: $10,000 per year -
21
Trino
Trino
Trino is a query engine that runs at ludicrous speed. A fast, distributed SQL query engine for big data analytics that helps you explore your data universe. Trino is a highly parallel and distributed query engine that is built from the ground up for efficient, low-latency analytics. The largest organizations in the world use Trino to query exabyte-scale data lakes and massive data warehouses alike. It supports diverse use cases: ad-hoc analytics at interactive speeds, massive multi-hour batch queries, and high-volume apps that perform sub-second queries. Trino is an ANSI SQL-compliant query engine that works with BI tools such as R, Tableau, Power BI, Superset, and many others. You can natively query data in Hadoop, S3, Cassandra, MySQL, and many others, without the need for complex, slow, and error-prone processes for copying the data. Access data from multiple systems within a single query. Starting Price: Free -
22
Style Intelligence
InetSoft
Style Intelligence by InetSoft is a complete business intelligence (BI) software platform that empowers companies to explore, analyze, monitor, report, and collaborate on critical business and operational data from disparate sources in real time. Its top features include a real-time data mashup Data Block architecture, professional atomic data block modeling tool, and database write-back option. Robust and easy to use, Style Intelligence is also fully scalable and offers granular security, multi-tenancy support, and multiple integrations. InetSoft's cloud flexible business intelligence solution delivers the benefit of cloud computing and software-as-a-service while giving you the maximum level of control. In terms of software-as-a-service, BI software is unique because it inherently depends on the data not being embedded in the application. InetSoft provides free expert fast-start mentoring that delivers the expertise even when no in-house dedicated BI expert is available. Starting Price: $165/month -
23
DreamFactory
DreamFactory Software
DreamFactory Software is the fastest way to build secure, internal REST APIs. Instantly generate APIs from any database with built-in enterprise security controls that operate on-premises, air-gapped, or in the cloud. Develop 4x faster, save 70% on new projects, remove project management uncertainty, focus talent on truly critical issues, win more clients, and integrate with newer & legacy technologies instantly as needed. DreamFactory is the easiest and fastest way to automatically generate, publish, manage, and secure REST APIs, convert SOAP to REST, and aggregate disparate data sources through a single API platform. See why companies like Disney, Bosch, Netgear, T-Mobile, Intel, and many more are embracing DreamFactory's innovative platform to get a competitive edge. Start a hosted trial or talk to our engineers to get access to an on-prem environment! Starting Price: $1500/month -
24
Toucan
Toucan
Toucan is a customer-facing analytics platform that empowers organizations to drive engagement with the best end-user experience. From data connections to the distribution of insights anywhere they're needed, Toucan makes it easy. As a result, Toucan analytics are used 3x more than the industry average. Users can connect to any data, cloud-based or other, streaming or stored, with hundreds of connectors. Preparation of data is equally simple with data readiness features that let business people perform tasks that would ordinarily require an expert. Visualization takes the form of “data storytelling” where every chart is accompanied by context, collaboration, and annotation so that users understand the “why” and not just the “what” of their data. Finally, deployment and management are made easy with one-touch deployment from staging to production, easy embedding, and publishing to any device. -
25
Bacula Enterprise
Bacula Systems
Bacula Enterprise delivers Physical, Virtual, Container and Hybrid Cloud Backup & Recovery software for the Modern Data Center - all from a single platform. Designed for medium and large organizations, Bacula Enterprise backup and recovery software brings unique innovation, modern architecture, business value benefits and low cost of ownership. The Bacula Enterprise corporate data backup software solution uses exclusive technologies that increase the interoperability, power, flexibility and functionality of Bacula Enterprise in a wide range of IT environments such as enterprise data centers, managed service providers, software vendors or cloud providers. Thousands of organizations worldwide use Bacula Enterprise in mission-critical environments, including NASA, Texas A&M University, Unicredit, Swisscom, Sky, and many more. Bacula provides additional security features over other vendors and offers advanced, hybrid Cloud connectivity to Amazon S3, Google, Oracle and many more. -
26
IBM StreamSets
IBM
IBM® StreamSets enables users to create and manage smart streaming data pipelines through an intuitive graphical interface, facilitating seamless data integration across hybrid and multicloud environments. This is why leading global companies rely on IBM StreamSets to support millions of data pipelines for modern analytics, intelligent applications and hybrid integration. Decrease data staleness and enable real-time data at scale, handling millions of records of data across thousands of pipelines within seconds. Insulate data pipelines from change and unexpected shifts with drag-and-drop, prebuilt processors designed to automatically identify and adapt to data drift. Create streaming pipelines to ingest structured, semistructured or unstructured data and deliver it to a wide range of destinations. Starting Price: $1000 per month -
27
Prometheus
Prometheus
Power your metrics and alerting with a leading open-source monitoring solution. Prometheus fundamentally stores all data as time series: streams of timestamped values belonging to the same metric and the same set of labeled dimensions. Besides stored time series, Prometheus may generate temporary derived time series as the result of queries. Prometheus provides a functional query language called PromQL (Prometheus Query Language) that lets the user select and aggregate time series data in real time. The result of an expression can either be shown as a graph, viewed as tabular data in Prometheus's expression browser, or consumed by external systems via the HTTP API. Prometheus is configured via command-line flags and a configuration file. The command-line flags configure immutable system parameters (such as storage locations, amount of data to keep on disk and in memory, etc.). Download: https://sourceforge.net/projects/prometheus.mirror/ Starting Price: Free -
28
Enterprise Recon
Ground Labs
With Enterprise Recon by Ground Labs, organizations can find and remediate sensitive information across the broadest range of structured and unstructured data, whether it’s stored on your servers, on your employees’ devices, or in the cloud. Enterprise Recon enables organizations worldwide to seamlessly discover all data and comply with 50+ country regulations including GDPR, PCI DSS, CCPA, HIPAA, Australian Privacy and other data security standards that require the ability to locate and secure PII data as well as information on gender, ethnicity and health… or even non-PII financial data. Enterprise Recon is powered by GLASS™, Ground Labs' proprietary technology that enables the quickest and most accurate data discovery across the broadest set of platforms available. Enterprise Recon natively supports sensitive data discovery on Windows, macOS, Linux, FreeBSD, Solaris, HP-UX and IBM AIX using agent and agentless options. Additional remote options also enable almost any network data stored. -
29
IRI DMaaS
IRI, The CoSort Company
Data may be the most important asset, and risk, that your company holds. It describes customers, products, transaction histories, and everything else that you use and plan in business. This data can be in databases, files, spreadsheets, Hadoop, cloud platforms or apps. If you don't have the time or expertise to find and de-identify the personally identifiable information (PII) in those sources yourself, IRI Data Masking as a Service (DMaaS) can help. With IRI DMaaS, you can minimize risk and cost because you only pay for the data you need protected. IRI can do all, or some, of the work to classify, find, and mask that data. IRI can also provide your auditors with the logs and targets that verify that your sensitive data was protected and now complies with privacy laws. To facilitate the service, you can transfer unprotected data to a secure on-premise or cloud-based staging area, or provide IRI with remote, supervised access to the data source(s) at issue under a strict NDA. Starting Price: $1000 per day -
30
Quobyte
Quobyte
With Quobyte’s high-performance file and object storage you have the freedom to deploy anywhere (any server, any cloud), scale performance and manage large amounts of data while simplifying administration. Quobyte was designed with one goal in mind, to make your life easier. That’s why we make storage simple with a straightforward download and install (no tedious or complex configuration, no kernel modules), allowing ease of management. The ability to deploy anywhere means that you get to choose where you install your software storage solution. Whether it’s on new or existing hardware, in the cloud, or a combination of the two, Quobyte lets you pick what works best for your needs. From software updates to adding or removing nodes, everything is 100% non-disruptive in Quobyte. That way you can work whenever it’s convenient for you. So say goodbye to maintenance windows and say hello to free nights and weekends. Starting Price: $8,999 per year -
31
Hostmaster
Hostmaster
First-class reliable web hosting at affordable prices. Experience our speedy, robust servers, our feature-filled packages and our helpful support team 24/7, 365 days a year, all at a price you'll never believe! Host your personal or business website on our robust servers with our feature-packed shared hosting plans. Run your very own web hosting business with our all-inclusive reseller web hosting plans. Feel the benefit of our powerful servers, redundant network and our professional management team, keeping your data secure. All accounts are remotely backed up, every day. Manage every aspect of your client's web hosting experience with ease using cPanel's intuitive WebHostManager. Install advanced web scripts with the click of a button. Design a professional website in minutes, with 100+ fully customizable templates and our SiteBuilder. Our professional support team is available around the clock, every day of the year. Starting Price: $4.95 per month -
32
IBM Analytics Engine provides an architecture for Hadoop clusters that decouples the compute and storage tiers. Instead of a permanent cluster formed of dual-purpose nodes, the Analytics Engine allows users to store data in an object storage layer such as IBM Cloud Object Storage and spins up clusters of computing nodes when needed. Separating compute from storage helps to transform the flexibility, scalability and maintainability of big data analytics platforms. Build on an ODPi-compliant stack, with pioneering data science tools and the broader Apache Hadoop and Apache Spark ecosystem. Define clusters based on your application's requirements. Choose the appropriate software pack, version, and size of the cluster. Use it as long as required and delete it as soon as an application finishes its jobs. Configure clusters with third-party analytics libraries and packages. Deploy workloads from IBM Cloud services like machine learning. Starting Price: $0.014 per hour
-
33
Elastic Observability
Elastic
Rely on the most widely deployed observability platform available, built on the proven Elastic Stack (also known as the ELK Stack) to converge silos, delivering unified visibility and actionable insights. To effectively monitor and gain insights across your distributed systems, you need to have all your observability data in one stack. Break down silos by bringing together the application, infrastructure, and user data into a unified solution for end-to-end observability and alerting. Combine limitless telemetry data collection and search-powered problem resolution in a unified solution for optimal operational and business results. Converge data silos by ingesting all your telemetry data (metrics, logs, and traces) from any source in an open, extensible, and scalable platform. Accelerate problem resolution with automatic anomaly detection powered by machine learning and rich data analytics. Starting Price: $16 per month -
34
Dataplane
Dataplane
The concept behind Dataplane is to make it quicker and easier to construct a data mesh with robust data pipelines and automated workflows for businesses and teams of all sizes. In addition to being more user friendly, there has been an emphasis on scaling, resilience, performance and security. Starting Price: Free -
35
Normalyze
Normalyze
Our agentless data discovery and scanning platform is easy to connect to any cloud account (AWS, Azure and GCP). There is nothing for you to deploy or manage. We support all native cloud data stores, structured or unstructured, across all three clouds. Normalyze scans both structured and unstructured data within your cloud accounts and only collects metadata to add to the Normalyze graph. No sensitive data is collected at any point during scanning. Display a graph of access and trust relationships that includes deep context with fine-grained process names, data store fingerprints, IAM roles and policies in real time. Quickly locate all data stores containing sensitive data, find all access paths, and score potential breach paths based on sensitivity, volume, and permissions to show all breaches waiting to happen. Categorize and identify sensitive data based on industry profiles such as PCI, HIPAA, GDPR, etc. Starting Price: $14,995 per year -
36
Superblocks
Superblocks
Superblocks is a programmable IDE for developers to build any internal app, workflow, or scheduled job at a fraction of the time and cost. Ship next month's roadmap this week. Quickly build apps, workflows & jobs connected to your data. Secure with granular permissions (RBAC), SSO, audit logs, and secret management in seconds. Deploy with Git and monitor production. Extend anything with code. No need to learn React, HTML, or CSS. Drag and drop components, connect them to data and make your app dynamic by triggering APIs. Build custom KYC, Compliance, AML, and credit approval tools to enable robust support processes to increase the velocity of your support team. Stop wrestling with CLIs. Quickly build admin panels for your datastores to read, write, and update your customer data with tables, forms, and charts. Clark is an AI-powered app builder that helps teams quickly create secure internal enterprise applications using their own design systems, permissions, and integrations. Starting Price: $0 per month -
37
Dialogic OnDemand Voicemail
Dialogic
Dialogic OnDemand Voicemail is all software and can run in virtualized environments, allowing you to share resources and reduce service delivery costs. It minimizes the number of mailboxes needed by creating temporary resources that can be shared across subscribers while maintaining the same privacy and security standards as permanent mailboxes. Legacy platforms are also expensive to maintain and need extra space and power. Upgrading to a fully virtualized, on-demand platform will lower your operational costs without compromising service. And with an easy-to-use interface that is designed to enhance your subscribers’ self-service abilities, your customer care costs will be reduced too. Enable dynamic and temporary voice mailboxes. Assign a mailbox to a customer only when needed. Reduce the number of voice mailboxes and their cost. Access anywhere and on any device. Give your voicemail service a new look by making it visual, and give customers the latest features at the same time.
Starting Price: Free -
38
muCommander
muCommander
muCommander is an open-source, lightweight, dual-pane file manager available on all major operating systems. Copy, move, rename, batch rename, and email files. Multiple tabs and universal bookmarks. Credentials manager. Configurable keyboard shortcuts. Cloud storage: Dropbox and Google Drive. Virtual filesystem with support for local volumes, FTP, SFTP, SMB, NFS, HTTP, Amazon S3, Hadoop HDFS, and Bonjour. Archives: ZIP, RAR, 7z, TAR, GZip, BZip2, ISO/NRG, AR/Deb, LST. Checksum calculation. Fully customizable user interface, configurable toolbars, and themes. Available in many languages. Java 11 or later is required to run muCommander. Report bugs, suggest new features, answer questions, write documentation, create video tutorials or translate the user interface. To start Open Office, open the document "natively" (mapped to Shift+Enter by default).
Starting Price: Free -
39
ELCA Smart Data Lake Builder
ELCA Group
Classical Data Lakes are often reduced to basic but cheap raw data storage, neglecting significant aspects like transformation, data quality and security. These topics are left to data scientists, who end up spending up to 80% of their time acquiring, understanding and cleaning data before they can start using their core competencies. In addition, classical Data Lakes are often implemented by separate departments using different standards and tools, which makes it harder to implement comprehensive analytical use cases. Smart Data Lakes solve these various issues by providing architectural and methodical guidelines, together with an efficient tool to build a strong high-quality data foundation. Smart Data Lakes are at the core of any modern analytics platform. Their structure easily integrates prevalent Data Science tools and open source technologies, as well as AI and ML. Their storage is cheap and scalable, supporting both unstructured data and complex data structures.
Starting Price: Free -
40
Akira AI
Akira AI
Akira.ai provides businesses with Agentic AI, a set of specialized AI agents designed to optimize and automate complex workflows across various industries. These AI agents collaborate with human teams, enhancing productivity, making real-time decisions, and automating repetitive tasks, such as data analysis, incident management, and HR processes. The platform integrates smoothly with existing systems, including CRMs and ERPs, ensuring a disruption-free transition to AI-enhanced operations. Akira’s AI agents help businesses streamline their operations, increase decision-making speed, and boost overall efficiency, driving innovation across sectors like manufacturing, finance, and IT.
Starting Price: $15 per month -
41
Wherobots
Wherobots
Wherobots, the Spatial Intelligence Cloud, enables any data team to analyze data about the physical world faster, at greater scale, and at lower cost compared to traditional solutions. Built by the creators of Apache Sedona, it's a compute lakehouse engine that unifies spatial and non-spatial data, automates data workflows, and runs AI on planetary scale imagery. Spatial data refers to information about places, objects, or activities. Examples include GPS points and tracks, routes, land, road, parcel, crop, and building data, as well as imagery from drones and satellites. This data is fundamental to various industries including aerospace, mobility, ag-tech, insurance, energy, telecommunications, retail, and logistics. In one solution, Wherobots handles these diverse spatial data types and formats, with customers seeing production workloads run up to 20x faster and at lower cost than popular lakehouse engines. -
42
Scalytics Connect
Scalytics
Scalytics Connect enables AI and ML to process and analyze data, and makes it easier and more secure to use different data processing platforms at the same time. Built by the inventors of Apache Wayang, Scalytics Connect is the most enhanced data management platform, reducing the complexity of ETL data pipelines dramatically. Scalytics Connect is a data management and ETL platform that helps organizations unlock the power of their data, regardless of where it resides. It empowers businesses to break down data silos, simplify access, and gain valuable insights through a variety of features, including:
- AI-powered ETL: Automates tasks like data extraction, transformation, and loading, freeing up your resources for more strategic work.
- Unified Data Landscape: Breaks down data silos and provides a holistic view of all your data, regardless of its location or format.
- Effortless Scaling: Handles growing data volumes with ease, so you never get bottlenecked by information overload.
Starting Price: $0 -
43
Indexima Data Hub
Indexima
Reshape your perception of time in data analytics. Instantly access your business’ data and work directly on your dashboard without going back and forth with the IT team. Meet Indexima DataHub, a new space-time where operational and functional users gain instant access to their data. With a combination of its unique indexing engine and machine learning, Indexima allows businesses to access all their data to simplify and speed up analytics. Robust and scalable, the solution allows organizations to query all their data directly at the source, in volumes of tens of billions of rows in just a few milliseconds. Our Indexima platform allows users to implement instant analytics on all their data in just one click. Thanks to Indexima’s new ROI and TCO calculator, find out in 30 seconds the ROI of your data platform. Reduce infrastructure costs, project deployment time, and data engineering costs while boosting your analytical performance.
Starting Price: $3,290 per month -
44
Yandex Data Proc
Yandex
You select the size of the cluster, node capacity, and a set of services, and Yandex Data Proc automatically creates and configures Spark and Hadoop clusters and other components. Collaborate by using Zeppelin notebooks and other web apps via a UI proxy. You get full control of your cluster with root permissions for each VM. Install your own applications and libraries on running clusters without having to restart them. Yandex Data Proc uses instance groups to automatically increase or decrease computing resources of compute subclusters based on CPU usage indicators. Data Proc allows you to create managed Hive clusters, which can reduce the probability of failures and losses caused by metadata unavailability. Save time on building ETL pipelines and pipelines for training and developing models, as well as describing other iterative tasks. The Data Proc operator is already built into Apache Airflow.
Starting Price: $0.19 per hour -
45
Apache Impala
Apache
Impala provides low latency and high concurrency for BI/analytic queries on the Hadoop ecosystem, including Iceberg, open data formats, and most cloud storage options. Impala also scales linearly, even in multitenant environments. Impala is integrated with native Hadoop security and Kerberos for authentication, and via the Ranger module, you can ensure that the right users and applications are authorized for the right data. Utilize the same file and data formats and metadata, security, and resource management frameworks as your Hadoop deployment, with no redundant infrastructure or data conversion/duplication. For Apache Hive users, Impala utilizes the same metadata and ODBC driver. Like Hive, Impala supports SQL, so you don't have to worry about reinventing the implementation wheel. With Impala, more users, whether using SQL queries or BI applications, can interact with more data through a single repository and metadata stored from source through analysis.
Starting Price: Free -
46
Apache Phoenix
Apache Software Foundation
Apache Phoenix enables OLTP and operational analytics in Hadoop for low-latency applications by combining the best of both worlds: the power of standard SQL and JDBC APIs with full ACID transaction capabilities, and the flexibility of late-bound, schema-on-read capabilities from the NoSQL world, by leveraging HBase as its backing store. Apache Phoenix is fully integrated with other Hadoop products such as Spark, Hive, Pig, Flume, and MapReduce. Become the trusted data platform for OLTP and operational analytics for Hadoop through well-defined, industry-standard APIs. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.
Starting Price: Free -
47
Inferyx
Inferyx
Move past application silos, cost overruns, and skill obsolescence to scale faster with our intelligent data and analytics platform. An intelligent platform built to perform data management and advanced analytics, it helps you scale across the technology landscape. Our architecture understands how data flows and transforms throughout its lifecycle, enabling the development of future-proof enterprise AI applications. A highly modular and extensible platform that enables the handling of multifold components, it is designed to scale with a multi-tenant architecture. Analyzing complex data structures is made easy using advanced data visualization, resulting in enhanced enterprise AI app development in an intuitive and low-code predictive platform. Our unique hybrid multi-cloud platform is built using open source community software, which makes it immensely adaptive, highly secure, and essentially low-cost.
Starting Price: Free -
48
Apache Trafodion
Apache Software Foundation
Apache Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or operational workloads on Apache Hadoop. Trafodion builds on the scalability, elasticity, and flexibility of Hadoop. Trafodion extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop. Full-functioned ANSI SQL language support. JDBC/ODBC connectivity for Linux/Windows clients. Distributed ACID transaction protection across multiple statements, tables, and rows. Performance improvements for OLTP workloads with compile-time and run-time optimizations. Support for large data sets using a parallel-aware query optimizer. Reuse existing SQL skills and improve developer productivity. Distributed ACID transactions guarantee data consistency across multiple rows and tables. Interoperability with existing tools and applications. Hadoop and Linux distribution neutral. Easy to add to your existing Hadoop infrastructure.
Starting Price: Free -
49
Alteryx
Alteryx
Step into a new era of analytics with the Alteryx AI Platform. Empower your organization with automated data preparation, AI-powered analytics, and approachable machine learning — all with embedded governance and security. Welcome to the future of data-driven decisions for every user, every team, every step of the way. Empower your teams with an easy, intuitive user experience allowing everyone to create analytic solutions that improve productivity, efficiency, and the bottom line. Build an analytics culture with an end-to-end cloud analytics platform and transform data into insights with self-service data prep, machine learning, and AI-generated insights. Reduce risk and ensure your data is fully protected with the latest security standards and certifications. Connect to your data and applications with open API standards. -
50
OpenText Analytics Database is a high-performance, scalable analytics platform that enables organizations to analyze massive data sets quickly and cost-effectively. It supports real-time analytics and in-database machine learning to deliver actionable business insights. The platform can be deployed flexibly across hybrid, multi-cloud, and on-premises environments to optimize infrastructure and reduce total cost of ownership. Its massively parallel processing (MPP) architecture handles complex queries efficiently, regardless of data size. OpenText Analytics Database also features compatibility with data lakehouse architectures, supporting formats like Parquet and ORC. With built-in machine learning and broad language support, it empowers users from SQL experts to Python developers to derive predictive insights.