Alternatives to Amazon Athena
Compare Amazon Athena alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to Amazon Athena in 2025. Compare features, ratings, user reviews, pricing, and more from Amazon Athena competitors and alternatives in order to make an informed decision for your business.
1
Google Cloud BigQuery
Google
BigQuery is a serverless, multicloud data warehouse that simplifies the process of working with all types of data so you can focus on getting valuable business insights quickly. At the core of Google’s data cloud, BigQuery allows you to simplify data integration, cost effectively and securely scale analytics, share rich data experiences with built-in business intelligence, and train and deploy ML models with a simple SQL interface, helping to make your organization’s operations more data-driven. Gemini in BigQuery offers AI-driven tools for assistance and collaboration, such as code suggestions, visual data preparation, and smart recommendations designed to boost efficiency and reduce costs. BigQuery delivers an integrated platform featuring SQL, a notebook, and a natural language-based canvas interface, catering to data professionals with varying coding expertise. This unified workspace streamlines the entire analytics process. -
2
MongoDB Atlas
MongoDB
The most innovative cloud database service on the market, with unmatched data distribution and mobility across AWS, Azure, and Google Cloud, built-in automation for resource and workload optimization, and so much more. MongoDB Atlas is the global cloud database service for modern applications. Deploy fully managed MongoDB across AWS, Google Cloud, and Azure with best-in-class automation and proven practices that guarantee availability, scalability, and compliance with the most demanding data security and privacy standards. The best way to deploy, run, and scale MongoDB in the cloud. MongoDB Atlas offers built-in security controls for all your data. Enable enterprise-grade features to integrate with your existing security protocols and compliance standards. With MongoDB Atlas, your data is protected with preconfigured security features for authentication, authorization, encryption, and more. -
3
StarTree
StarTree
StarTree Cloud is a fully-managed real-time analytics platform designed for OLAP at massive speed and scale for user-facing applications. Powered by Apache Pinot, StarTree Cloud provides enterprise-grade reliability and advanced capabilities such as tiered storage, scalable upserts, plus additional indexes and connectors. It integrates seamlessly with transactional databases and event streaming platforms, ingesting data at millions of events per second and indexing it for lightning-fast query responses. StarTree Cloud is available on your favorite public cloud or for private SaaS deployment.
• Gain critical real-time insights to run your business
• Seamlessly integrate data streaming and batch data
• High performance in throughput and low-latency at petabyte scale
• Fully-managed cloud service
• Tiered storage to optimize cloud performance & spend
• Fully-secure & enterprise-ready
4
Ninox
Ninox Software
Ninox is your solution for organizing and managing complex data in a structured and efficient way. With its highly flexible user interface, you can analyze, process, and evaluate any type of data. Additionally, the Ninox API enables seamless integration with services like Google for enhanced functionality. Designed to work across all platforms, Ninox is available via native apps for macOS, iOS, and Android, as well as through any web browser. The platform empowers users to build custom applications using templates, drag-and-drop formulas, and scripting tools. Its intuitive visual editor simplifies the creation of triggers, fields, and custom forms. With real-time syncing, Ninox ensures a smooth and consistent experience, whether you're working on a single device or switching between multiple devices. -
5
AWS Glue
Amazon
AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months. Data integration is the process of preparing and combining data for analytics, machine learning, and application development. It involves multiple tasks, such as discovering and extracting data from various sources; enriching, cleaning, normalizing, and combining data; and loading and organizing data in databases, data warehouses, and data lakes. These tasks are often handled by different types of users that each use different products. AWS Glue runs in a serverless environment. There is no infrastructure to manage, and AWS Glue provisions, configures, and scales the resources required to run your data integration jobs. -
6
Amazon ElastiCache
Amazon
Amazon ElastiCache allows you to seamlessly set up, run, and scale popular open-source-compatible in-memory data stores in the cloud. Build data-intensive apps or boost the performance of your existing databases by retrieving data from high-throughput, low-latency in-memory data stores. Amazon ElastiCache is a popular choice for real-time use cases like caching, session stores, gaming, geospatial services, real-time analytics, and queuing. It offers fully managed Redis and Memcached, working as an in-memory data store and cache for your most demanding applications that require sub-millisecond response times. By utilizing an end-to-end optimized stack running on customer-dedicated nodes, Amazon ElastiCache provides secure, blazing-fast performance.
7
Amazon RDS
Amazon
Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while automating time-consuming administration tasks such as hardware provisioning, database setup, patching and backups. It frees you to focus on your applications so you can give them the fast performance, high availability, security and compatibility they need. Amazon RDS is available on several database instance types - optimized for memory, performance or I/O - and provides you with six familiar database engines to choose from, including Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server. You can use the AWS Database Migration Service to easily migrate or replicate your existing databases to Amazon RDS.
Starting Price: $0.01 per month
8
Amazon DynamoDB
Amazon
Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It's a fully managed, multi-region, multi-master, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications. DynamoDB can handle more than 10 trillion requests per day and can support peaks of more than 20 million requests per second. Many of the world's fastest-growing businesses such as Lyft, Airbnb, and Redfin as well as enterprises such as Samsung, Toyota, and Capital One depend on the scale and performance of DynamoDB to support their mission-critical workloads. Focus on driving innovation with no operational overhead. Build out your game platform with player data, session history, and leaderboards for millions of concurrent users. Use design patterns for deploying shopping carts, workflow engines, inventory tracking, and customer profiles. DynamoDB supports high-traffic, extreme-scaled events.
9
Snowflake
Snowflake
Snowflake is a comprehensive AI Data Cloud platform designed to eliminate data silos and simplify data architectures, enabling organizations to get more value from their data. The platform offers interoperable storage that provides near-infinite scale and access to diverse data sources, both inside and outside Snowflake. Its elastic compute engine delivers high performance for any number of users, workloads, and data volumes with seamless scalability. Snowflake’s Cortex AI accelerates enterprise AI by providing secure access to leading large language models (LLMs) and data chat services. The platform’s cloud services automate complex resource management, ensuring reliability and cost efficiency. Trusted by over 11,000 global customers across industries, Snowflake helps businesses collaborate on data, build data applications, and maintain a competitive edge.
Starting Price: $2 compute/month
10
Dremio
Dremio
Dremio delivers lightning-fast queries and a self-service semantic layer directly on your data lake storage. No moving data to proprietary data warehouses, no cubes, no aggregation tables or extracts. Just flexibility and control for data architects, and self-service for data consumers. Dremio technologies like Data Reflections, Columnar Cloud Cache (C3) and Predictive Pipelining work alongside Apache Arrow to make queries on your data lake storage very, very fast. An abstraction layer enables IT to apply security and business meaning, while enabling analysts and data scientists to explore data and derive new virtual datasets. Dremio’s semantic layer is an integrated, searchable catalog that indexes all of your metadata, so business users can easily make sense of your data. Virtual datasets and spaces make up the semantic layer, and are all indexed and searchable. -
11
Trino
Trino
Trino is a query engine that runs at ludicrous speed. It is a fast distributed SQL query engine for big data analytics that helps you explore your data universe. Trino is a highly parallel and distributed query engine that is built from the ground up for efficient, low-latency analytics. The largest organizations in the world use Trino to query exabyte-scale data lakes and massive data warehouses alike. It supports diverse use cases: ad-hoc analytics at interactive speeds, massive multi-hour batch queries, and high-volume apps that perform sub-second queries. Trino is an ANSI SQL-compliant query engine that works with BI tools such as R, Tableau, Power BI, Superset, and many others. You can natively query data in Hadoop, S3, Cassandra, MySQL, and many others, without the need for complex, slow, and error-prone processes for copying the data. Access data from multiple systems within a single query.
Starting Price: Free
12
Amazon EMR
Amazon
Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open-source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto. With EMR you can run petabyte-scale analysis at less than half of the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark. For short-running jobs, you can spin up and spin down clusters and pay per second for the instances used. For long-running workloads, you can create highly available clusters that automatically scale to meet demand. If you have existing on-premises deployments of open-source tools such as Apache Spark and Apache Hive, you can also run EMR clusters on AWS Outposts. Analyze data using open-source ML frameworks such as Apache Spark MLlib, TensorFlow, and Apache MXNet. Connect to Amazon SageMaker Studio for large-scale model training, analysis, and reporting.
13
Apache Drill
The Apache Software Foundation
Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage -
14
Amazon QuickSight
Amazon
Amazon QuickSight allows everyone in your organization to understand your data by asking questions in natural language, exploring through interactive dashboards, or automatically looking for patterns and outliers powered by machine learning. QuickSight powers millions of dashboard views weekly for customers such as the NFL, Expedia, Volvo, Thomson Reuters, Best Western and Comcast, allowing their end-users to make better data-driven decisions. Ask conversational questions of your data and use Q’s ML-powered engine to receive relevant visualizations without the time-consuming data preparation from authors and admins. Discover hidden insights from your data, perform accurate forecasting and what-if analysis, or add easy-to-understand natural language narratives to dashboards by leveraging AWS' expertise in machine learning. Easily embed interactive visualizations and dashboards, sophisticated dashboard authoring, or natural language query capabilities in your applications. -
15
SpectX
SpectX
SpectX is a powerful log analyzer for incident investigation and data exploration. It does not ingest or index data but runs queries directly on log files stored in file systems or blob storage. Local log servers, cloud storage, Hadoop clusters, JDBC databases, production servers, Elastic clusters, or anything that speaks HTTP: SpectX turns any text-based log files into structured virtual views. The SpectX query language is inspired by piping in Unix. An extensive library of built-in query functions allows analysts to compose complex queries and get advanced insights. In addition to the browser-based interface, every query can be easily executed via a RESTful API, with advanced options to customize the result set. This makes it easy to integrate SpectX with other applications in need of clean and structured data. SpectX's easy-to-read pattern matching language can flexibly match any data, with no need to read or write regex.
Starting Price: $79/month
16
Amazon Timestream
Amazon
Amazon Timestream is a fast, scalable, and serverless time series database service for IoT and operational applications that makes it easy to store and analyze trillions of events per day up to 1,000 times faster and at as little as 1/10th the cost of relational databases. Amazon Timestream saves you time and cost in managing the lifecycle of time series data by keeping recent data in memory and moving historical data to a cost optimized storage tier based upon user defined policies. Amazon Timestream’s purpose-built query engine lets you access and analyze recent and historical data together, without needing to specify explicitly in the query whether the data resides in the in-memory or cost-optimized tier. Amazon Timestream has built-in time series analytics functions, helping you identify trends and patterns in your data in near real-time. -
17
Tabular
Tabular
Tabular is an open table store from the creators of Apache Iceberg. Connect multiple computing engines and frameworks. Decrease query time and storage costs by up to 50%. Centralize enforcement of data access (RBAC) policies. Connect any query engine or framework, including Athena, BigQuery, Redshift, Snowflake, Databricks, Trino, Spark, and Python. Smart compaction, clustering, and other automated data services reduce storage costs and query times by up to 50%. Unify data access at the database or table. RBAC controls are simple to manage, consistently enforced, and easy to audit. Centralize your security down to the table. Tabular is easy to use plus it features high-powered ingestion, performance, and RBAC under the hood. Tabular gives you the flexibility to work with multiple “best of breed” compute engines based on their strengths. Assign privileges at the data warehouse database, table, or column level.
Starting Price: $100 per month
18
Apache Impala
Apache
Impala provides low latency and high concurrency for BI/analytic queries on the Hadoop ecosystem, including Iceberg, open data formats, and most cloud storage options. Impala also scales linearly, even in multitenant environments. Impala is integrated with native Hadoop security and Kerberos for authentication, and via the Ranger module, you can ensure that the right users and applications are authorized for the right data. Utilize the same file and data formats and metadata, security, and resource management frameworks as your Hadoop deployment, with no redundant infrastructure or data conversion/duplication. For Apache Hive users, Impala utilizes the same metadata and ODBC driver. Like Hive, Impala supports SQL, so you don't have to worry about reinventing the implementation wheel. With Impala, more users, whether using SQL queries or BI applications, can interact with more data through a single repository and metadata stored from source through analysis.
Starting Price: Free
19
DuckDB
DuckDB
DuckDB is a relational database management system (RDBMS). That means it is a system for managing data stored in relations. A relation is essentially a mathematical term for a table. Each table is a named collection of rows. Each row of a given table has the same set of named columns, and each column is of a specific data type. Tables themselves are stored inside schemas, and a collection of schemas constitutes the entire database that you can access. DuckDB is suited to processing and storing tabular datasets, e.g. from CSV or Parquet files, and to large result set transfer to the client. It is not designed for large client/server installations for centralized enterprise data warehousing, or for writing to a single database from multiple concurrent processes.
20
Amazon SimpleDB
Amazon
Amazon SimpleDB is a highly available NoSQL data store that offloads the work of database administration. Developers simply store and query data items via web services requests and Amazon SimpleDB does the rest. Unbound by the strict requirements of a relational database, Amazon SimpleDB is optimized to provide high availability and flexibility, with little or no administrative burden. Behind the scenes, Amazon SimpleDB creates and manages multiple geographically distributed replicas of your data automatically to enable high availability and data durability. The service charges you only for the resources actually consumed in storing your data and serving your requests. You can change your data model on the fly, and data is automatically indexed for you. With Amazon SimpleDB, you can focus on application development without worrying about infrastructure provisioning, high availability, software maintenance, schema and index management, or performance tuning. -
21
SSuite MonoBase Database
SSuite Office Software
Create relational or flat file databases with unlimited tables, fields, and rows. Includes a custom report builder. Interface with ODBC compatible databases and create custom reports for them. Create your own personal and custom databases.
Some Highlights:
- Filter tables instantly
- Ultra simple graphical-user-interface
- One click table and data form creation
- Open up to 5 databases simultaneously
- Export your data to comma separated files
- Create custom reports for all your databases
- Full helpfile to assist in creating database reports
- Print tables and queries directly from the data grid
- Supports any SQL standard that your ODBC compatible database requires
Please install and run this database application with full administrator rights for best performance and user experience.
Requires:
- 1024x768 Display Size
- Windows 98 / XP / 7 / 8 / 10 - 32bit and 64bit
No Java or DotNet required. Green Energy Software. Saving the planet one bit at a time...
Starting Price: Free
22
Starburst Enterprise
Starburst Data
Starburst helps you make better decisions with fast access to all your data, without the complexity of data movement and copies. Your company has more data than ever before, but your data teams are stuck waiting to analyze it. Starburst unlocks access to data where it lives, no data movement required, giving your teams fast & accurate access to more data for analysis. Starburst Enterprise is a fully supported, production-tested and enterprise-grade distribution of open source Trino (formerly Presto® SQL). It improves performance and security while making it easy to deploy, connect, and manage your Trino environment. By connecting to any source of data – whether it’s located on-premises, in the cloud, or across a hybrid cloud environment – Starburst lets your team use the analytics tools they already know & love while accessing data that lives anywhere.
23
Amazon DocumentDB
Amazon
Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. As a document database, Amazon DocumentDB makes it easy to store, query, and index JSON data. Amazon DocumentDB is a non-relational database service designed from the ground-up to give you the performance, scalability, and availability you need when operating mission-critical MongoDB workloads at scale. In Amazon DocumentDB, the storage and compute are decoupled, allowing each to scale independently, and you can increase the read capacity to millions of requests per second by adding up to 15 low latency read replicas in minutes, regardless of the size of your data. Amazon DocumentDB is designed for 99.99% availability and replicates six copies of your data across three AWS Availability Zones (AZs). -
24
ClickHouse
ClickHouse
ClickHouse is a fast open-source OLAP database management system. It is column-oriented and lets you generate analytical reports using SQL queries in real time. ClickHouse's performance exceeds comparable column-oriented database management systems currently available on the market. It processes hundreds of millions to more than a billion rows and tens of gigabytes of data per single server per second. ClickHouse uses all available hardware to its full potential to process each query as fast as possible. Peak processing performance for a single query stands at more than 2 terabytes per second (after decompression, only used columns). In a distributed setup, reads are automatically balanced among healthy replicas to avoid increasing latency. ClickHouse supports multi-master asynchronous replication and can be deployed across multiple datacenters. All nodes are equal, which avoids single points of failure.
25
QuasarDB
QuasarDB
Quasar's brain is QuasarDB, a high-performance, distributed, column-oriented timeseries database management system designed from the ground up to deliver real-time on petascale use cases. Up to 20X less disk usage. QuasarDB's ingestion and compression capabilities are unmatched. Up to 10,000X faster feature extraction. QuasarDB can extract features in real time from the raw data, thanks to the combination of a built-in map/reduce query engine, an aggregation engine that leverages SIMD from modern CPUs, and stochastic indexes that use virtually no disk space. The most cost-effective timeseries solution, thanks to its ultra-efficient resource usage, the capability to leverage object storage (S3), unique compression technology, and fair pricing model. Quasar runs everywhere, from 32-bit ARM devices to high-end Intel servers, from Edge Computing to the cloud or on-premises.
26
ksqlDB
Confluent
Now that your data is in motion, it’s time to make sense of it. Stream processing enables you to derive instant insights from your data streams, but setting up the infrastructure to support it can be complex. That’s why Confluent developed ksqlDB, the database purpose-built for stream processing applications. Make your data immediately actionable by continuously processing streams of data generated throughout your business. ksqlDB’s intuitive syntax lets you quickly access and augment data in Kafka, enabling development teams to seamlessly create real-time innovative customer experiences and fulfill data-driven operational needs. ksqlDB offers a single solution for collecting streams of data, enriching them, and serving queries on new derived streams and tables. That means less infrastructure to deploy, maintain, scale, and secure. With fewer moving parts in your data architecture, you can focus on what really matters: innovation.
27
CockroachDB
Cockroach Labs
CockroachDB: Cloud-native, distributed SQL. Your cloud applications deserve a cloud-native database. Cloud-based apps and services deserve a database that scales across clouds, eases operational complexity, and improves reliability. CockroachDB delivers resilient, distributed SQL with ACID transactions and data partitioned by location. Automate operations for mission-critical applications by pairing CockroachDB with orchestration tools like Kubernetes and Mesosphere DC/OS. Every node can service both reads and writes so that you can scale query throughput and database capacity by simply adding more endpoints. Just add new nodes to CockroachDB, and it automatically rebalances data, completely removing the pain of manual sharding. As demand shifts, CockroachDB detects hotspots and intelligently distributes data to maintain performance. Tune your database at the row level so that data lives close to your users and you can minimize query latency. -
28
Fauna
Fauna
Fauna is a data API for modern applications that facilitates rich clients with serverless backends. It provides a web-native interface with support for GraphQL and custom business logic, frictionless integration with the serverless ecosystem, a no-compromise multi-cloud architecture you can trust and grow with, and total freedom from database operations. Instantly create multiple databases in one account, leveraging multi-tenancy for development or customer-facing use cases. Create a distributed database across one geography or the globe in just three clicks and easily import existing data. Scale seamlessly without ever managing servers, clusters, data partitioning, or replication. Track usage and consumption-based billing in near real time via a dashboard.
Starting Price: Free
29
ArangoDB
ArangoDB
Natively store data for graph, document and search needs. Utilize feature-rich access with one query language. Map data natively to the database and access it with the best patterns for the job – traversals, joins, search, ranking, geospatial, aggregations – you name it. Polyglot persistence without the costs. Easily design, scale and adapt your architectures to changing needs and with much less effort. Combine the flexibility of JSON with semantic search and graph technology for next generation feature extraction even for large datasets. -
30
DoubleCloud
DoubleCloud
Save time & costs by streamlining data pipelines with zero-maintenance open source solutions. From ingestion to visualization, all are integrated, fully managed, and highly reliable, so your engineers will love working with data. You choose whether to use any of DoubleCloud’s managed open source services or leverage the full power of the platform, including data storage, orchestration, ELT, and real-time visualization. We provide leading open source services like ClickHouse, Kafka, and Airflow, with deployment on Amazon Web Services or Google Cloud. Our no-code ELT tool allows real-time data syncing between systems, fast, serverless, and seamlessly integrated with your existing infrastructure. With our managed open-source data visualization you can simply visualize your data in real time by building charts and dashboards. We’ve designed our platform to make the day-to-day life of engineers more convenient.
Starting Price: $0.024 per 1 GB per month
31
Apache Spark
Apache Software Foundation
Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine. Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use it interactively from the Scala, Python, R, and SQL shells. Spark powers a stack of libraries including SQL and DataFrames, MLlib for machine learning, GraphX, and Spark Streaming. You can combine these libraries seamlessly in the same application. Spark runs on Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud. It can access diverse data sources. You can run Spark using its standalone cluster mode, on EC2, on Hadoop YARN, on Mesos, or on Kubernetes. Access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources. -
32
TiDB Cloud
PingCAP
A cloud-native distributed HTAP database built for elastic scaling and real-time analytics in a fully managed service, with a serverless tier that lets you launch an HTAP database in seconds. Elastically and transparently scale to hundreds of nodes for critical workloads without changing business logic. Use what you know about SQL, and maintain your relational model and global ACID transactions while handling hybrid workloads with ease. Equipped with a built-in high-performance analytics engine to analyze operational data without using ETL. Scale out to hundreds of nodes while maintaining ACID transactions. No need to bother with sharding or face downtime. Ensure data accuracy at scale, even for simultaneous updates to the same data source. Increase productivity and shorten time-to-market for your applications with TiDB’s MySQL compatibility. Easily migrate data from existing MySQL instances without the need to rewrite code.
Starting Price: $0.95 per hour
33
Google Cloud Bigtable
Google
Google Cloud Bigtable is a fully managed, scalable NoSQL database service for large analytical and operational workloads. Fast and performant: Use Cloud Bigtable as the storage engine that grows with you from your first gigabyte to petabyte-scale for low-latency applications as well as high-throughput data processing and analytics. Seamless scaling and replication: Start with a single node per cluster, and seamlessly scale to hundreds of nodes dynamically supporting peak demand. Replication also adds high availability and workload isolation for live serving apps. Simple and integrated: Fully managed service that integrates easily with big data tools like Hadoop, Dataflow, and Dataproc. Plus, support for the open source HBase API standard makes it easy for development teams to get started. -
34
Convex
Convex
Convex is an open source, reactive backend platform that enables developers to build full-stack applications entirely in TypeScript. It offers a document-relational database where queries and mutations are written in TypeScript, ensuring end-to-end type safety and seamless integration with frontend code. Convex's libraries maintain real-time synchronization between the frontend, backend, and database state without the need for manual state management, cache invalidation, or WebSockets. It includes built-in support for cloud functions, scheduling, authentication, file storage, and a variety of components that can be added with a simple npm i command. Developers can define their entire backend, including database schemas, queries, and APIs, in code, which is typechecked and autocompleted, and can be generated by AI with high accuracy. Convex's architecture ensures that all transactions are serializable, providing strong consistency guarantees and eliminating race conditions.
Starting Price: $25 per month
35
Amazon Aurora
Amazon
Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud that combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open source databases. Amazon Aurora is up to five times faster than standard MySQL databases and three times faster than standard PostgreSQL databases. It provides the security, availability, and reliability of commercial databases at 1/10th the cost. Amazon Aurora is fully managed by Amazon Relational Database Service (RDS), which automates time-consuming administration tasks like hardware provisioning, database setup, patching, and backups. Amazon Aurora features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 64TB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across three Availability Zones.
Starting Price: $0.02 per month
36
Apache Hive
Apache Software Foundation
The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive. Apache Hive is an open source project run by volunteers at the Apache Software Foundation. Previously it was a subproject of Apache® Hadoop®, but has now graduated to become a top-level project of its own. We encourage you to learn about the project and contribute your expertise. Traditional SQL queries must be implemented in the MapReduce Java API to execute SQL applications and queries over distributed data. Hive provides the necessary SQL abstraction to integrate SQL-like queries (HiveQL) into the underlying Java without the need to implement queries in the low-level Java API. -
37
IBM Db2 Big SQL
IBM
A hybrid SQL-on-Hadoop engine delivering advanced, security-rich data query across enterprise big data sources, including Hadoop, object storage and data warehouses. IBM Db2 Big SQL is an enterprise-grade, hybrid ANSI-compliant SQL-on-Hadoop engine, delivering massively parallel processing (MPP) and advanced data query. Db2 Big SQL offers a single database connection or query for disparate sources such as Hadoop HDFS and WebHDFS, RDBMS, NoSQL databases, and object stores. Benefit from low latency, high performance, data security, SQL compatibility, and federation capabilities to do ad hoc and complex queries. Db2 Big SQL is now available in two variations. It can be integrated with Cloudera Data Platform, or accessed as a cloud-native service on the IBM Cloud Pak® for Data platform. Access and analyze data and perform queries on batch and real-time data across sources, like Hadoop, object stores and data warehouses. -
38
VeloDB
VeloDB
Powered by Apache Doris, VeloDB is a modern data warehouse for lightning-fast analytics on real-time data at scale. Push-based micro-batch and pull-based streaming data ingestion within seconds. Storage engine with real-time upsert, append, and pre-aggregation. Unparalleled performance in both real-time data serving and interactive ad-hoc queries. Not just structured but also semi-structured data. Not just real-time analytics but also batch processing. It not only runs queries against internal data but also works as a federated query engine to access external data lakes and databases. Distributed design to support linear scalability. Whether deployed on-premises or as a cloud service, with storage and compute separated or integrated, resource usage can be flexibly and efficiently adjusted according to workload requirements. Built on and fully compatible with open source Apache Doris. Supports the MySQL protocol, functions, and SQL for easy integration with other data tools. -
39
Baidu Palo
Baidu AI Cloud
Palo helps enterprises create a PB-level MPP data warehouse service within minutes and import massive data from RDS, BOS, and BMR, so that Palo can perform multi-dimensional analytics on big data. Palo is compatible with mainstream BI tools. Data analysts can analyze and display the data visually and gain insights quickly to assist decision-making. It has an industry-leading MPP query engine, with column storage, intelligent index, and vector execution functions. It can also provide in-library analytics, window functions, and other advanced analytics functions. You can create a materialized view and change the table structure without suspending the service. It supports flexible and efficient data recovery. -
40
PySpark
PySpark
PySpark is an interface for Apache Spark in Python. It not only allows you to write Spark applications using Python APIs, but also provides the PySpark shell for interactively analyzing your data in a distributed environment. PySpark supports most of Spark’s features such as Spark SQL, DataFrame, Streaming, MLlib (Machine Learning) and Spark Core. Spark SQL is a Spark module for structured data processing. It provides a programming abstraction called DataFrame and can also act as a distributed SQL query engine. Running on top of Spark, the streaming feature in Apache Spark enables powerful interactive and analytical applications across both streaming and historical data, while inheriting Spark’s ease of use and fault tolerance characteristics. -
41
AIS labPortal
Analytical Information Systems
Perhaps you want to give your clients access to their LIMS data and reports via the web. AIS labPortal allows you to do just that. Paper copies of sample analyses needn’t be sent out in the post to customers. Using their unique login and security password, clients can access data from their computer, which is not only safer and less time-consuming but also more environmentally friendly. labPortal is a web-based portal that securely stores your clients’ sample information and data in the cloud, allowing them to easily access it instantly from their own desktop, tablet or phone. The labPortal interface is 'inbox' style which is simple and easy to use with an enhanced query engine, conditional highlighting and Microsoft Excel export. The software features a simple and easy-to-use sample registration form which allows users to pre-register samples online. Transcribing data is a time-consuming and tedious activity.
Starting Price: $200 per month -
42
ScyllaDB
ScyllaDB
ScyllaDB is the database for data-intensive apps that require high performance and low latency. It enables teams to harness the ever-increasing computing power of modern infrastructures – eliminating barriers to scale as data grows. Unlike any other database, ScyllaDB is a distributed NoSQL database fully compatible with Apache Cassandra and Amazon DynamoDB, yet is built with deep architectural advancements that enable exceptional end-user experiences at radically lower costs. Over 400 game-changing companies like Disney+ Hotstar, Expedia, FireEye, Discord, Zillow, Starbucks, Comcast, and Samsung use ScyllaDB for their toughest database challenges. ScyllaDB is available as free open source software, a fully-supported enterprise product, and a fully managed database-as-a-service (DBaaS) on multiple cloud providers. -
43
IBM Db2
IBM
IBM Db2 is a family of data management products, including the Db2 relational database. The products feature AI-powered capabilities to help you modernize the management of both structured and unstructured data across on-premises and multicloud environments. By helping to make your data simple and accessible, the Db2 family positions your business to pursue the value of AI. Most of the Db2 family is available on the IBM Cloud Pak® for Data platform, either as an add-on or an included data source service, making virtually all of your data available across hybrid or multicloud environments to fuel your AI applications. Easily converge your transactional data stores and rapidly derive insights through universal, intelligent querying of data across disparate sources. Cut costs with the multimodel capability that eliminates the need for data replication and migration. Enhance agility by running Db2 on any cloud vendor. -
44
ClusterEngine
Aqua Networks
Monitor all resources of any Linux and Windows-based Cloud or Dedicated server. Monitor any URL, including SSL expiry dates, and get alerts if something happens. Back up your servers to Amazon S3 or local storage, all via CloudStats. View exactly which processes are consuming resources on your server. Add your System Administrators and Co-Founders with correct permissions to be able to view the stats. Configure Alerts to suit your needs and send them to the correct person on your team. CloudStats is a Website and Server Monitoring platform capable of monitoring Linux and Windows-based servers. CloudStats works by installing an Agent on your server which collects and sends data to the monitoring platform every minute. The Agent uses a secure SSL connection to send all data safely to the monitoring system. You need to open Ports 443 and 80 for agent connections. Port 443 is used for data transmission and port 80 is used for Pings and Keepalive requests. -
45
Couchbase
Couchbase
Unlike other NoSQL databases, Couchbase provides an enterprise-class, multicloud to edge database that offers the robust capabilities required for business-critical applications on a highly scalable and available platform. As a distributed cloud-native database, Couchbase runs in modern dynamic environments and on any cloud, either customer-managed or fully managed as-a-service. Couchbase is built on open standards, combining the best of NoSQL with the power and familiarity of SQL, to simplify the transition from mainframe and relational databases. Couchbase Server is a multipurpose, distributed database that fuses the strengths of relational databases such as SQL and ACID transactions with JSON’s versatility, with a foundation that is extremely fast and scalable. It’s used across industries for things like user profiles, dynamic product catalogs, GenAI apps, vector search, high-speed caching, and much more. -
46
Yugabyte
Yugabyte
The Leading High-Performance Distributed SQL Database. Open source, cloud native relational DB for powering global, internet-scale apps. Single-Digit Millisecond Latency. Build blazing fast cloud applications by serving queries directly from the DB. Massive Scale. Achieve millions of transactions per second and store multiple TBs of data per node. Geo-Distribution. Deploy across regions and clouds with synchronous or multi-master replication. Built for Cloud Native Architectures. Develop, deploy and operationalize modern applications faster than ever before with YugabyteDB. Gain Developer Agility. Leverage the full power of PostgreSQL-compatible SQL and distributed ACID transactions. Operate Resilient Services. Ensure continuous availability even when underlying compute, storage or network fails. Scale On-Demand. Add and remove nodes at will. Say no to over-provisioned clusters forever. Lower User Latency. -
47
Aiven
Aiven
Aiven manages your open source data infrastructure in the cloud - so you don't have to. Developers can do what they do best: create applications. We do what we do best: manage cloud data infrastructure. All solutions are open source. You can also freely move data between clouds or create multi-cloud environments. Know exactly how much you’ll be paying and why. We bundle networking, storage and basic support costs together. We are committed to keeping your Aiven software online. If there’s ever an issue, we’ll be there to fix it. Deploy a service on the Aiven platform in 10 minutes. Sign up - no credit card info needed. Select your open source service, and the cloud and region to deploy to. Choose your plan - you have $300 in free credits. Click "Create service" and go on to configure your data sources. Stay in control of your data using powerful open-source services.
Starting Price: $200.00 per month -
48
Xano
Xano
Xano provides a fully managed, scalable infrastructure to power your backend. On top of that infrastructure, you can quickly build the business logic that powers your backend without a single line of code or use one of our pre-made templates to launch quickly without sacrificing scale or security. Build custom API endpoints without a single line of code. Accelerate time to market using our out-of-the-box CRUD operations and Marketplace extensions and templates! Your API comes “ready-to-use” so you can immediately connect to any frontend and focus on your business logic. Everything is also automatically documented in Swagger so connecting to a frontend is a breeze. Xano uses PostgreSQL, which provides the flexibility of a relational database while meeting the big data needs usually served by a NoSQL solution. Add features to your backend in a few clicks or start with ready-made templates and extensions to jumpstart your project.
Starting Price: $29 per month -
49
InfluxDB
InfluxData
InfluxDB is a purpose-built data platform designed to handle all time series data, from users, sensors, applications and infrastructure, seamlessly collecting, storing, visualizing, and turning insight into action. With a library of more than 250 open source Telegraf plugins, importing and monitoring data from any system is easy. InfluxDB empowers developers to build transformative IoT, monitoring and analytics services and applications. InfluxDB’s flexible architecture fits any implementation, whether in the cloud, at the edge or on-premises, and its versatility, accessibility and supporting tools (client libraries, APIs, etc.) make it easy for developers at any level to quickly build applications and services with time series data. Optimized for developer efficiency and productivity, the InfluxDB platform gives builders time to focus on the features and functionalities that give their internal projects value and their applications a competitive edge.
Starting Price: $0 -
50
Directus
Monospace Inc
Directus is an Open Data Platform for managing the content of any SQL database. It provides a powerful API layer for developers and an intuitive App for non-technical users. Written entirely in JavaScript (primarily Node.js and Vue.js), Directus is completely open-source, modular, and extensible, allowing it to be fully tailored to your exact project needs. With Directus Cloud, we've taken our open-source spirit to the cloud by offering Directus Community Cloud, a completely free tier - without quotas or limitations. It's ideal for hobby projects and demos, and when you're ready for production, Standard Cloud has the power and infrastructure options you need. Our pricing is usage-based, which means you only pay for what you use. Use Directus on your next headless CMS, internal tool, or SaaS project, or use our Insights for better data management and analytics. With Directus, you're only limited by your imagination.
Starting Price: Free