Audience
Data Warehouse solution for organizations
About Apache Flume
Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault-tolerant with tunable reliability mechanisms and many failovers and recovery mechanisms. It uses a simple extensible data model that allows for online analytic applications. The Apache Flume team is pleased to announce the release of Flume 1.8.0. Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of streaming event data.
Other Popular Alternatives & Related Software
VeloDB
Powered by Apache Doris, VeloDB is a modern data warehouse for lightning-fast analytics on real-time data at scale.
Push-based micro-batch and pull-based streaming data ingestion within seconds. Storage engine with real-time upsert、append and pre-aggregation. Unparalleled performance in both real-time data serving and interactive ad-hoc queries.
Not just structured but also semi-structured data. Not just real-time analytics but also batch processing. Not just run queries against internal data but also work as a federate query engine to access external data lakes and databases.
Distributed design to support linear scalability. Whether on-premise deployment or cloud service, separation or integration of storage and compute, resource usage can be flexibly and efficiently adjusted according to workload requirements.
Built on and fully compatible with open source Apache Doris. Support MySQL protocol, functions, and SQL for easy integration with other data tools.
Learn more
Striim
Data integration for your hybrid cloud. Modern, reliable data integration across your private and public cloud. All in real-time with change data capture and data streams. Built by the executive & technical team from GoldenGate Software, Striim brings decades of experience in mission-critical enterprise workloads. Striim scales out as a distributed platform in your environment or in the cloud. Scalability is fully configurable by your team. Striim is fully secure with HIPAA and GDPR compliance. Built ground up for modern enterprise workloads in the cloud or on-premise. Drag and drop to create data flows between your sources and targets. Process, enrich, and analyze your streaming data with real-time SQL queries.
Learn more
Hadoop
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. A wide variety of companies and organizations use Hadoop for both research and production. Users are encouraged to add themselves to the Hadoop PoweredBy wiki page. Apache Hadoop 3.3.4 incorporates a number of significant enhancements over the previous major release line (hadoop-3.2).
Learn more
Apache Doris
Apache Doris is a modern data warehouse for real-time analytics. It delivers lightning-fast analytics on real-time data at scale.
Push-based micro-batch and pull-based streaming data ingestion within a second. Storage engine with real-time upsert, append and pre-aggregation.
Optimize for high-concurrency and high-throughput queries with columnar storage engine, MPP architecture, cost based query optimizer, vectorized execution engine.
Federated querying of data lakes such as Hive, Iceberg and Hudi, and databases such as MySQL and PostgreSQL.
Compound data types such as Array, Map and JSON. Variant data type to support auto data type inference of JSON data. NGram bloomfilter and inverted index for text searches.
Distributed design for linear scalability. Workload isolation and tiered storage for efficient resource management. Supports shared-nothing clusters as well as separation of storage and compute.
Learn more
Integrations
Company Information
Apache Software Foundation
Founded: 1999
United States
flume.apache.org
Other Useful Business Software
Full-stack observability with actually useful AI | Grafana Cloud
Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
Product Details
Platforms Supported
Cloud
Training
Documentation
Support
Online