Delta Lake
Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads. Data lakes typically have multiple data pipelines reading and writing data concurrently, and data engineers have to go through a tedious process to ensure data integrity, due to the lack of transactions. Delta Lake brings ACID transactions to your data lakes. It provides serializability, the strongest level of isolation level. Learn more at Diving into Delta Lake: Unpacking the Transaction Log. In big data, even the metadata itself can be "big data". Delta Lake treats metadata just like data, leveraging Spark's distributed processing power to handle all its metadata. As a result, Delta Lake can handle petabyte-scale tables with billions of partitions and files at ease. Delta Lake provides snapshots of data enabling developers to access and revert to earlier versions of data for audits, rollbacks or to reproduce experiments.
Learn more
BigLake
BigLake is a storage engine that unifies data warehouses and lakes by enabling BigQuery and open-source frameworks like Spark to access data with fine-grained access control. BigLake provides accelerated query performance across multi-cloud storage and open formats such as Apache Iceberg. Store a single copy of data with uniform features across data warehouses & lakes. Fine-grained access control and multi-cloud governance over distributed data. Seamless integration with open-source analytics tools and open data formats. Unlock analytics on distributed data regardless of where and how it’s stored, while choosing the best analytics tools, open source or cloud-native over a single copy of data. Fine-grained access control across open source engines like Apache Spark, Presto, and Trino, and open formats such as Parquet. Performant queries over data lakes powered by BigQuery. Integrates with Dataplex to provide management at scale, including logical data organization.
Learn more
Azure Blob Storage
Massively scalable and secure object storage for cloud-native workloads, archives, data lakes, high-performance computing, and machine learning. Azure Blob Storage helps you create data lakes for your analytics needs, and provides storage to build powerful cloud-native and mobile apps. Optimize costs with tiered storage for your long-term data, and flexibly scale up for high-performance computing and machine learning workloads. Blob storage is built from the ground up to support the scale, security, and availability needs of mobile, web, and cloud-native application developers. Use it as a cornerstone for serverless architectures such as Azure Functions. Blob storage supports the most popular development frameworks, including Java, .NET, Python, and Node.js, and is the only cloud storage service that offers a premium, SSD-based object storage tier for low-latency and interactive scenarios.
Learn more
Cribl Search
Cribl Search delivers next-generation search-in-place technology, empowering users to explore, discover, and analyze data that was previously impossible – directly at its source, across any cloud, even data locked behind APIs. Effortlessly search your Cribl Lake or sift through data in major object stores like AWS S3, Amazon Security Lake, Azure Blob, and Google Cloud Storage, and enrich your insights by querying dozens of live API endpoints from various SaaS providers. The power of Cribl Search lies in its strategic approach: forward only the critical data to your systems of analysis, thus avoiding the cost of expensive storage. With native support for platforms such as Amazon Security Lake, AWS S3, Azure Blob, and Google Cloud Storage, Cribl Search delivers a first-of-its-kind ability to seamlessly analyze all data right at its source. Cribl Search allows users to search and analyze data wherever it is located, from debug logs at the edge to archived data in cold storage.
Learn more