Alternatives to WANdisco

Compare WANdisco alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to WANdisco in 2026. Compare features, ratings, user reviews, pricing, and more from WANdisco competitors and alternatives in order to make an informed decision for your business.

  • 1
    Apache Ranger

    The Apache Software Foundation

    Apache Ranger™ is a framework to enable, monitor, and manage comprehensive data security across the Hadoop platform. The vision for Ranger is to provide comprehensive security across the Apache Hadoop ecosystem. With the advent of Apache YARN, the Hadoop platform can now support a true data lake architecture, and enterprises can run multiple workloads in a multi-tenant environment. Data security within Hadoop needs to evolve to support multiple use cases for data access, while also providing a framework for central administration of security policies and monitoring of user access. Ranger offers centralized security administration for all security-related tasks in a central UI or through REST APIs; fine-grained authorization for specific actions or operations on a Hadoop component or tool, managed through a central administration tool; a standardized authorization method across all Hadoop components; and enhanced support for different authorization methods, such as role-based access control.
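    As a sketch of the REST-API administration mentioned above, the snippet below builds the kind of JSON payload Ranger's public v2 API accepts when creating an access policy. The service name, user/group values, and endpoint path are illustrative assumptions; verify the field names against your Ranger version's documentation.

```python
import json

# Hypothetical Ranger access-policy payload. Field names follow Ranger's
# public v2 REST API, but check them against your Ranger version.
policy = {
    "service": "hadoopdev_hive",  # assumed Ranger service name
    "name": "sales_db_read_only",
    "resources": {
        "database": {"values": ["sales"], "isExcludes": False},
        "table":    {"values": ["*"],     "isExcludes": False},
        "column":   {"values": ["*"],     "isExcludes": False},
    },
    "policyItems": [
        {
            "users": ["analyst1"],
            "groups": ["analysts"],
            "accesses": [{"type": "select", "isAllowed": True}],
        }
    ],
}

# In practice this would be POSTed to /service/public/v2/api/policy on the
# Ranger admin host; here we only serialize it.
payload = json.dumps(policy, indent=2)
```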
  • 2
    Oracle Big Data Service
    Oracle Big Data Service makes it easy for customers to deploy Hadoop clusters of all sizes, with VM shapes ranging from 1 OCPU to a dedicated bare metal environment. Customers choose between high-performance NVMe storage and cost-effective block storage, and can grow or shrink their clusters. Quickly create Hadoop-based data lakes to extend or complement customer data warehouses, and ensure that all data is both accessible and managed cost-effectively. Query, visualize and transform data so data scientists can build machine learning models using the included notebook with its R, Python and SQL support. Move customer-managed Hadoop clusters to a fully-managed cloud-based service, reducing management costs and improving resource utilization.
    Starting Price: $0.1344 per hour
  • 3
    SAS Data Loader for Hadoop
    Load your data into or out of Hadoop and data lakes. Prep it so it's ready for reports, visualizations or advanced analytics – all inside the data lakes. And do it all yourself, quickly and easily. Makes it easy to access, transform and manage data stored in Hadoop or data lakes with a web-based interface that reduces training requirements. Built from the ground up to manage big data on Hadoop or in data lakes; not repurposed from existing IT-focused tools. Lets you group multiple directives to run simultaneously or one after the other. Schedule and automate directives using the exposed Public API. Enables you to share and secure directives. Call them from SAS Data Integration Studio, uniting technical and nontechnical user activities. Includes built-in directives – casing, gender and pattern analysis, field extraction, match-merge and cluster-survive. Profiling runs in parallel on the Hadoop cluster for better performance.
  • 4
    K2View

    K2View

    At K2View, we believe that every enterprise should be able to leverage its data to become as disruptive and agile as the best companies in its industry. We make this possible through our patented Data Product Platform, which creates and manages a complete and compliant dataset for every business entity – on demand, and in real time. The dataset is always in sync with its underlying sources, adapts to changes in the source structures, and is instantly accessible to any authorized data consumer. Data Product Platform fuels many operational use cases, including customer 360, data masking and tokenization, test data management, data migration, legacy application modernization, data pipelining and more – to deliver business outcomes in less than half the time, and at half the cost, of any other alternative. The platform inherently supports modern data architectures – data mesh, data fabric, and data hub – and deploys in cloud, on-premise, or hybrid environments.
  • 5
    Apache Trafodion

    Apache Software Foundation

    Apache Trafodion is a web-scale SQL-on-Hadoop solution enabling transactional or operational workloads on Apache Hadoop. Trafodion builds on the scalability, elasticity, and flexibility of Hadoop. Trafodion extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop. Full-functioned ANSI SQL language support. JDBC/ODBC connectivity for Linux/Windows clients. Distributed ACID transaction protection across multiple statements, tables, and rows. Performance improvements for OLTP workloads with compile-time and run-time optimizations. Support for large data sets using a parallel-aware query optimizer. Reuse existing SQL skills and improve developer productivity. Distributed ACID transactions guarantee data consistency across multiple rows and tables. Interoperability with existing tools and applications. Hadoop and Linux distribution neutral. Easy to add to your existing Hadoop infrastructure.
    Starting Price: Free
  • 6
    Oracle Big Data SQL Cloud Service
    Oracle Big Data SQL Cloud Service enables organizations to immediately analyze data across Apache Hadoop, NoSQL and Oracle Database leveraging their existing SQL skills, security policies and applications with extreme performance. From simplifying data science efforts to unlocking data lakes, Big Data SQL makes the benefits of Big Data available to the largest group of end users possible. Big Data SQL gives users a single location to catalog and secure data in Hadoop, NoSQL systems, and Oracle Database. It provides seamless metadata integration and queries that join data from Oracle Database with data from Hadoop and NoSQL databases. Utilities and conversion routines support automatic mappings from metadata stored in HCatalog (or the Hive Metastore) to Oracle Tables. Enhanced access parameters give administrators the flexibility to control column mapping and data access behavior. Multiple cluster support enables one Oracle Database to query multiple Hadoop clusters and/or NoSQL systems.
  • 7
    Oracle Big Data Discovery
    Oracle Big Data Discovery is a stunningly visual, intuitive product that leverages the power of Hadoop to transform raw data into business insight in minutes, without the need to learn complex tools or rely only on highly specialized resources. With Oracle Big Data Discovery, customers can easily find relevant data sets in Hadoop, explore the data and quickly understand its potential, transform and enrich data to make it better, analyze the data to discover new insights, share results and publish back to Hadoop for use across the enterprise. In your organization, use BDD as the center of your data lab, as a unified environment for navigating and exploring all of your data sources in Hadoop, and to create projects and BDD applications. In BDD, a wider range of people can work with big data than with traditional analytics tools. You spend less time on data loading and updates, and can focus on actual analysis of big data.
  • 8
    Adoki

    Adastra

    Adoki streamlines data transfers to and from any platform or system—whether it's a data warehouse, database, cloud service, Hadoop platform, or streaming application—on both one-time and recurring schedules. It adapts to your IT infrastructure's workload, adjusting transfer or replication processes to optimal times when needed. With centralized management and monitoring of data transfers, Adoki allows you to handle your data operations with a smaller, more efficient team.
  • 9
    IBM Db2 Big SQL
    A hybrid SQL-on-Hadoop engine delivering advanced, security-rich data query across enterprise big data sources, including Hadoop, object storage and data warehouses. IBM Db2 Big SQL is an enterprise-grade, hybrid ANSI-compliant SQL-on-Hadoop engine, delivering massively parallel processing (MPP) and advanced data query. Db2 Big SQL offers a single database connection or query for disparate sources such as Hadoop HDFS and WebHDFS, RDBMS, NoSQL databases, and object stores. Benefit from low latency, high performance, data security, SQL compatibility, and federation capabilities to do ad hoc and complex queries. Db2 Big SQL is now available in 2 variations. It can be integrated with Cloudera Data Platform, or accessed as a cloud-native service on the IBM Cloud Pak® for Data platform. Access and analyze data and perform queries on batch and real-time data across sources, like Hadoop, object stores and data warehouses.
  • 10
    Apache Sentry

    Apache Software Foundation

    Apache Sentry™ is a system for enforcing fine-grained, role-based authorization to data and metadata stored on a Hadoop cluster. Apache Sentry graduated from the Apache Incubator in March 2016 and is now a Top-Level Apache project. Apache Sentry is a granular, role-based authorization module for Hadoop. Sentry provides the ability to control and enforce precise levels of privileges on data for authenticated users and applications on a Hadoop cluster. Sentry currently works out of the box with Apache Hive, Hive Metastore/HCatalog, Apache Solr, Impala and HDFS (limited to Hive table data). Sentry is designed to be a pluggable authorization engine for Hadoop components. It allows you to define authorization rules to validate a user or application’s access requests for Hadoop resources. Sentry is highly modular and can support authorization for a wide variety of data models in Hadoop.
  • 11
    ZetaAnalytics

    Halliburton

    The ZetaAnalytics product requires a compatible database appliance for its Data Warehouse. Landmark has qualified the ZetaAnalytics software using Teradata, EMC Greenplum, and IBM Netezza. Please see the ZetaAnalytics Release Notes for the most up-to-date qualified versions. Before installing and configuring ZetaAnalytics software, ensure that the Data Warehouse you use for drilling data is created and running. Scripts to create the various Zeta-specific database components within the Data Warehouse will need to be run as part of the installation process. These require database administrator (DBA) rights. The ZetaAnalytics product requires Apache Hadoop for model scoring and real-time streaming. If you do not already have an Apache Hadoop cluster installed in your environment, please install it before running the ZetaAnalytics installer, which will prompt you for the name and port number of your Hadoop Name Server and Map Reducer.
  • 12
    CONNX

    Software AG

    Unlock the value of your data—wherever it resides. To become data-driven, you need to leverage all the information in your enterprise across apps, clouds and systems. With the CONNX data integration solution, you can easily access, virtualize and move your data—wherever it is, however it’s structured—without changing your core systems. Get your information where it needs to be to better serve your organization, customers, partners and suppliers. Connect and transform legacy data sources from transactional databases to big data or data warehouses such as Hadoop®, AWS and Azure®. Or move legacy to the cloud for scalability, such as MySQL to Microsoft® Azure® SQL Database, SQL Server® to Amazon REDSHIFT®, or OpenVMS® Rdb to Teradata®.
  • 13
    SAS Data Management

    SAS Institute

    No matter where your data is stored, from cloud, to legacy systems, to data lakes, like Hadoop, SAS Data Management helps you access the data you need. Create data management rules once and reuse them, giving you a standard, repeatable method for improving and integrating data, without additional cost. As an IT expert, it's easy to get entangled in tasks outside your normal duties. SAS Data Management enables your business users to update data, tweak processes and analyze results themselves, freeing you up for other projects. Plus, a built-in business glossary, as well as SAS and third-party metadata management and lineage visualization capabilities, keep everyone on the same page. SAS Data Management technology is truly integrated, which means you’re not forced to work with a solution that’s been cobbled together. All our components, from data quality to data federation technology, are part of the same architecture.
  • 14
    Apache Bigtop

    Apache Software Foundation

    Bigtop is an Apache Foundation project for Infrastructure Engineers and Data Scientists looking for comprehensive packaging, testing, and configuration of the leading open source big data components. Bigtop supports a wide range of components/projects, including, but not limited to, Hadoop, HBase and Spark. Bigtop packages Hadoop RPMs and DEBs, so that you can manage and maintain your Hadoop cluster. Bigtop provides an integrated smoke testing framework, alongside a suite of over 50 test files. Bigtop provides Vagrant recipes, raw images, and (work-in-progress) Docker recipes for deploying Hadoop from zero. Bigtop supports many operating systems, including Debian, Ubuntu, CentOS, Fedora, openSUSE and many others. Bigtop includes tools and a framework for testing at various levels (packaging, platform, runtime, etc.) for both initial deployments as well as upgrade scenarios for the entire data platform, not just the individual components.
  • 15
    Apache Impala
    Impala provides low latency and high concurrency for BI/analytic queries on the Hadoop ecosystem, including Iceberg, open data formats, and most cloud storage options. Impala also scales linearly, even in multitenant environments. Impala is integrated with native Hadoop security and Kerberos for authentication, and via the Ranger module, you can ensure that the right users and applications are authorized for the right data. Utilize the same file and data formats and metadata, security, and resource management frameworks as your Hadoop deployment, with no redundant infrastructure or data conversion/duplication. For Apache Hive users, Impala utilizes the same metadata and ODBC driver. Like Hive, Impala supports SQL, so you don't have to worry about reinventing the implementation wheel. With Impala, more users, whether using SQL queries or BI applications, can interact with more data through a single repository and metadata store, from source through analysis.
    Starting Price: Free
  • 16
    Arcserve Live Migration
    Migrate data, applications and workloads to the cloud without downtime. Arcserve Live Migration was designed to eliminate disruption during your cloud transformation. Easily move data, applications and workloads to the cloud or target destination of your choice while keeping your business fully up and running. Remove complexity by orchestrating the cutover to the target destination. Manage the entire cloud migration process from a central console. Arcserve Live Migration simplifies the process of migrating data, applications and workloads. Its highly flexible architecture enables you to move virtually any type of data or workload to cloud, on-premises or remote locations, such as the edge, with support for virtual, cloud and physical systems. Arcserve Live Migration automatically synchronizes files, databases, and applications on Windows and Linux systems with a second physical or virtual environment located on-premises, at a remote location, or in the cloud.
  • 17
    Hadoop

    Apache Software Foundation

    The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures. A wide variety of companies and organizations use Hadoop for both research and production. Users are encouraged to add themselves to the Hadoop PoweredBy wiki page. Apache Hadoop 3.3.4 incorporates a number of significant enhancements over the previous major release line (hadoop-3.2).
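    The "simple programming models" referred to above are MapReduce jobs. A minimal word-count sketch in plain Python shows the map and reduce phases (the function names are ours for illustration, not a Hadoop API):

```python
from collections import defaultdict
from itertools import chain

def mapper(line):
    # Map phase: emit a (word, 1) pair for each word in the input line.
    return [(word.lower(), 1) for word in line.split()]

def reducer(pairs):
    # Reduce phase: sum the counts per word after the shuffle.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["the quick brown fox", "the lazy dog", "The end"]
word_counts = reducer(chain.from_iterable(mapper(l) for l in lines))
# word_counts["the"] == 3
```

    In a real cluster, Hadoop runs many mapper and reducer instances in parallel across machines, with the framework handling the shuffle and failure recovery.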
  • 18
    Apache Atlas

    Apache Software Foundation

    Atlas is a scalable and extensible set of core foundational governance services – enabling enterprises to effectively and efficiently meet their compliance requirements within Hadoop and allows integration with the whole enterprise data ecosystem. Apache Atlas provides open metadata management and governance capabilities for organizations to build a catalog of their data assets, classify and govern these assets and provide collaboration capabilities around these data assets for data scientists, analysts and the data governance team. Pre-defined types for various Hadoop and non-Hadoop metadata. Ability to define new types for the metadata to be managed. Types can have primitive attributes, complex attributes, object references; can inherit from other types. Instances of types, called entities, capture metadata object details and their relationships. REST APIs to work with types and instances allow easier integration.
  • 19
    Oracle Enterprise Metadata Management
    Oracle Enterprise Metadata Management (OEMM) is a comprehensive metadata management platform. OEMM can harvest and catalog metadata from virtually any metadata provider, including relational, Hadoop, ETL, BI, data modeling, and many more. OEMM, however, is not just a metadata repository; it allows interactive searching and browsing of the metadata, and provides data lineage, impact analysis, semantic definition and semantic-usage analysis for any metadata asset within the catalog. OEMM's advanced algorithms stitch together metadata from each provider to give the complete path of data from source to report, or vice versa. OEMM supports virtually any metadata provider, including data modeling tools, databases, CASE tools, Hadoop, ETL engines, warehouses, BI, EAI environments, and many more.
  • 20
    PeerSync Migration

    Peer Software

    PeerSync™ Migration eases data-migration challenges for mixed storage environments thanks to key features like API integration and a real-time data replication engine proven in thousands of customer implementations. PeerFSA is a lightweight tool that provides valuable, detailed information about the structure, organization and usage of file data in complex environments, and is designed to improve migration performance and efficiency. Generate jobs automatically by importing source/target pairs. The key advantage is real-time integration with all major storage platforms, eliminating final scans for non-disruptive migrations. PeerSync Migration offers the flexibility of efficient migration to the cloud or consolidation of file servers in on-premises or cloud data centers.
  • 21
    Cloud Migrator

    Prosperoware

    Consolidate & Migrate in One Step! Migrate On-Premises DMS Systems & File Shares to iManage Cloud or On-Premises. Cloud Migrator uses the industry standard ETL Design (‘Extract’, ‘Transform‘ and ‘Load’) for an efficient solution to cloud migration. It offers the ability to consolidate databases, map metadata, and directly migrate content to iManage Cloud. It also enables data clean-up for on-premises. It can migrate content from eDocs, iManage (on-premises), Windows File Shares, or any other structured database system. Fastest migration performance limited only by your hardware, provider, and ISP. Consolidate databases many-to-one or many-to-many. Remap fields during the migration process. Clean-up documents from flat spaces into workspaces and folders. Clean-up and modify data - before moving - via staging tables. Option to map, modify, and migrate existing metadata fields via provider’s REST API.
  • 22
    Azure HDInsight
    Run popular open-source frameworks—including Apache Hadoop, Spark, Hive, Kafka, and more—using Azure HDInsight, a customizable, enterprise-grade service for open-source analytics. Effortlessly process massive amounts of data and get all the benefits of the broad open-source project ecosystem with the global scale of Azure. Easily migrate your big data workloads and processing to the cloud. Open-source projects and clusters are easy to spin up quickly without the need to install hardware or manage infrastructure. Big data clusters reduce costs through autoscaling and pricing tiers that allow you to pay for only what you use. Enterprise-grade security and industry-leading compliance with more than 30 certifications helps protect your data. Optimized components for open-source technologies such as Hadoop and Spark keep you up to date.
  • 23
    Sesame Software

    Sesame Software

    Sesame Software specializes in secure, efficient data integration and replication across diverse cloud, hybrid, and on-premise sources. Our patented scalability ensures comprehensive access to critical business data, facilitating a holistic view in the BI tools of your choice. This unified perspective empowers your own robust reporting and analytics, enabling your organization to regain control of your data with confidence. At Sesame Software, we understand what’s at stake when you need to move a massive amount of data between environments quickly—while keeping it protected, maintaining centralized access, and ensuring compliance with regulations. Over the past 30+ years, we’ve helped hundreds of organizations like Procter & Gamble, Bank of America, and the U.S. government connect, move, store, and protect their data.
  • 24
    QuerySurge
    QuerySurge leverages AI to automate the data validation and ETL testing of Big Data, Data Warehouses, Business Intelligence Reports, and Enterprise Apps/ERPs, with full DevOps functionality for continuous testing.
    Use cases:
    - Data Warehouse & ETL testing
    - Hadoop & NoSQL testing
    - DevOps for Data / continuous testing
    - Data migration testing
    - BI report testing
    - Enterprise App/ERP testing
    QuerySurge features:
    - Projects: multi-project support
    - AI: automatically create data validation tests based on data mappings
    - Smart Query Wizards: create tests visually, without writing SQL
    - Data Quality at Speed: automate the launch, execution, and comparison of tests and see results quickly
    - Test across 200+ platforms: Data Warehouses, Hadoop & NoSQL lakes, databases, flat files, XML, JSON, BI reports
    - DevOps for Data & continuous testing: RESTful API with 60+ calls and integration with all mainstream solutions
    - Data Analytics & Data Intelligence: analytics dashboard and reports
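    The core of most ETL validation is comparing a source dataset against its target. A minimal pure-Python sketch of such a row-level "minus" comparison (an illustration of the general idea, not QuerySurge's engine) might look like this:

```python
def minus_compare(source_rows, target_rows):
    # The basic "minus" check behind many ETL validation tools:
    # rows present on one side but not the other indicate a defect.
    src, tgt = set(source_rows), set(target_rows)
    return {
        "missing_in_target": src - tgt,
        "unexpected_in_target": tgt - src,
    }

source = {(1, "alice"), (2, "bob"), (3, "carol")}
target = {(1, "alice"), (2, "bob"), (4, "dave")}
diff = minus_compare(source, target)
# diff["missing_in_target"] == {(3, "carol")}
```

    Production tools run this kind of comparison as paired SQL queries pushed down to each platform, so the data never has to be staged centrally.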
  • 25
    Apache Kylin

    Apache Software Foundation

    Apache Kylin™ is an open source, distributed Analytical Data Warehouse for Big Data; it was designed to provide OLAP (Online Analytical Processing) capability in the big data era. By renovating the multi-dimensional cube and precalculation technology on Hadoop and Spark, Kylin is able to achieve near-constant query speed regardless of the ever-growing data volume. Reducing query latency from minutes to sub-second, Kylin brings online analytics back to big data. Kylin can analyze more than 10 billion rows in less than a second. No more waiting on reports for critical decisions. Kylin connects data on Hadoop to BI tools like Tableau, PowerBI/Excel, MSTR, QlikSense, Hue and SuperSet, making BI on Hadoop faster than ever. As an Analytical Data Warehouse, Kylin offers ANSI SQL on Hadoop/Spark and supports most ANSI SQL query functions. Kylin can support thousands of interactive queries at the same time, thanks to the low resource consumption of each query.
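    Kylin's near-constant query speed comes from precomputing multi-dimensional aggregates (cuboids) at cube-build time, so queries become lookups instead of scans. A toy pure-Python sketch of the idea (not Kylin's implementation):

```python
from collections import defaultdict
from itertools import combinations

rows = [  # (region, product, sales)
    ("EU", "widget", 10), ("EU", "gadget", 5),
    ("US", "widget", 7),  ("US", "widget", 3),
]

def build_cuboid(rows, dims):
    # Precompute SUM(sales) grouped by the given dimension indices,
    # as a cube engine would do at build time.
    cube = defaultdict(int)
    for r in rows:
        key = tuple(r[i] for i in dims)
        cube[key] += r[2]
    return dict(cube)

# Materialize every cuboid over dimensions 0 (region) and 1 (product).
cuboids = {dims: build_cuboid(rows, dims)
           for n in range(3) for dims in combinations(range(2), n)}

# "SELECT region, SUM(sales) ... GROUP BY region" becomes a lookup:
by_region = cuboids[(0,)]
# by_region[("EU",)] == 15
```

    At billions of rows the precomputed cuboids stay small relative to the raw data, which is why query latency stays flat as the data grows.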
  • 26
    E-MapReduce
    EMR is an all-in-one enterprise-ready big data platform that provides cluster, job, and data management services based on open-source ecosystems, such as Hadoop, Spark, Kafka, Flink, and Storm. Alibaba Cloud Elastic MapReduce (EMR) is a big data processing solution that runs on the Alibaba Cloud platform. EMR is built on Alibaba Cloud ECS instances and is based on open-source Apache Hadoop and Apache Spark. EMR allows you to use the Hadoop and Spark ecosystem components, such as Apache Hive, Apache Kafka, Flink, Druid, and TensorFlow, to analyze and process data. You can use EMR to process data stored on different Alibaba Cloud data storage services, such as Object Storage Service (OSS), Log Service (SLS), and Relational Database Service (RDS). You can quickly create clusters without the need to configure hardware and software. All maintenance operations are completed on its Web interface.
  • 27
    IBM Spectrum Virtualize
    IBM Spectrum Virtualize™ and IBM Spectrum Virtualize™ for Public Cloud together support mirroring between on-premises and cloud data centers or between cloud data centers. Migrate data between on-premises and public cloud data centers or between public cloud data centers. Enjoy consistent data management between on-premises storage and the public cloud. Working together with on-premises software, you can replicate or migrate data from any of over 500 supported storage systems so you can add hybrid cloud capability without major new investment. Pay for only the storage capacity you manage on the public cloud, with flexible software monthly pricing available. Implement disaster recovery strategies between on-premises and public cloud data centers. Enable cloud-based DevOps with easy replication of data from on-premises sources.
  • 28
    Huawei Cloud Data Migration
    On-premises and cloud-based data migrations among nearly 20 types of data sources are supported. The distributed computing framework ensures high-performance data migration and optimal data writing of specific data sources. The wizard-based development interface frees you from complex programming and helps you quickly develop migration tasks. You only pay for what you use and do not need to build dedicated hardware and software. Big data cloud services can replace or back up on-premises big data platforms and support full migration of massive amounts of data. Support for relational databases, big data, files, NoSQL, and many other data sources ensures a wide application scope. Wizard-based task management provides out-of-the-box usability. Data is migrated between services on HUAWEI CLOUD, achieving data mobility.
    Starting Price: $0.56 per hour
  • 29
    Apache Phoenix

    Apache Software Foundation

    Apache Phoenix enables OLTP and operational analytics in Hadoop for low-latency applications by combining the best of both worlds: the power of standard SQL and JDBC APIs with full ACID transaction capabilities, and the flexibility of late-bound, schema-on-read capabilities from the NoSQL world, leveraging HBase as its backing store. Apache Phoenix is fully integrated with other Hadoop products such as Spark, Hive, Pig, Flume, and MapReduce. It aims to become the trusted data platform for OLTP and operational analytics on Hadoop through well-defined, industry-standard APIs. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.
    Starting Price: Free
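    To illustrate the compile-to-scans idea described above (a toy sketch, not Phoenix's actual query planner), a primary-key range predicate can be turned into a single bounded scan over a sorted key space:

```python
import bisect

# Toy sorted row store standing in for an HBase table: (row key, value).
table = [(k, f"value-{k}") for k in (3, 7, 12, 18, 25, 31)]
keys = [k for k, _ in table]

def range_scan(lo, hi):
    # "Compile" a predicate like WHERE pk BETWEEN lo AND hi into one
    # bounded scan: seek to the start key, read forward until past hi.
    start = bisect.bisect_left(keys, lo)
    stop = bisect.bisect_right(keys, hi)
    return table[start:stop]

result = range_scan(7, 25)
# result holds the rows with keys 7, 12, 18, 25
```

    Phoenix does the same thing at scale: predicates on the row key become scan boundaries, and other predicates are pushed down as HBase filters or evaluated in coprocessors.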
  • 30
    AWS DataSync
    AWS DataSync is a secure, online service that automates and accelerates moving data between on-premises storage and AWS Storage services. It simplifies migration planning and reduces expensive on-premises data movement costs with a fully managed service that seamlessly scales as data loads increase. DataSync can copy data between Network File System (NFS) shares, Server Message Block (SMB) shares, Hadoop Distributed File Systems (HDFS), self-managed object storage, AWS Snowcone, Amazon Simple Storage Service (Amazon S3) buckets, Amazon Elastic File System (Amazon EFS) file systems, Amazon FSx for Windows File Server file systems, Amazon FSx for Lustre file systems, Amazon FSx for OpenZFS file systems, and Amazon FSx for NetApp ONTAP file systems. It also supports moving data between other public clouds and AWS Storage services, enabling replication, archival, or sharing of application data easily. DataSync provides end-to-end security, including data encryption and data integrity.
  • 31
    IBM Analytics Engine
    IBM Analytics Engine provides an architecture for Hadoop clusters that decouples the compute and storage tiers. Instead of a permanent cluster formed of dual-purpose nodes, the Analytics Engine allows users to store data in an object storage layer such as IBM Cloud Object Storage and spins up clusters of compute nodes when needed. Separating compute from storage helps to transform the flexibility, scalability and maintainability of big data analytics platforms. Build on an ODPi compliant stack with pioneering data science tools with the broader Apache Hadoop and Apache Spark ecosystem. Define clusters based on your application's requirements. Choose the appropriate software pack, version, and size of the cluster. Use as long as required and delete as soon as an application finishes jobs. Configure clusters with third-party analytics libraries and packages. Deploy workloads from IBM Cloud services like machine learning.
    Starting Price: $0.014 per hour
  • 32
    Alibaba Cloud Data Integration
    Alibaba Cloud Data Integration is a comprehensive data synchronization platform that facilitates both real-time and offline data exchange across various data sources, networks, and locations. It supports data synchronization between more than 400 pairs of disparate data sources, including RDS databases, semi-structured storage, non-structured storage (such as audio, video, and images), NoSQL databases, and big data storage. The platform also enables real-time data reading and writing between data sources such as Oracle, MySQL, and DataHub. Data Integration allows users to schedule offline tasks by setting specific trigger times, including year, month, day, hour, and minute, simplifying the configuration of periodic incremental data extraction. It integrates seamlessly with DataWorks data modeling, providing an operations and maintenance integrated workflow. The platform leverages the computing capability of Hadoop clusters to synchronize HDFS data to MaxCompute.
  • 33
    Apache Drill

    The Apache Software Foundation

    Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage
  • 34
    Apache Parquet

    The Apache Software Foundation

    We created Parquet to make the advantages of compressed, efficient columnar data representation available to any project in the Hadoop ecosystem. Parquet is built from the ground up with complex nested data structures in mind, and uses the record shredding and assembly algorithm described in the Dremel paper. We believe this approach is superior to simple flattening of nested namespaces. Parquet is built to support very efficient compression and encoding schemes. Multiple projects have demonstrated the performance impact of applying the right compression and encoding scheme to the data. Parquet allows compression schemes to be specified on a per-column level, and is future-proofed to allow adding more encodings as they are invented and implemented. Parquet is built to be used by anyone. The Hadoop ecosystem is rich with data processing frameworks, and we are not interested in playing favorites.
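    One of the per-column encodings Parquet supports is dictionary encoding. A toy pure-Python illustration of the concept follows; this is just the encoding idea, not the Parquet file format itself (for real Parquet files you would use a library such as pyarrow):

```python
def dictionary_encode(column):
    # Replace repeated values with small integer codes plus a dictionary,
    # the idea behind Parquet's per-column dictionary encoding.
    dictionary, codes, index = [], [], {}
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes

def dictionary_decode(dictionary, codes):
    # Lossless inverse: map each code back to its dictionary value.
    return [dictionary[c] for c in codes]

country = ["DE", "DE", "US", "DE", "FR", "US"]
dictionary, codes = dictionary_encode(country)
# dictionary == ["DE", "US", "FR"]; codes == [0, 0, 1, 0, 2, 1]
```

    Because the choice is made per column, a low-cardinality column like country codes can be dictionary-encoded while a neighboring high-cardinality column uses a different scheme.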
  • 35
    Quest On Demand Migration
    Quest On Demand Migration is a cloud-based solution designed to streamline and simplify the process of migrating workloads, including email, files, and user data, to the cloud. It helps organizations migrate from on-premises systems or other cloud environments to Microsoft 365, ensuring a seamless and secure transition. Quest On Demand Migration offers automated migration capabilities, reducing manual effort and minimizing downtime during the migration process. It includes advanced tools for managing and tracking migration tasks, as well as real-time monitoring to ensure smooth progress. It supports various migration scenarios, including Office 365 tenant-to-tenant migrations, hybrid environments, and multi-cloud migrations. With built-in reporting and analytics, it allows administrators to monitor the health of the migration and quickly resolve any issues. Quest On Demand Migration also helps with user and group management during the transition.
  • 36
    Power365

    Quest Software

    Binary Tree Power365® Migration by Quest lets you migrate mailboxes, archives, and content for Office 365 tenant migrations, all built on Microsoft Azure, for a secure, cloud-based transformation experience. With Binary Tree Power365 Migration, you also have the option to migrate OneDrive, OneNote, and SharePoint content as well as migrate from on-premises or hosted Exchange environments. With Binary Tree Power365 Migration, you can maintain data integrity and user confidence from start to finish. It is a truly unlimited cloud migration tool — so you don’t have to worry about data caps, archive restrictions or limits on passes — for a fast, complete migration. Binary Tree Power365 Migration ensures your end-users will have a positive Office 365 tenant migration experience, as you can schedule processes and synchronization events to occur when it is convenient for your organization, minimizing impacts to your business.
  • 37
    Lentiq

    Lentiq

    Lentiq is a collaborative data lake as a service environment that’s built to enable small teams to do big things. Quickly run data science, machine learning and data analysis at scale in the cloud of your choice. With Lentiq, your teams can ingest data in real time and then process, clean and share it. From there, Lentiq makes it possible to build, train and share models internally. Simply put, data teams can collaborate with Lentiq and innovate with no restrictions. Data lakes are storage and processing environments, which provide ML, ETL, schema-on-read querying capabilities and so much more. Are you working on some data science magic? You definitely need a data lake. In the Post-Hadoop era, the big, centralized data lake is a thing of the past. With Lentiq, we use data pools, which are multi-cloud, interconnected mini-data lakes. They work together to give you a stable, secure and fast data science environment.
  • 38
    SoftNAS

    Buurst

    SoftNAS is a cloud-native, software-defined, full-featured enterprise cloud NAS filer product line for primary data storage, secondary storage, and hybrid cloud data integration. It enables existing applications to securely migrate to and connect with the cloud without re-engineering. With enterprise-class NAS features such as high availability, deduplication, compression, thin provisioning, snapshots, replication, cloning, encryption (at rest and in transit), LDAP and Active Directory integration, and support for the NFS, CIFS, iSCSI, and AFP storage protocols, SoftNAS protects mission-critical primary (active/hot) data and backup/archive data, and makes cloud data migration faster and more reliable. SoftNAS offers the broadest range of storage options in terms of price versus performance and backend storage selection, on demand at petabyte scale across the AWS and Azure Marketplaces or on-premises on VMware.
  • 39
    Google Cloud Migrate for Compute Engine
    Cloud migration creates a lot of questions. Migrate for Compute Engine by Google Cloud has the answers. Whether you’re looking to migrate one application from on-premises or one thousand enterprise-grade applications across multiple data centers, Migrate for Compute Engine gives any IT team, large or small, the power to migrate their workloads to Google Cloud. With Migrate for Compute Engine’s simple “as a service” interface within Cloud Console and flexible migration options, it’s easy for anyone to reduce the time and toil that typically goes into a migration. Avoid complex deployments, setup, and configurations. Eliminate confusing and troublesome client-side migration tool agents. By using the right migration tool, you can save your migration team’s valuable time for what matters most: migrating workloads.
  • 40
    doolytic

    doolytic

    doolytic is leading the way in big data discovery, the convergence of data discovery, advanced analytics, and big data. doolytic is rallying expert BI users to the revolution in self-service exploration of big data, revealing the data scientist in all of us. doolytic is an enterprise software solution for native discovery on big data, based on best-of-breed, scalable, open source technologies. It delivers lightning performance on billions of records and petabytes of data, and handles structured, unstructured, and real-time data from any source. It offers sophisticated query capabilities for expert users, and integration with R for advanced and predictive applications. Search, analyze, and visualize data from any format and any source in real time with the flexibility of Elastic. Leverage the power of Hadoop data lakes with no latency or concurrency issues. doolytic solves common BI problems and enables big data discovery without clumsy, inefficient workarounds.
  • 41
    AWS Mainframe Modernization
    AWS Mainframe Modernization service is a unique platform that allows you to migrate and modernize your on-premises mainframe applications to a cloud-native, fully managed runtime environment on AWS. Migrate and modernize your applications to remove the hardware and staffing costs of traditional mainframes. Break up and manage your complete migration with infrastructure, software, and tools to refactor and transform legacy applications. Accelerate modernization and regression testing at scale with a cloud-native service. AWS Mainframe Modernization is a set of managed tools providing infrastructure and software for modernizing, migrating, testing, and running mainframe applications. Accelerate your mainframe modernization journey. Improve modernization outcomes using domain expertise. Reduce project complexity and enhance cross-functional collaboration. Automate the transformation of legacy language applications into agile Java-based services with AWS Blu Age, using newer web frameworks.
    Starting Price: $0.31 per hour
  • 42
    IBM Cloud Mass Data Migration
    IBM Cloud® Mass Data Migration uses storage devices with 120 TB of usable capacity to accelerate moving data to the cloud and overcome common transfer challenges like high costs, long transfer times and security concerns — all in a single service. Using a single IBM Cloud Mass Data Migration device, you can migrate up to 120 TB of data (at RAID-6) in just days, as opposed to weeks or months using traditional data-transfer methods. Whether you need to migrate a few terabytes or many petabytes of data, you have the flexibility to request one or multiple devices to accommodate your workload. Moving large data sets can be an expensive and time-consuming process. Use an IBM Cloud Mass Data Migration device at your location for just 50 USD per day. IBM sends you a preconfigured device for you to simply connect, ingest data and then ship back to IBM for offload into IBM Cloud Object Storage. Once offloaded, enjoy immediate access to your data in the cloud while IBM securely wipes the device.
    Starting Price: $50 per day
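The trade-off the IBM entry describes — shipping a 120 TB device versus pushing the same data over a network — is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below assumes a fully saturated, sustained link, which real shared WAN connections rarely achieve, so actual network transfers would be slower.

```python
def transfer_days(terabytes: float, link_gbps: float) -> float:
    """Days needed to move `terabytes` of data over a link sustained at
    `link_gbps` gigabits per second -- an idealized best case with no
    protocol overhead or contention."""
    bits = terabytes * 1e12 * 8          # decimal TB -> bits
    seconds = bits / (link_gbps * 1e9)   # Gbps -> bits per second
    return seconds / 86400

# One full 120 TB device over a sustained 1 Gbps uplink:
days_1gbps = transfer_days(120, 1.0)   # ~11.1 days, before any overhead
```

Even under this optimistic assumption, a full device's worth of data ties up a 1 Gbps link for well over a week, which is the gap a ship-a-device service is designed to close.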
  • 43
    Apache Mahout

    Apache Software Foundation

    Apache Mahout is a powerful, scalable, and versatile machine learning library designed for distributed data processing. It offers a comprehensive set of algorithms for tasks including classification, clustering, recommendation, and pattern mining. Built on top of the Apache Hadoop ecosystem, Mahout leverages MapReduce and Spark to process large-scale datasets. At its core, Apache Mahout(TM) is a distributed linear algebra framework with a mathematically expressive Scala DSL, designed to let mathematicians, statisticians, and data scientists quickly implement their own algorithms. Apache Spark is the recommended out-of-the-box distributed back end, and Mahout can be extended to other distributed back ends. Matrix computations of this kind are fundamental to many scientific and engineering applications, including machine learning, computer vision, and data analysis.
  • 44
    WhereScape

    WhereScape Software

    WhereScape helps IT organizations of all sizes leverage automation to design, develop, deploy, and operate data infrastructure faster. More than 700 customers worldwide rely on WhereScape automation to eliminate hand-coding and other repetitive, time-intensive aspects of data infrastructure projects, delivering data warehouses, vaults, lakes, and marts in days or weeks rather than months or years. From data warehouses and vaults to data lakes and marts, deliver data infrastructure and big data integration fast. Quickly and easily plan, model, and design all types of data infrastructure projects. Use sophisticated data discovery and profiling capabilities to bulletproof designs, and rapid prototyping to collaborate with business users earlier. Fast-track the development, deployment, and operation of your data infrastructure projects. Dramatically reduce the delivery time, effort, cost, and risk of new projects, and better position projects for future business change.
  • 45
    SAP BW/4HANA
    SAP BW/4HANA is a packaged data warehouse based on SAP HANA. As the on-premise data warehouse layer of SAP’s Business Technology Platform, it allows you to consolidate data across the enterprise to get a consistent, agreed-upon view of your data. Streamline processes and support innovations with a single source for real-time insights. Based on SAP HANA, our next-generation data warehouse solution can help you capitalize on the full value of all your data from SAP applications or third-party solutions, as well as unstructured, geospatial, or Hadoop-based data. Transform data practices to gain the efficiency and agility to deploy live insights at scale, both on premises and in the cloud. Drive digitization across all lines of business with a Big Data warehouse, while leveraging digital business platform solutions from SAP.
  • 46
    VMware HCX

    Broadcom

    Seamlessly extend your on-premises environments into the cloud. VMware HCX streamlines application migration, workload rebalancing, and business continuity across data centers and clouds. It enables large-scale movement of workloads across any VMware platform, from vSphere 5.0+ to any current vSphere version in the cloud or a modern data center, and conversion of KVM and Hyper-V workloads to any current vSphere version. It supports VMware Cloud Foundation, VMware Cloud on AWS, Azure VMware Solution, and more. Choice of migration methodologies to meet your workload needs. Live, large-scale HCX vMotion migration of thousands of VMs. Zero-downtime migration to limit business disruption. Secure proxy for vMotion and replication traffic. Migration planning and visibility dashboard. Automated migration-aware routing with NSX for network connectivity. WAN-optimized links for migration across the Internet or a WAN. High-throughput L2 extension. Advanced traffic engineering to optimize application migration times.
  • 47
    jethro

    jethro

    Data-driven decision-making has unleashed a surge of business data and a rise in user demand to analyze it. This trend drives IT departments to migrate off expensive Enterprise Data Warehouses (EDW) toward cost-effective Big Data platforms like Hadoop or AWS. These new platforms come with a Total Cost of Ownership (TCO) that is about 10 times lower. They are not ideal for interactive BI applications, however, as they fail to match the high performance and user concurrency of legacy EDWs. For this exact reason, we developed Jethro. Customers use Jethro for interactive BI on Big Data. Jethro is a transparent middle tier that requires no changes to existing apps or data. It is self-driving, with no maintenance required. Jethro is compatible with BI tools like Tableau, Qlik, and MicroStrategy, and is data source agnostic. Jethro delivers on the demands of business users, allowing thousands of concurrent users to run complex queries over billions of records.
  • 48
    Txture Cloud Transformation
    Txture Cloud Transformation helps the Cloud Center of Excellence and cloud consulting professionals to save costs, reduce risks and speed up complex cloud transformation projects. By automating assessment and 6R decisions, comparing cloud target architectures and facilitating migration wave planning, Txture drives your cloud transformation from beginning to end. Txture analyzes the IT landscape on the application and infrastructure level, taking business, security and compliance aspects into account when performing cloud assessments. Txture not only compares cloud providers, but also right-sizes their services and highlights cost-saving opportunities from long-term commitments. This enables you to compare your on-premise costs with potential cloud prices, helping to reduce costs throughout your entire cloud journey.
  • 49
    Movebot

    Couchdrop

    Get lightning-fast data movement with Movebot, the cloud-based data migration tool. Move files with ease between over 30 cloud storage platforms, on-premise file servers and mailboxes with our intuitive browser-based data moving tool. Movebot supports SharePoint, Google Workspace, Dropbox, Egnyte, Box, GCP, AWS, Azure, Outlook, Gmail and more, along with on-premise file servers and NAS appliances. There's no infrastructure to manage and Movebot scales to meet your needs automatically. Get started in minutes and move terabytes of data per day. Movebot is priced at $0.75/GB with no user costs or other fees.
    Starting Price: $0.75/GB
  • 50
    Gimmal Migrate
    Your organization has standardized on Microsoft 365 as their cloud platform. It’s your platform of the future, so why continue to pay high maintenance and upgrade costs for a legacy, on-premises system you no longer need? Gimmal Migrate helps organizations fully leverage Microsoft 365 by simplifying complex migrations from Livelink Content Server, Documentum, and file shares to SharePoint Online. As industry leaders of Content Server to SharePoint migrations, our turnkey solution ensures clients can migrate their content quickly and effectively. Increase productivity and achieve large cost savings with a modern cloud platform. Analyze, transform, and move your content to Microsoft 365 as fast as possible our Migration Api Tool (MAPIT). Migration logs allow you to provide legal chain of custody to ensure a compliant migration. By leveraging Microsoft 365, you can quickly realize savings in licensing, support, hardware, staff, and solution complexity.