Best Open Source ChromeOS Big Data Tools 2026

Big Data Tools for ChromeOS


Browse free open source Big Data tools and projects for ChromeOS below.

1

MOA - Massive Online Analysis

Big Data Stream Analytics Framework.

A framework for learning from a data stream, that is, a continuous supply of examples. It includes classification, regression, clustering, outlier detection, and recommender systems. MOA is related to the WEKA project and is also written in Java, while scaling to adaptive large-scale machine learning.
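To make the idea of learning from a stream concrete, here is a minimal sketch of test-then-train (prequential) evaluation, a common way to score stream learners. It does not use MOA's API; the toy data generator, the `RunningCentroidClassifier` class, and the function names are all hypothetical and exist only for illustration.

```python
import random
from collections import defaultdict
from typing import Iterable, Iterator, Tuple

Example = Tuple[float, int]  # (feature value, class label)


def synthetic_stream(n: int, seed: int = 0) -> Iterator[Example]:
    """Yield n labelled examples one at a time (hypothetical toy data)."""
    rng = random.Random(seed)
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 0 is centred on 0.0, class 1 on 3.0, both with unit spread.
        yield rng.gauss(3.0 * label, 1.0), label


class RunningCentroidClassifier:
    """Keeps one running mean per class and predicts the nearest mean."""

    def __init__(self) -> None:
        self.count = defaultdict(int)
        self.mean = defaultdict(float)

    def predict(self, x: float) -> int:
        if not self.count:
            return 0  # arbitrary default before any example has been seen
        return min(self.count, key=lambda c: abs(x - self.mean[c]))

    def learn(self, x: float, y: int) -> None:
        # Incremental mean update: no past examples are stored.
        self.count[y] += 1
        self.mean[y] += (x - self.mean[y]) / self.count[y]


def prequential_accuracy(stream: Iterable[Example]) -> float:
    """Score each example before training on it, in a single pass."""
    model = RunningCentroidClassifier()
    correct = total = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)  # test first
        model.learn(x, y)                      # then train
        total += 1
    return correct / total if total else 0.0
```

Calling `prequential_accuracy(synthetic_stream(2000))` processes the examples one by one with constant memory, which is the property that distinguishes stream learning from batch learning.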

4 Reviews

Downloads: 51 This Week

Last Update: 2024-07-20
2

Open Source Data Quality and Profiling

World's first open source data quality & data preparation project

This project is dedicated to open source data quality and data preparation solutions. Data quality includes profiling, filtering, governance, similarity checks, data enrichment and alteration, real-time alerting, basket analysis, bubble chart warehouse validation, single customer view, etc., as defined by strategy. The project is developing a high-performance integrated data management platform intended to handle data integration, data profiling, data quality, data preparation, dummy data creation, metadata discovery, anomaly discovery, data cleansing, reporting, and analytics. It also has Hadoop (big data) support for moving files to and from a Hadoop grid and for creating, loading, and profiling Hive tables. This project is also known as "Aggregate Profiler". A RESTful API for this project is being built (beta version) at https://sourceforge.net/projects/restful-api-for-osdq/ and Apache Spark based data quality is being built at https://sourceforge.net/projects/apache-spark-osdq/

8 Reviews

Downloads: 6 This Week

Last Update: 2021-01-20
3

Apache Polaris

Apache Polaris, the interoperable, open source catalog

Apache Polaris is an open-source metadata catalog and data management service designed to manage Apache Iceberg tables in modern data lakehouse environments. It provides a centralized catalog that allows multiple compute engines and analytics systems to interact with the same datasets through a standardized interface. By implementing the Iceberg REST catalog API, Polaris enables distributed data platforms to access shared table metadata without tightly coupling storage systems and query engines. This design allows organizations to run queries on the same Iceberg tables using tools such as Apache Spark, Flink, Trino, and other analytics engines while maintaining consistency across platforms. Polaris also focuses on data governance, security, and interoperability within large-scale cloud data architectures. Because Iceberg tables often exist across many services in a distributed ecosystem, the catalog helps coordinate metadata, schemas, and access policies in a unified system.

Downloads: 1 This Week

Last Update: 2 days ago
4

apache spark data pipeline osDQ

osDQ sub-project dedicated to creating Apache Spark based data pipelines using JSON

This is an offshoot of the open source data quality (osDQ) project (https://sourceforge.net/projects/dataquality/). This sub-project creates an Apache Spark based data pipeline in which JSON-based metadata (a file) is used to run data processing, data pipeline, data quality, data preparation, and data modeling features for big data. It uses the Java API of Apache Spark and can also run in local mode. Get a JSON example at https://github.com/arrahtech/osdq-spark

How to run: unzip the zip file, then run the command for your platform.

Windows: java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c .\example\samplerun.json

Mac/UNIX: java -cp ./lib/*:./osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ./example/samplerun.json

On Windows, you need a Hadoop distribution unzipped on a local drive and HADOOP_HOME set. Also copy winutils.exe into HADOOP_HOME\bin

Downloads: 10 This Week

Last Update: 2019-01-20
5

Apache Doris

MPP-based interactive SQL data warehousing for reporting and analysis

Apache Doris is a modern MPP analytical database product. It provides sub-second queries and efficient real-time data analysis. With its distributed architecture, datasets up to the 10 PB level are well supported and easy to operate. Apache Doris can meet various data analysis demands, including historical data reports, real-time data analysis, interactive data analysis, and exploratory data analysis. Make your data analysis easier! It supports standard SQL and is compatible with the MySQL protocol. The main advantages of Doris are its simplicity (of developing, deploying, and using) and its ability to meet many data serving requirements in a single system. Doris mainly integrates the technology of Google Mesa and Apache Impala; it is based on a column-oriented storage engine and can be accessed with a MySQL client.

Downloads: 0 This Week

Last Update: 4 days ago
6

Big Sack

Big Sack: A lightweight Java Key/Value store with undo and disk cache.

Big Sack is a Java persistence mechanism that allows storage of key/value pairs following the popular Big Data paradigms. It's a very simple and straightforward way to bridge the gap between in-memory data structures and long-term storage. It has the convenience of the Java SDK TreeMap and TreeSet classes and is used in the same easy way, but it adds rollback through undo logging to checkpointed data, so data does not wind up in an unknown state regardless of failures. Data storage in the exabyte range is possible using filesystem and/or memory-mapped IO. Three levels of configurable write-through caching at different granularities ensure performance.
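The combination of a sorted map with undo logging can be sketched in a few lines. This is an in-memory illustration of the general technique only, not Big Sack's API or on-disk format; the class `UndoableSortedMap` and its method names are hypothetical.

```python
import bisect
from typing import Any, Dict, List, Tuple

_MISSING = object()  # sentinel meaning "key did not exist before the write"


class UndoableSortedMap:
    """Sorted key/value map that can roll back to the last checkpoint."""

    def __init__(self) -> None:
        self._keys: List[Any] = []          # keys kept in sorted order
        self._values: Dict[Any, Any] = {}
        self._undo: List[Tuple[Any, Any]] = []  # (key, previous value) log

    def put(self, key: Any, value: Any) -> None:
        old = self._values.get(key, _MISSING)
        self._undo.append((key, old))       # log the old state before writing
        if old is _MISSING:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def remove(self, key: Any) -> None:
        if key not in self._values:
            return
        self._undo.append((key, self._values.pop(key)))
        del self._keys[bisect.bisect_left(self._keys, key)]

    def items(self) -> List[Tuple[Any, Any]]:
        """Return all pairs in ascending key order."""
        return [(k, self._values[k]) for k in self._keys]

    def checkpoint(self) -> None:
        """Accept all changes so far; they can no longer be rolled back."""
        self._undo.clear()

    def rollback(self) -> None:
        """Replay the undo log backwards to restore the last checkpoint."""
        while self._undo:
            key, old = self._undo.pop()
            if old is _MISSING:
                del self._values[key]
                del self._keys[bisect.bisect_left(self._keys, key)]
            else:
                if key not in self._values:
                    bisect.insort(self._keys, key)
                self._values[key] = old
```

A real persistent store would write the undo log to disk before applying each change, so that the rollback can also run after a crash; the sketch keeps everything in memory to stay short.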

Downloads: 0 This Week

Last Update: 2013-12-21
7

Relation Tags

Source code for using Relation Tags.

Source code for using Relation Tags. It is part of the VocabularyMem project but can be used separately. Relation Tags are tags that can be related to one another. For example, the tag "Paris" and the tag "France" can be linked by the relation "is part of". The code was written from scratch and can define which type of relation is used, based on the most elementary mathematical properties. Reading the "Relation Tags guide for programmers" is strongly recommended. The source zip also contains dialogs for setting the properties of these extended tags; all of these dialog files end in either "...dlg.cpp" or "...dlg.h". Please read the "readme" file. Using a binary matrix class such as BinMatrix is recommended to get enough speed when calculating implicit relations in a system of bogus tags with big data. The code needs to be compiled with C++11 and the Qt libraries.
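As an illustration of why a binary matrix helps with implicit relations, the sketch below computes the transitive closure of a relation such as "is part of" using Warshall's algorithm, with each matrix row stored as an integer bitmask. It is not the project's BinMatrix class; the function name is hypothetical, it assumes the relation is transitive, and the "Europe" tag is added purely for the example.

```python
from typing import Dict, List, Tuple


def transitive_closure(
    tags: List[str], pairs: List[Tuple[str, str]]
) -> Dict[str, List[str]]:
    """Return, for each tag, every tag it is related to directly or implicitly."""
    index = {tag: i for i, tag in enumerate(tags)}

    # Row i is a bitmask: bit j is set when tags[i] is related to tags[j].
    rows = [0] * len(tags)
    for src, dst in pairs:
        rows[index[src]] |= 1 << index[dst]

    # Warshall's algorithm: if i reaches k, then i also reaches all k reaches.
    # OR-ing whole rows handles every column of the matrix in one operation.
    for k in range(len(tags)):
        k_bit = 1 << k
        for i in range(len(tags)):
            if rows[i] & k_bit:
                rows[i] |= rows[k]

    return {
        tag: [other for j, other in enumerate(tags) if (rows[i] >> j) & 1]
        for tag, i in index.items()
    }
```

With the explicit pairs ("Paris", "France") and ("France", "Europe"), the closure also reports that "Paris" is part of "Europe", a relation that was never stored explicitly.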

Downloads: 0 This Week

Last Update: 2015-08-11
8

Universal Java Matrix Package

sparse and dense matrix, linear algebra, visualization, big data

The Universal Java Matrix Package (UJMP) is an open source Java library which provides sparse and dense matrix classes, as well as a large number of calculations for linear algebra such as matrix multiplication or matrix inverse. Operations such as mean, correlation, standard deviation, replacement of missing values or the calculation of mutual information are supported, too. The Universal Java Matrix Package provides various visualization methods, import and export filters for a large number of file formats, and even the possibility to link to JDBC databases. Multi-dimensional matrices as well as generic matrices with a specified object type are supported and very large matrices can be handled even when they do not fit into memory.
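The benefit of a sparse representation for matrix multiplication can be shown with a small sketch. It stores only non-zero entries in a dictionary keyed by (row, column); this is a generic illustration, not UJMP's API or its internal storage scheme, and the name `sparse_matmul` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple

Sparse = Dict[Tuple[int, int], float]  # (row, column) -> non-zero value


def sparse_matmul(a: Sparse, b: Sparse) -> Sparse:
    """Multiply two sparse matrices, touching only their non-zero entries."""
    # Group the entries of b by row so each entry of a finds its partners fast.
    b_rows = defaultdict(list)
    for (k, j), value in b.items():
        b_rows[k].append((j, value))

    out = defaultdict(float)
    for (i, k), a_value in a.items():
        for j, b_value in b_rows.get(k, ()):
            out[(i, j)] += a_value * b_value

    # Drop entries that cancelled out to zero to keep the result sparse.
    return {pos: value for pos, value in out.items() if value != 0.0}
```

The work done is proportional to the number of matching non-zero pairs rather than to the full matrix dimensions, which is what makes very large, mostly empty matrices tractable.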

1 Review

Downloads: 0 This Week

Last Update: 2015-08-19