Open Source Linux Machine Learning Software - Page 5

Machine Learning Software for Linux

  • 1
    NSFW Data Scraper

    Collection of scripts to aggregate image data

    NSFW Data Scraper is an open-source project that provides scripts for automatically collecting large datasets of images intended for training NSFW image classification systems. The repository focuses on aggregating image data from various online sources so that developers can build datasets suitable for training content moderation models. These datasets typically contain images categorized into different classes associated with adult or explicit content, which can then be used to train neural networks that detect unsafe or inappropriate material. The scripts automate the process of downloading and organizing large volumes of images, significantly reducing the manual effort required to build training datasets. The project was originally created to support research and development of machine learning models capable of identifying explicit or sensitive visual content.
    Downloads: 3 This Week
  • 2
    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI architectures are typically large and require a lot of data and compute for training. NeMo uses PyTorch Lightning for easy and performant multi-GPU/multi-node mixed-precision training. Supported models include Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), and LSTM-CTC. Pre-trained speech processing models are available in the NGC collection.
    Downloads: 3 This Week
  • 3
    NeuralPDE.jl

    Physics-Informed Neural Networks (PINN) Solvers

    NeuralPDE.jl is a Julia library for solving partial differential equations (PDEs) using physics-informed neural networks and scientific machine learning. Built on top of the SciML ecosystem, it provides a flexible and composable interface for defining PDEs and training neural networks to approximate their solutions. NeuralPDE.jl enables hybrid modeling, data-driven discovery, and fast PDE solvers in high dimensions, making it suitable for scientific research and engineering applications.
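    As a rough illustration of the physics-informed idea, the sketch below scores candidate solutions of the toy problem u'(x) = -u(x), u(0) = 1 by how badly they violate the equation and the boundary condition. It is written in plain Python rather than Julia and does not use NeuralPDE.jl's API; the function names, the toy equation, and the finite-difference derivative (standing in for automatic differentiation of a neural network) are all assumptions made for the example.

```python
import math

def physics_residual_loss(u, xs, h=1e-4):
    """Mean squared residual of u'(x) + u(x) = 0 at the collocation points xs."""
    total = 0.0
    for x in xs:
        du_dx = (u(x + h) - u(x - h)) / (2 * h)  # central finite difference
        total += (du_dx + u(x)) ** 2
    return total / len(xs)

def boundary_loss(u):
    """Squared violation of the boundary condition u(0) = 1."""
    return (u(0.0) - 1.0) ** 2

def pinn_style_loss(u, xs):
    """Physics residual plus boundary penalty, the quantity a PINN would minimize."""
    return physics_residual_loss(u, xs) + boundary_loss(u)

xs = [i / 10 for i in range(11)]          # collocation points on [0, 1]
exact = lambda x: math.exp(-x)            # true solution of the toy problem
wrong = lambda x: 1.0 - 0.5 * x           # satisfies u(0) = 1 but not the equation

print(pinn_style_loss(exact, xs))  # close to zero
print(pinn_style_loss(wrong, xs))  # noticeably larger
```

    In an actual physics-informed network, the candidate u would be a neural network and an optimizer would adjust its weights to drive this loss toward zero.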
    Downloads: 3 This Week
  • 4
    Qlib

    Qlib is an AI-oriented quantitative investment platform

    Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment. With Qlib, users can easily try out their ideas to create better quant investment strategies. An increasing number of state-of-the-art (SOTA) quant research works and papers are released in Qlib. At the module level, Qlib consists of loosely coupled components, each of which can be used stand-alone.
    Downloads: 3 This Week
  • 5
    Stanford Machine Learning Course

    Machine learning course programming exercises

    The Stanford Machine Learning Course Exercises repository contains programming assignments from the well-known Stanford Machine Learning online course. It includes implementations of a variety of fundamental algorithms using Python and MATLAB/Octave. The repository covers a broad set of topics such as linear regression, logistic regression, neural networks, clustering, support vector machines, and recommender systems. Each folder corresponds to a specific algorithm or concept, making it easy for learners to navigate and practice. The exercises serve as practical, hands-on reinforcement of theoretical concepts taught in the course. This collection is valuable for students and practitioners who want to strengthen their skills in machine learning through coding exercises.
    Downloads: 3 This Week
  • 6
    TTS

    Deep learning for text to speech

    TTS is a library for advanced text-to-speech generation. Built on the latest research, it was designed to achieve the best trade-off among ease of training, speed, and quality. TTS comes with pre-trained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects. Features include: released models in PyTorch, TensorFlow, and TFLite; tools to curate Text2Speech datasets under dataset_analysis; a demo server for model testing; notebooks for extensive model benchmarking; a modular (but not too much) code base enabling easy testing of new ideas; Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech); a speaker encoder to compute speaker embeddings efficiently; and vocoder models (MelGAN, Multiband-MelGAN, GAN-TTS, ParallelWaveGAN, WaveGrad, WaveRNN). If you are only interested in synthesizing speech with the released TTS models, installing from PyPI is the easiest option.
    Downloads: 3 This Week
  • 7
    TabPFN

    Foundation Model for Tabular Data

    TabPFN is an open-source machine learning system that introduces a foundation model designed specifically for tabular data analysis. The model is based on transformer architectures and implements a prior-data fitted network that can perform supervised learning tasks such as classification and regression with minimal configuration. Unlike many traditional machine learning workflows that require extensive hyperparameter tuning and training cycles, TabPFN is pre-trained to perform inference directly on tabular datasets. This allows it to generate predictions extremely quickly, often within seconds, while maintaining competitive accuracy on small and medium-sized datasets. The system supports a variety of tabular machine learning tasks and is designed to handle structured datasets commonly found in spreadsheets, databases, and business analytics systems.
    Downloads: 3 This Week
  • 8
    Thinc

    A refreshing functional take on deep learning

    Thinc is a lightweight deep learning library that offers an elegant, type-checked, functional-programming API for composing models, with support for layers defined in other frameworks such as PyTorch, TensorFlow and MXNet. You can use Thinc as an interface layer, a standalone toolkit or a flexible way to develop new models. Previous versions of Thinc have been running quietly in production in thousands of companies, via both spaCy and Prodigy. We wrote the new version to let users compose, configure and deploy custom models built with their favorite framework. Switch between PyTorch, TensorFlow and MXNet models without changing your application, or even create mutant hybrids using zero-copy array interchange. Develop faster and catch bugs sooner with sophisticated type checking. Trying to pass a 1-dimensional array into a model that expects two dimensions? That’s a type error. Your editor can pick it up as the code leaves your fingers.
    Downloads: 3 This Week
  • 9
    huggingface_hub

    The official Python client for the Hugging Face Hub

    The huggingface_hub library allows you to interact with the Hugging Face Hub, a platform democratizing open-source Machine Learning for creators and collaborators. Discover pre-trained models and datasets for your projects or play with the thousands of machine-learning apps hosted on the Hub. You can also create and share your own models, datasets, and demos with the community. The huggingface_hub library provides a simple way to do all these things with Python.
    Downloads: 3 This Week
  • 10
    supabase-py

    Python client for Supabase. Query Postgres from Flask, Django, FastAPI

    Python client for Supabase. Query Postgres from Flask, Django, or FastAPI, with support for user authentication, security policies, edge functions, file storage, and realtime data streaming.
    Downloads: 3 This Week
  • 11
    Clustering Variation looks for a good subset of attributes in order to improve the classification accuracy of supervised learning techniques in classification problems that involve a huge number of attributes. It first ranks the attributes by their Variation value, then divides them into two groups, and finally uses a Verification method to select the best group.
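    The rank, split, and verify steps described above can be sketched as follows. The listing does not define the Variation value or the Verification method, so this sketch assumes population variance for the former and leave-one-out 1-nearest-neighbour accuracy for the latter; the function names and the toy data are hypothetical and are not taken from the project.

```python
from statistics import pvariance

def rank_by_variation(rows):
    """Attribute indices sorted from highest to lowest variance (assumed 'Variation')."""
    n_attrs = len(rows[0])
    variation = [pvariance([row[j] for row in rows]) for j in range(n_attrs)]
    return sorted(range(n_attrs), key=lambda j: variation[j], reverse=True)

def loo_1nn_accuracy(rows, labels, attrs):
    """Assumed 'Verification': leave-one-out 1-NN accuracy using only `attrs`."""
    hits = 0
    for i, row in enumerate(rows):
        nearest = min(
            (k for k in range(len(rows)) if k != i),
            key=lambda k: sum((row[j] - rows[k][j]) ** 2 for j in attrs),
        )
        hits += labels[nearest] == labels[i]
    return hits / len(rows)

def select_group(rows, labels, score):
    """Rank attributes, split the ranking in two, keep the better-scoring half."""
    ranking = rank_by_variation(rows)
    half = len(ranking) // 2
    groups = [ranking[:half], ranking[half:]]
    return max(groups, key=lambda group: score(rows, labels, group))

# Toy data: attribute 0 separates the classes, attributes 2 and 3 are noise.
rows = [
    [0.0, 5.0, 0.1, 0.2],
    [0.5, 1.0, 0.2, 0.1],
    [1.0, 4.0, 0.1, 0.1],
    [10.0, 1.5, 0.2, 0.2],
    [10.5, 4.5, 0.1, 0.2],
    [11.0, 5.5, 0.2, 0.1],
]
labels = [0, 0, 0, 1, 1, 1]

print(select_group(rows, labels, loo_1nn_accuracy))  # indices of the chosen attributes
```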
    Downloads: 23 This Week
  • 12
    AI-Tutorials/Implementations Notebooks

    Codes/Notebooks for AI Projects

    AI-Tutorials/Implementations Notebooks repository is a comprehensive collection of artificial intelligence tutorials and implementation examples intended for developers, students, and researchers who want to learn by building practical AI projects. The repository contains numerous Jupyter notebooks and code samples that demonstrate modern techniques in machine learning, deep learning, data science, and large language model workflows. It includes implementations for a wide range of AI topics such as computer vision, agent systems, federated learning, distributed systems, adversarial attacks, and generative AI. Many of the tutorials focus on building AI agents, multi-agent systems, and workflows that integrate language models with external tools or APIs. The codebase acts as a hands-on learning resource, allowing users to experiment with new frameworks, architectures, and machine learning workflows through guided examples.
    Downloads: 2 This Week
  • 13
    AIGC-Interview-Book

    AIGC algorithm engineer interview secrets

    AIGC-Interview-Book is a large educational repository designed to help engineers prepare for technical interviews related to artificial intelligence and generative AI roles. The project compiles knowledge from industry practitioners and researchers into a structured reference covering the AI ecosystem. Topics included in the repository span large language models, generative AI systems, traditional deep learning methods, reinforcement learning, computer vision, natural language processing, and machine learning theory. In addition to technical concepts, the repository also contains interview preparation materials such as practice questions, hiring insights, and career advice for AI engineers. The materials are organized so readers can study fundamental topics as well as advanced research areas that frequently appear in technical interviews.
    Downloads: 2 This Week
  • 14
    AIMET

    AIMET is a library that provides advanced quantization and compression

    Qualcomm Innovation Center (QuIC) is at the forefront of enabling low-power inference at the edge through its pioneering model-efficiency research. QuIC has a mission to help migrate the ecosystem toward fixed-point inference. With this goal, QuIC presents the AI Model Efficiency Toolkit (AIMET) - a library that provides advanced quantization and compression techniques for trained neural network models. AIMET enables neural networks to run more efficiently on fixed-point AI hardware accelerators. Quantized inference is significantly faster than floating point inference. For example, models that we’ve run on the Qualcomm® Hexagon™ DSP rather than on the Qualcomm® Kryo™ CPU have resulted in a 5x to 15x speedup. Plus, an 8-bit model also has a 4x smaller memory footprint relative to a 32-bit model. However, often when quantizing a machine learning model (e.g., from 32-bit floating point to an 8-bit fixed point value), the model accuracy is sacrificed.
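    To make the 4x memory figure and the accuracy trade-off concrete, here is a minimal sketch of generic min-max (affine) quantization from 32-bit floats to 8-bit codes, using only the Python standard library. It is not AIMET's API or one of its techniques; the function names and the sample weights are assumptions for illustration.

```python
from array import array

def quantize_uint8(values):
    """Map floats onto 0..255 with a min-max (affine) scheme."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 when all values are equal
    codes = array("B", (min(255, round((v - lo) / scale)) for v in values))
    return codes, scale, lo

def dequantize(codes, scale, lo):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]

weights = array("f", [-1.0, -0.25, 0.0, 0.5, 1.0])  # 32-bit floats
codes, scale, lo = quantize_uint8(weights)
restored = dequantize(codes, scale, lo)

float_bytes = weights.itemsize * len(weights)
int8_bytes = codes.itemsize * len(codes)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(float_bytes, int8_bytes)  # storage before and after quantization
print(max_error)                # rounding error introduced by quantization
```

    The storage shrinks by the ratio of the item sizes, while each restored value can differ from the original by up to half a quantization step; that rounding error is the source of the accuracy loss mentioned above.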
    Downloads: 2 This Week
  • 15
    Aerosolve

    A machine learning package built for humans

    Aerosolve is an open-source machine learning library developed by Airbnb, designed for interpretable and human-friendly modeling. Built around sparse, human-intuitive features (like geography, pricing), it supports feature quantization, interaction specification, and rule-based priors—enabling domain experts to contribute directly to model behavior.
    Downloads: 2 This Week
  • 16
    Alink

    Alink is the Machine Learning algorithm platform based on Flink

    Alink is Alibaba’s scalable machine learning algorithm platform built on Apache Flink, designed for batch and stream data processing. It provides a wide variety of ready-to-use ML algorithms for tasks like classification, regression, clustering, recommendation, and more. Written in Java and Scala, Alink is suitable for enterprise-grade big data applications where performance and scalability are crucial. It supports model training, evaluation, and deployment in real-time environments and integrates seamlessly into Alibaba’s cloud ecosystem.
    Downloads: 2 This Week
  • 17
    AutoMLOps

    Build MLOps Pipelines in Minutes

    AutoMLOps is a service that generates, provisions, and deploys CI/CD integrated MLOps pipelines, bridging the gap between Data Science and DevOps. AutoMLOps provides a repeatable process that dramatically reduces the time required to build MLOps pipelines. The service generates a containerized MLOps codebase, provides infrastructure-as-code to provision and maintain the underlying MLOps infra, and provides deployment functionalities to trigger and run MLOps pipelines. AutoMLOps gives flexibility over the tools and technologies used in the MLOps pipelines, allowing users to choose from a wide range of options for artifact repositories, build tools, provisioning tools, orchestration frameworks, and source code repositories. AutoMLOps can be configured to either use existing infra, or provision new infra, including source code repositories for versioning the generated MLOps codebase, build configs and triggers, artifact repositories for storing docker containers, storage buckets, etc.
    Downloads: 2 This Week
  • 18
    Caffe

    A fast open framework for deep learning

    Caffe is an open source deep learning framework that’s focused on expression, speed and modularity. It’s got an expressive architecture that encourages application and innovation, and extensible code that’s great for active development. Caffe also offers great speed, capable of processing over 60M images per day with a single NVIDIA K40 GPU. It’s arguably one of the fastest convnet implementations around. Caffe is developed by the Berkeley AI Research (BAIR)/The Berkeley Vision and Learning Center (BVLC) and a great community of contributors that continue to make Caffe state-of-the-art in both code and models. It’s been used in numerous projects, from startup prototypes and academic research projects, to large scale industrial applications.
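    For a sense of scale, the quoted throughput of 60M images per day can be converted into per-second and per-image figures. The short calculation below uses only the number stated in the description and assumes the GPU runs continuously for the full 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60

images_per_day = 60_000_000  # figure quoted in the description
images_per_second = images_per_day / SECONDS_PER_DAY
ms_per_image = 1000 / images_per_second

print(f"{images_per_second:.0f} images/s, {ms_per_image:.2f} ms per image")
# prints "694 images/s, 1.44 ms per image"
```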
    Downloads: 2 This Week
  • 19
    Chemprop

    Message Passing Neural Networks for Molecule Property Prediction

    Chemprop is a repository containing message-passing neural networks for molecular property prediction.
    Downloads: 2 This Week
  • 20
    Computer vision projects

    Fun AI projects related to computer vision

    Computer vision projects is an open-source collection of computer vision projects and experiments that demonstrate practical applications of modern AI techniques in image processing, robotics, and real-time visual analysis. The repository includes multiple demonstration systems implemented using languages such as Python and C++, covering topics ranging from object detection to embedded vision systems. Many of the projects illustrate how computer vision algorithms can interact with hardware platforms, including robotics systems and edge computing devices. The repository provides examples that combine machine learning models with real-world applications such as robotic arms, video analysis, and automated visual measurement systems.
    Downloads: 2 This Week
  • 21
    D2L.ai

    Interactive deep learning book with multi-framework code

    Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 300 universities from 55 countries including Stanford, MIT, Harvard, and Cambridge. This open-source book represents our attempt to make deep learning approachable, teaching you the concepts, the context, and the code. The entire book is drafted in Jupyter notebooks, seamlessly integrating exposition figures, math, and interactive examples with self-contained code. Offers sufficient technical depth to provide a starting point on the path to actually becoming an applied machine learning scientist.
    Downloads: 2 This Week
  • 22
    DGL

    Python package built to ease deep learning on graphs

    Build your models with PyTorch, TensorFlow, or Apache MXNet. DGL offers fast and memory-efficient message passing primitives for training Graph Neural Networks, and scales to giant graphs via multi-GPU acceleration and distributed training infrastructure. DGL empowers a variety of domain-specific projects, including DGL-KE for learning large-scale knowledge graph embeddings, DGL-LifeSci for bioinformatics and cheminformatics, and many others. We are keen on bringing graphs closer to deep learning researchers. We want to make it easy to implement the graph neural network model family. We also want to make the combination of graph-based modules and tensor-based modules (PyTorch or MXNet) as smooth as possible. DGL provides a powerful graph object that can reside on either CPU or GPU. It bundles structural data as well as features for better control. We provide a variety of functions for computing with graph objects, including efficient and customizable message passing primitives for Graph Neural Networks.
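    Message passing, the core primitive mentioned above, can be illustrated in a few lines of plain Python: every node collects the feature vectors of its in-neighbours and sums them. This is a conceptual sketch only; it does not use DGL's graph object or API, and the function name and the tiny graph are made up for the example.

```python
def message_passing_step(edges, features):
    """One round of sum aggregation over directed (src, dst) edges."""
    dim = len(next(iter(features.values())))
    aggregated = {node: [0.0] * dim for node in features}
    for src, dst in edges:
        # The "message" is the source node's feature vector; dst sums what it receives.
        for i, value in enumerate(features[src]):
            aggregated[dst][i] += value
    return aggregated

edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [2.0, 2.0]}

updated = message_passing_step(edges, features)
print(updated)  # {0: [2.0, 2.0], 1: [1.0, 0.0], 2: [1.0, 1.0]}
```

    A graph neural network layer typically follows such an aggregation with a learned transformation of each node's summed messages.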
    Downloads: 2 This Week
  • 23
    Darts

    A python library for easy manipulation and forecasting of time series

    Darts is a Python library for easy manipulation and forecasting of time series. It contains a variety of models, from classics such as ARIMA to deep neural networks. The models can all be used in the same way, using fit() and predict() functions, similar to scikit-learn. The library also makes it easy to backtest models, combine the predictions of several models, and take external data into account. Darts supports both univariate and multivariate time series and models. The ML-based models can be trained on potentially large datasets containing multiple time series, and some of the models offer rich support for probabilistic forecasting. We recommend first setting up a clean Python environment for your project with at least Python 3.7, using your favorite tool (conda, venv, virtualenv with or without virtualenvwrapper).
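    The fit()/predict() convention and the idea of backtesting can be sketched without the library itself. The class and function below are hypothetical stand-ins written in plain Python, not Darts classes or signatures: the model simply repeats the last observed season, and the backtest refits it on a growing history and averages the absolute forecast errors.

```python
class RepeatLastSeason:
    """Hypothetical forecaster that repeats the last observed season."""

    def __init__(self, season_length):
        self.season_length = season_length
        self._last_season = None

    def fit(self, series):
        if len(series) < self.season_length:
            raise ValueError("series is shorter than one season")
        self._last_season = list(series[-self.season_length:])
        return self

    def predict(self, n):
        if self._last_season is None:
            raise RuntimeError("call fit() before predict()")
        return [self._last_season[i % self.season_length] for i in range(n)]

def backtest(model, series, start, horizon=1):
    """Mean absolute error of forecasts made from each time step >= start."""
    errors = []
    for t in range(start, len(series) - horizon + 1):
        forecast = model.fit(series[:t]).predict(horizon)
        actual = series[t:t + horizon]
        errors.extend(abs(f - a) for f, a in zip(forecast, actual))
    return sum(errors) / len(errors)

series = [10, 20, 30, 10, 20, 30, 10, 20, 30]
model = RepeatLastSeason(season_length=3).fit(series)

print(model.predict(4))                                  # [10, 20, 30, 10]
print(backtest(RepeatLastSeason(3), series, start=3))    # 0.0 on this periodic toy series
```

    Because every model exposes the same two methods, a backtest routine like this one can be reused unchanged for any forecaster that follows the convention.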
    Downloads: 2 This Week
  • 24
    DataFrame

    C++ DataFrame for statistical, financial, and ML analysis

    This is a C++ analytical library designed for data analysis, similar to libraries in Python and R; for example, you would compare it to Pandas, R data.frame, or Polars. You can slice the data in many different ways. You can join, merge, and group-by the data. You can run various statistical, summarization, financial, and ML algorithms on the data. You can multi-column sort, custom pick, and delete the data. DataFrame also includes a large collection of analytical algorithms in the form of visitors. These range from basic statistics such as mean, standard deviation, and return to more involved analyses such as Affinity Propagation, Polynomial Fit, and Fast Fourier transform of arbitrary length, and include a good collection of trading indicators. You can also easily add your own algorithms.
    Downloads: 2 This Week
  • 25
    GROBID

    Machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and re-structuring raw documents such as PDF into structured XML/TEI encoded documents, with a particular focus on technical and scientific publications. Development started in 2008 as a hobby, and in 2011 the tool was made available as open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from articles in PDF format covers the usual bibliographical information (e.g. title, abstract, authors, affiliations, keywords, etc.). Reference extraction and parsing from articles in PDF format reaches around .87 F1-score against an independent PubMed Central set of 1943 PDFs containing 90,125 references, and around .89 on a similar bioRxiv set of 2000 PDFs (using the Deep Learning citation model). All the usual publication metadata are covered (including DOI, PMID, etc.).
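    For readers unfamiliar with the metric behind the quoted scores, the F1-score is the harmonic mean of precision and recall. The sketch below shows the calculation; the counts passed to it are made-up illustrative numbers, not GROBID's evaluation data.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall; 0.0 when nothing was matched."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 87 correct extractions, 13 spurious, 13 missed.
print(round(f1_score(87, 13, 13), 2))  # 0.87
```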
    Downloads: 2 This Week