Showing 2602 open source projects for "scikit-learn"

  • 1
    scikit-learn

    Machine learning in Python

    scikit-learn is an open source Python module for machine learning built on NumPy, SciPy and matplotlib. It offers simple and efficient tools for predictive data analysis and is reusable in various contexts.
    Downloads: 4 This Week
  • 2
    gplearn

    Genetic Programming in Python, with a scikit-learn inspired API

    gplearn implements Genetic Programming in Python, with a scikit-learn-inspired and compatible API. While Genetic Programming (GP) can be used to perform a very wide variety of tasks, gplearn is purposefully constrained to solving symbolic regression problems. This is motivated by the scikit-learn ethos of having powerful estimators that are straightforward to implement. Symbolic regression is a machine learning technique that aims to identify an underlying mathematical expression that best describes a relationship. ...
    Downloads: 1 This Week
  • 3
    scikit-image

    Image processing in Python

    scikit-image is a collection of algorithms for image processing. It is available free of charge and free of restriction. We pride ourselves on high-quality, peer-reviewed code, written by an active community of volunteers. scikit-image builds on scipy.ndimage to provide a versatile set of image processing routines in Python. This library is developed by its community, and contributions are most welcome!
    Downloads: 1 This Week
  • 4
    Scikit-LLM

    Seamlessly integrate LLMs into scikit-learn

    Seamlessly integrate powerful language models like ChatGPT into scikit-learn for enhanced text analysis tasks. At the moment, the majority of the Scikit-LLM estimators are only compatible with some of the OpenAI models, so a user-provided OpenAI API key is required. Additionally, Scikit-LLM will ensure that the obtained response contains a valid label. If this is not the case, a label will be selected randomly (label probabilities are proportional to label occurrences in the training set). ...
    Downloads: 0 This Week
  • 5
    SKORCH

    A scikit-learn compatible neural network library that wraps PyTorch.
    Downloads: 1 This Week
  • 6
    Learn Elixir

    Learn the Elixir programming language to build functional web apps

    learn-elixir is a practical, beginner-friendly guide to learning Elixir and the Erlang VM (BEAM), emphasizing why Elixir scales—lightweight processes, immutable data, robust GC, and supervisors—and how to apply those strengths in real projects. The repo walks you from installation on macOS, Ubuntu, and Windows to an interactive workflow using iex, Livebook, and even a one-line Docker run for a zero-install setup.
    Downloads: 8 This Week
  • 7
    HyperTools

    A Python toolbox for gaining geometric insights

    HyperTools is a library for visualizing and manipulating high-dimensional data in Python. It is built on top of matplotlib (for plotting), seaborn (for plot styling), and scikit-learn (for data manipulation). Features include functions for plotting high-dimensional datasets in 2D/3D, static and animated plots, a simple API for customizing plot styles, and a set of powerful data manipulation tools including hyperalignment, k-means clustering, normalizing and more. It supports lists of NumPy arrays, pandas DataFrames, text, or (mixed) lists. ...
    Downloads: 0 This Week
  • 8
    imbalanced-learn

    A Python Package to Tackle the Curse of Imbalanced Datasets in ML

    Imbalanced-learn (imported as imblearn) is an open source, MIT-licensed library that relies on scikit-learn (imported as sklearn) and provides tools for classification with imbalanced classes.
    Downloads: 0 This Week
  • 9
    Dask

    Parallel computing with task scheduling

    Dask is a Python library for parallel and distributed computing, designed to scale analytics workloads from single machines to large clusters. It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.
    Downloads: 1 This Week
  • 10
    Learn Git Branching

    An interactive git visualization and tutorial

    LearnGitBranching (LGB) is a Git repository visualizer, sandbox, and interactive tutorial platform that teaches Git concepts through visualization and gamified challenges. Instead of only typing commands into the terminal, users see a live commit tree update dynamically as they experiment with branching, merging, rebasing, and more. It features both sandbox mode for free exploration and structured levels to guide learners through Git fundamentals and advanced workflows. Designed entirely...
    Downloads: 3 This Week
  • 11
    Learn AI Engineering

    Learn AI and LLMs from scratch using free resources

    Learn AI Engineering is a learning path for AI engineering that consolidates high-quality, free resources across the full stack: math, Python foundations, machine learning, deep learning, LLMs, agents, tooling, and deployment. Rather than a loose bookmark list, it organizes topics into a progression so learners can start from fundamentals and move toward practical, production-oriented skills.
    Downloads: 1 This Week
  • 12
    TPOT

    A Python Automated Machine Learning tool that optimizes ML

    Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. TPOT stands for Tree-based Pipeline Optimization Tool.
    Downloads: 0 This Week
  • 13
    handson-ml3

    Fundamentals of Machine Learning and Deep Learning

    handson-ml3 contains the Jupyter notebooks and code for the third edition of the book Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow. It guides readers through modern machine learning and deep learning workflows using Python, with examples spanning data preparation, supervised and unsupervised learning, deep neural networks, RL, and production-ready model deployment. The third edition updates the content for TensorFlow 2 and Keras, introduces new chapters (for example on reinforcement learning or generative models), and offers best-practice code that reflects current ecosystems. ...
    Downloads: 4 This Week
  • 14
    Learn Go with Tests

    Learn Go with test-driven development

    Katas are fun but they are usually limited in their scope for learning a language; you're unlikely to use goroutines to solve a kata. Another problem is when you have varying levels of enthusiasm. Some people just learn way more of the language than others and when demonstrating what they have done end up confusing people with features the others are not familiar with. This ends up making the learning feel quite unstructured and ad hoc. By far the most effective way was by slowly introducing the fundamentals of the language by reading through go by example, exploring them with examples and discussing them as a group. ...
    Downloads: 0 This Week
  • 15
    Python Outlier Detection

    A Python toolbox for scalable outlier detection

    ...PyOD has multiple neural network-based models, e.g., AutoEncoders, which are implemented in both PyTorch and TensorFlow. PyOD contains multiple models that also exist in scikit-learn. It is possible to train and predict with a large number of detection models in PyOD by leveraging the SUOD framework. A benchmark is supplied for select algorithms to provide an overview of the implemented models. In total, 17 benchmark datasets are used for comparison, which can be downloaded at ODDS.
    Downloads: 5 This Week
  • 16
    solo-learn

    Library of self-supervised methods for visual representation

    ...We aim at providing SOTA self-supervised methods in a comparable environment while, at the same time, implementing training tricks. The library is self-contained, but it is possible to use the models outside of solo-learn.
    Downloads: 0 This Week
  • 17
    UMAP

    Uniform Manifold Approximation and Projection

    Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualization similarly to t-SNE, but also for general non-linear dimension reduction. It is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low-dimensional projection of the data that has the closest possible equivalent fuzzy topological structure. First of all, UMAP is fast. It can handle large datasets and high dimensional...
    Downloads: 1 This Week
  • 18
    mlforecast

    Scalable machine learning for time series forecasting

    mlforecast is a time-series forecasting framework built around machine-learning models, designed to make forecasting both efficient and scalable. It lets you apply any regressor that follows the typical scikit-learn API, for example, gradient-boosted trees or linear models, to time-series data by automating much of the messy feature engineering and data preparation. Instead of writing custom code to build lagged features, rolling statistics, and date-based predictors, mlforecast generates those automatically based on a simple configuration. ...
    Downloads: 0 This Week
  • 19
    Pfl Research

    Simulation framework for accelerating research

    A fast, modular Python framework released by Apple for privacy-preserving federated learning (PFL) simulation. Integrates with TensorFlow, PyTorch, and classical ML, and offers high-speed distributed simulation (7–72× faster than alternatives).
    Downloads: 0 This Week
  • 20
    pmdarima

    Statistical library designed to fill the void in Python's time series

    A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
    Downloads: 0 This Week
  • 21
    NGBoost

    Natural Gradient Boosting for Probabilistic Prediction

    ngboost is a Python library that implements Natural Gradient Boosting, as described in "NGBoost: Natural Gradient Boosting for Probabilistic Prediction". It is built on top of Scikit-Learn and is designed to be scalable and modular with respect to the choice of proper scoring rule, distribution, and base learner. A didactic introduction to the methodology underlying NGBoost is available in this slide deck.
    Downloads: 0 This Week
  • 22
    Metaflow

    A framework for real-life data science

    Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.
    Downloads: 1 This Week
  • 23
    The Arcade Learning Environment

    The Arcade Learning Environment (ALE) -- a platform for AI research

    Arcade Learning Environment (ALE) is a widely used open-source framework that wraps hundreds of Atari 2600 games via an emulator and presents them as RL environments for AI agents. It decouples the game/emulation aspects from the agent interface, providing a clean API (C++, Python, Gymnasium) so researchers can focus on agent design rather than game plumbing. This environment suite has been central to many RL breakthroughs, including value-based agents, deep Q-nets, and general-agent...
    Downloads: 0 This Week
  • 24
    Learn Regex The Easy Way

    Learn regex the easy way

    Learn Regex The Easy Way is a hands-on, interactive resource for learning regular expressions (regex) in a step-by-step, incremental way. Rather than just being a reference sheet, it is designed to help you build understanding gradually: you start with the basics like literal matching, then advance through character classes, quantifiers, groups, alternation, lookaheads/lookbehinds, and more advanced regex features.
    Downloads: 0 This Week
  • 25
    Advanced Solutions Lab

    This repo contains notebooks for the Advanced Solutions Lab

    This repository contains Jupyter notebooks meant to be run on Vertex AI. This is maintained by Google Cloud’s Advanced Solutions Lab (ASL) team. Vertex AI is the next-generation AI Platform on the Google Cloud Platform. The material covered in this repo will take a software engineer with no exposure to machine learning to an advanced level.
    Downloads: 0 This Week
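
The Scikit-LLM entry above mentions that an invalid model response is replaced by a randomly selected label, with probabilities proportional to label occurrences in the training set. A minimal sketch of that idea in plain Python might look like the following. The function names `fallback_label` and `validate_or_fallback` are hypothetical and chosen for illustration; this is not Scikit-LLM's actual code or API.

```python
import random
from collections import Counter


def fallback_label(training_labels, rng=None):
    """Pick one label at random, weighted by its frequency in the training set.

    Hypothetical helper for illustration only, not part of Scikit-LLM.
    """
    if rng is None:
        rng = random.Random()
    counts = Counter(training_labels)
    labels = list(counts)
    weights = [counts[label] for label in labels]
    # random.choices samples with probability proportional to the weights.
    return rng.choices(labels, weights=weights, k=1)[0]


def validate_or_fallback(response, training_labels, rng=None):
    """Return the response if it is a known label, else a weighted random label."""
    cleaned = response.strip()
    if cleaned in set(training_labels):
        return cleaned
    return fallback_label(training_labels, rng)


if __name__ == "__main__":
    train = ["positive", "positive", "positive", "negative"]
    print(validate_or_fallback("negative", train))
    # An unknown response falls back to a label drawn from the training
    # distribution; here "positive" is three times as likely as "negative".
    print(validate_or_fallback("not sure", train, random.Random(0)))
```

Passing a seeded `random.Random` instance makes the fallback reproducible, which can be convenient when testing.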