Open Source Python Reinforcement Learning Frameworks

Python Reinforcement Learning Frameworks

View 26 business solutions

Browse free open source Python Reinforcement Learning Frameworks and projects below. Use the toggles on the left to filter open source Python Reinforcement Learning Frameworks by OS, license, language, programming language, and project status.

  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • Secure User Management, Made Simple | Frontegg Icon
    Secure User Management, Made Simple | Frontegg

    Get 7,500 MAUs, 50 tenants, and 5 SSOs free – integrated into your app with just a few lines of code.

    Frontegg powers modern businesses with a user management platform that’s fast to deploy and built to scale. Embed SSO, multi-tenancy, and a customer-facing admin portal using robust SDKs and APIs – no complex setup required. Designed for the Product-Led Growth era, it simplifies setup, secures your users, and frees your team to innovate. From startups to enterprises, Frontegg delivers enterprise-grade tools at zero cost to start. Kick off today.
    Start for Free
  • 1
    DeepSeek R1

    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training regimen uniquely integrates large-scale reinforcement learning (RL) without relying on supervised fine-tuning, enabling the model to develop advanced reasoning capabilities. This approach has resulted in performance comparable to leading models like OpenAI's o1, while maintaining cost-efficiency. To further support the research community, DeepSeek has released distilled versions of the model based on architectures such as LLaMA and Qwen.
    Downloads: 67 This Week
    Last Update:
    See Project
  • 2
    DeepSeek-V3

    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3 underwent supervised fine-tuning and reinforcement learning to fully realize its capabilities. Evaluations indicate that it outperforms other open-source models and rivals leading closed-source models, achieving this with a training duration of 55 days on 2,048 Nvidia H800 GPUs, costing approximately $5.58 million.
    Downloads: 63 This Week
    Last Update:
    See Project
  • 3
    Weights and Biases

    Weights and Biases

    Tool for visualizing and tracking your machine learning experiments

    Use W&B to build better models faster. Track and visualize all the pieces of your machine learning pipeline, from datasets to production models. Quickly identify model regressions. Use W&B to visualize results in real time, all in a central dashboard. Focus on the interesting ML. Spend less time manually tracking results in spreadsheets and text files. Capture dataset versions with W&B Artifacts to identify how changing data affects your resulting models. Reproduce any model, with saved code, hyperparameters, launch commands, input data, and resulting model weights. Set wandb.config once at the beginning of your script to save your hyperparameters, input settings (like dataset name or model type), and any other independent variables for your experiments. This is useful for analyzing your experiments and reproducing your work in the future. Setting configs also allows you to visualize the relationships between features of your model architecture or data pipeline and model performance.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 4
    Project Malmo

    Project Malmo

    A platform for Artificial Intelligence experimentation on Minecraft

    How can we develop artificial intelligence that learns to make sense of complex environments? That learns from others, including humans, how to interact with the world? That learns transferable skills throughout its existence, and applies them to solve new, challenging problems? Project Malmo sets out to address these core research challenges, addressing them by integrating (deep) reinforcement learning, cognitive science, and many ideas from artificial intelligence. The Malmo platform is a sophisticated AI experimentation platform built on top of Minecraft, and designed to support fundamental research in artificial intelligence. The Project Malmo platform consists of a mod for the Java version, and code that helps artificial intelligence agents sense and act within the Minecraft environment. The two components can run on Windows, Linux, or Mac OS, and researchers can program their agents in any programming language they’re comfortable with.
    Downloads: 6 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    Best-of Machine Learning with Python

    Best-of Machine Learning with Python

    A ranked list of awesome machine learning Python libraries

    This curated list contains 900 awesome open-source projects with a total of 3.3M stars grouped into 34 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome! General-purpose machine learning and deep learning frameworks.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 6
    TextWorld

    TextWorld

    ​TextWorld is a sandbox learning environment for the training

    TextWorld is a learning environment designed to train reinforcement learning agents to play text-based games, where actions and observations are entirely in natural language. Developed by Microsoft Research, TextWorld focuses on language understanding, planning, and interaction in complex, narrative-driven environments. It generates games procedurally, enabling scalable testing of agents’ natural language processing and decision-making abilities.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 7
    TorchRL

    TorchRL

    A modular, primitive-first, python-first PyTorch library

    TorchRL is an open-source Reinforcement Learning (RL) library for PyTorch. TorchRL provides PyTorch and python-first, low and high-level abstractions for RL that are intended to be efficient, modular, documented, and properly tested. The code is aimed at supporting research in RL. Most of it is written in Python in a highly modular way, such that researchers can easily swap components, transform them, or write new ones with little effort.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    gym-pybullet-drones

    gym-pybullet-drones

    PyBullet Gymnasium environments for multi-agent reinforcement

    Gym-PyBullet-Drones is an open-source Gym-compatible environment for training and evaluating reinforcement learning agents on drone control and swarm robotics tasks. It leverages the PyBullet physics engine to simulate quadrotors and provides a platform for studying control, navigation, and coordination of single and multiple drones in 3D space.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    PyBoy

    PyBoy

    Game Boy emulator written in Python

    It is highly recommended to read the report to get a light introduction to Game Boy emulation. But do be aware, that the Python implementation has changed a lot. The report is relevant, even though you want to contribute to another emulator or create your own. If you are looking to make a bot or AI, you can find all the external components in the PyBoy Documentation. There is also a short example on our Wiki page Scripts, AI and Bots as well as in the examples directory. If more features are needed, or if you find a bug, don't hesitate to make an issue here on GitHub, or write on our Discord channel. If you need more details, or if you need to compile from source, check out the detailed installation instructions. We support: macOS, Raspberry Pi (Raspbian), Linux (Ubuntu), and Windows 10.
    Downloads: 3 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 10
    Agent S2

    Agent S2

    Agent S: an open agentic framework that uses computers like a human

    Simular's Agent S2 represents a leap forward in the development of computer-use agents, capable of autonomously interacting with a range of devices and interfaces. By integrating specialized AI models, Agent S2 delivers state-of-the-art performance, whether on desktop systems or smartphones. Through modular architecture, it efficiently handles complex tasks, such as navigating UIs, performing low-level actions like text selection, and executing high-level strategies like planning. Additionally, the system's proactive hierarchical planning allows for real-time adaptation, making it an ideal solution for businesses seeking to streamline operations and automate digital workflows. Agent S2 is designed with flexibility, enabling seamless scaling for future applications and tasks.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    Brax

    Brax

    Massively parallel rigidbody physics simulation

    Brax is a fast and fully differentiable physics engine for large-scale rigid body simulations, built on JAX. It is designed for research in reinforcement learning and robotics, enabling efficient simulations and gradient-based optimization.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Gym

    Gym

    Toolkit for developing and comparing reinforcement learning algorithms

    Gym by OpenAI is a toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents, everything from walking to playing games like Pong or Pinball. Open source interface to reinforce learning tasks. The gym library provides an easy-to-use suite of reinforcement learning tasks. Gym provides the environment, you provide the algorithm. You can write your agent using your existing numerical computation library, such as TensorFlow or Theano. It makes no assumptions about the structure of your agent, and is compatible with any numerical computation library, such as TensorFlow or Theano. The gym library is a collection of test problems — environments — that you can use to work out your reinforcement learning algorithms. These environments have a shared interface, allowing you to write general algorithms.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Habitat-Lab

    Habitat-Lab

    A modular high-level library to train embodied AI agents

    Habitat-Lab is a modular high-level library for end-to-end development in embodied AI. It is designed to train agents to perform a wide variety of embodied AI tasks in indoor environments, as well as develop agents that can interact with humans in performing these tasks. Allowing users to train agents in a wide variety of single and multi-agent tasks (e.g. navigation, rearrangement, instruction following, question answering, human following), as well as define novel tasks. Configuring and instantiating a diverse set of embodied agents, including commercial robots and humanoids, specifying their sensors and capabilities. Providing algorithms for single and multi-agent training (via imitation or reinforcement learning, or no learning at all as in SensePlanAct pipelines), as well as tools to benchmark their performance on the defined tasks using standard metrics.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    OpenRLHF

    OpenRLHF

    An Easy-to-use, Scalable and High-performance RLHF Framework

    OpenRLHF is an easy-to-use, scalable, and high-performance framework for Reinforcement Learning with Human Feedback (RLHF). It supports various training techniques and model architectures.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    robosuite

    robosuite

    A Modular Simulation Framework and Benchmark for Robot Learning

    Robosuite is a modular and extensible simulation framework for robotic manipulation tasks, built on top of MuJoCo. Developed by the ARISE Initiative, Robosuite offers a set of standardized benchmarks and customizable environments designed to advance research in robotic manipulation, control, and imitation learning. It emphasizes realistic simulations and ease of use for both single-task and multi-task learning.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    Acme

    Acme

    A library of reinforcement learning components and agents

    Acme is a framework from DeepMind for building scalable and reproducible reinforcement learning agents. It emphasizes modular components, distributed training, and ease of experimentation.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    AgentUniverse

    AgentUniverse

    agentUniverse is a LLM multi-agent framework

    AgentUniverse is a multi-agent AI framework that enables coordination between multiple intelligent agents for complex task execution and automation.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    AndroidEnv

    AndroidEnv

    RL research on Android devices

    android_env is a reinforcement learning (RL) environment developed by Google DeepMind that enables agents to interact with Android applications directly as a learning environment. It provides a standardized API for training agents to perform tasks on Android apps, supporting tasks ranging from games to productivity apps, making it suitable for research in real-world RL settings.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    AnyTrading

    AnyTrading

    The most simple, flexible, and comprehensive OpenAI Gym trading

    gym-anytrading is an OpenAI Gym-compatible environment designed for developing and testing reinforcement learning algorithms on trading strategies. It simulates trading environments for financial markets, including stocks and forex.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    Catalyst

    Catalyst

    Accelerated deep learning R&D

    Catalyst is a PyTorch framework for accelerated Deep Learning research and development. It allows you to write compact but full-featured Deep Learning pipelines with just a few lines of code. With Catalyst you get a full set of features including a training loop with metrics, model checkpointing and more, all without the boilerplate. Catalyst is focused on reproducibility, rapid experimentation, and codebase reuse so you can break the cycle of writing another regular train loop and make something totally new. Catalyst is compatible with Python 3.6+. PyTorch 1.1+, and has been tested on Ubuntu 16.04/18.04/20.04, macOS 10.15, Windows 10 and Windows Subsystem for Linux. It's part of the PyTorch Ecosystem, as well as the Catalyst Ecosystem which includes Alchemy (experiments logging & visualization) and Reaction (convenient deep learning models serving).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Deep Reinforcement Learning for Keras

    Deep Reinforcement Learning for Keras

    Deep Reinforcement Learning for Keras.

    keras-rl implements some state-of-the-art deep reinforcement learning algorithms in Python and seamlessly integrates with the deep learning library Keras. Furthermore, keras-rl works with OpenAI Gym out of the box. This means that evaluating and playing around with different algorithms is easy. Of course, you can extend keras-rl according to your own needs. You can use built-in Keras callbacks and metrics or define your own. Even more so, it is easy to implement your own environments and even algorithms by simply extending some simple abstract classes. Documentation is available online.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    DouZero

    DouZero

    [ICML 2021] DouZero: Mastering DouDizhu

    DouZero is a reinforcement learning-based AI for playing DouDizhu, a popular Chinese card game. It focuses on perfecting AI strategies for competitive play using value-based deep RL techniques.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    EasyRL

    EasyRL

    Reinforcement learning (RL) tutorial series

    easy-rl is a beginner-friendly reinforcement learning (RL) tutorial series and framework developed by Datawhale China. It provides educational resources and implementations of various RL algorithms to help new researchers and practitioners learn RL concepts.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Google Research Football

    Google Research Football

    Check out the new game server

    Google Research Football is a reinforcement learning environment simulating soccer matches. It focuses on learning complex behaviors such as team collaboration and strategy formation in competitive settings.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    H2O LLM Studio

    H2O LLM Studio

    Framework and no-code GUI for fine-tuning LLMs

    Welcome to H2O LLM Studio, a framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs). You can also use H2O LLM Studio with the command line interface (CLI) and specify the configuration file that contains all the experiment parameters. To finetune using H2O LLM Studio with CLI, activate the pipenv environment by running make shell. With H2O LLM Studio, training your large language model is easy and intuitive. First, upload your dataset and then start training your model. Start by creating an experiment. You can then monitor and manage your experiment, compare experiments, or push the model to Hugging Face to share it with the community.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • Next
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.