Megatron

Megatron is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This repository is for ongoing research on training large transformer language models at scale. We developed efficient model-parallel (tensor, sequence, and pipeline) and multi-node pre-training techniques for transformer-based models such as GPT, BERT, and T5, using mixed precision.
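
To make the tensor-parallel idea above concrete, here is a minimal sketch in plain Python. It is not taken from the Megatron code base, and every name in it (`matmul`, `gelu`, `mlp_reference`, `mlp_tensor_parallel`, `world_size`) is hypothetical. It assumes the commonly described scheme for a transformer MLP block: the first weight matrix is split by columns, the second by rows, and the per-device partial outputs are summed. The sum here is an in-process stand-in for the all-reduce that a real multi-GPU run would perform.

```python
import math
import random


def matmul(p, q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]


def gelu(m):
    """Elementwise GeLU (tanh approximation)."""
    c = math.sqrt(2.0 / math.pi)
    return [[0.5 * v * (1.0 + math.tanh(c * (v + 0.044715 * v ** 3))) for v in row]
            for row in m]


def mlp_reference(x, a, b):
    """Unsharded MLP block: GeLU(x @ a) @ b."""
    return matmul(gelu(matmul(x, a)), b)


def mlp_tensor_parallel(x, a, b, world_size):
    """Same MLP computed from `world_size` shards (hypothetical helper).

    Each simulated rank holds a column slice of `a` and the matching row
    slice of `b`. Because GeLU is elementwise, it can be applied to each
    column slice independently; only the final sum needs communication.
    """
    hidden = len(b)
    assert hidden % world_size == 0, "hidden size must divide evenly across ranks"
    width = hidden // world_size
    total = None
    for rank in range(world_size):
        lo, hi = rank * width, (rank + 1) * width
        a_shard = [row[lo:hi] for row in a]  # column slice of the first weight
        b_shard = b[lo:hi]                   # row slice of the second weight
        partial = matmul(gelu(matmul(x, a_shard)), b_shard)
        if total is None:
            total = partial
        else:
            # Stand-in for an all-reduce (sum) across devices.
            total = [[t + p for t, p in zip(t_row, p_row)]
                     for t_row, p_row in zip(total, partial)]
    return total


random.seed(0)
x = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]   # 4 tokens, width 8
a = [[random.gauss(0.0, 1.0) for _ in range(32)] for _ in range(8)]  # 8 -> 32
b = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(32)]  # 32 -> 8

full = mlp_reference(x, a, b)
sharded = mlp_tensor_parallel(x, a, b, world_size=4)
max_abs_diff = max(abs(f - s) for f_row, s_row in zip(full, sharded)
                   for f, s in zip(f_row, s_row))
```

The sharded and unsharded results agree up to floating-point rounding, which is why this split lets each device store only a fraction of the weights while producing the same output. Sequence and pipeline parallelism divide the work along other axes and are not covered by this sketch.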

Megatron is also used in NeMo Megatron, a framework to help enterprises overcome the challenges of building and training sophisticated natural language processing models with billions and trillions of parameters.

Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved.

Additional Project Details

Programming Language: Python

Related Categories: Python Large Language Models (LLM), Python Deep Learning Frameworks, Python Generative AI

Registered: 2023-03-23
