OPT (Meta) vs. PanGu-Σ (Huawei)

About OPT

Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning. Given their computational cost, these models are difficult to replicate without significant capital. For the few that are available through APIs, no access is granted to the full model weights, making them difficult to study. We present Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers. We show that OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint to develop. We are also releasing our logbook detailing the infrastructure challenges we faced, along with code for experimenting with all of the released models.
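The parameter counts quoted above (125M to 175B) follow the usual decoder-only Transformer scaling, and the 175B figure can be reproduced with a back-of-the-envelope count. The layer count (96), width (12288), and vocabulary size (50272) used below are assumptions based on OPT-175B's publicly reported GPT-3-like configuration, not figures stated on this page:

```python
def approx_decoder_params(n_layers, d_model, vocab_size):
    """Rough parameter count for a decoder-only Transformer.

    Per layer: attention projections (4 * d^2 for Q, K, V, O) plus a
    4x-wide MLP (2 * 4 * d^2), i.e. ~12 * d^2 per layer; plus the token
    embedding matrix. Biases, layer norms, and position embeddings are
    ignored, so this slightly undercounts.
    """
    return 12 * n_layers * d_model**2 + vocab_size * d_model

# Assumed OPT-175B shape: 96 layers, d_model = 12288, ~50k-token vocab.
print(f"{approx_decoder_params(96, 12288, 50272) / 1e9:.1f}B")  # prints "174.6B", close to the reported 175B
```

The same function applied to an assumed OPT-125M shape (12 layers, d_model = 768) lands near 125M, which is a quick way to sanity-check a model card's advertised size.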

About PanGu-Σ

Scaling up large language models has driven significant advances in natural language processing, understanding, and generation. This study introduces a system that uses Ascend 910 AI processors and the MindSpore framework to train PanGu-Σ, a language model with over a trillion parameters (1.085T). Building on the foundation laid by PanGu-α, the model transforms the traditionally dense Transformer into a sparse one using Random Routed Experts (RRE). It was trained efficiently on a dataset of 329 billion tokens using Expert Computation and Storage Separation (ECSS), yielding a 6.3-fold increase in training throughput via heterogeneous computing. Experiments indicate that PanGu-Σ sets a new standard in zero-shot learning for various downstream Chinese NLP tasks.
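The core idea behind Random Routed Experts — replacing a learned gating network with a fixed, cheap token-to-expert assignment — can be sketched in a few lines. This is an illustrative toy (the expert count, dimensions, and hash-style assignment are all assumptions for the sketch, not Huawei's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, D = 4, 8

# Toy "experts": one linear map each (weights are random placeholders).
experts = [rng.standard_normal((D, D)) for _ in range(N_EXPERTS)]

def random_routed_layer(tokens, token_ids):
    """Route each token to a fixed expert chosen by a cheap hash of its
    id, instead of a learned gating network, so no router parameters
    need to be trained (an illustrative reading of RRE)."""
    out = np.empty_like(tokens)
    for i, (tok, tid) in enumerate(zip(tokens, token_ids)):
        expert = experts[tid % N_EXPERTS]  # fixed, deterministic assignment
        out[i] = tok @ expert
    return out

x = rng.standard_normal((5, D))
ids = np.arange(5)
y = random_routed_layer(x, ids)
print(y.shape)  # prints "(5, 8)"
```

Because the assignment is deterministic per token id, the same token always reaches the same expert, which is what makes router-free sparse layers cheap to train and easy to shard across devices.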

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Audience

AI developers interested in large language models

Support

Phone Support
24/7 Live Support
Online

API

Offers API

Screenshots and Videos

No images available

Pricing

No information available.
Free Version
Free Trial

Training

Documentation
Webinars
Live Online
In Person

Company Information

Meta
Founded: 2004
United States
www.meta.com

Company Information

Huawei
Founded: 1987
China
huawei.com

Alternatives

  • LTM-1 (Magic AI)
  • T5 (Google)
  • PanGu-α (Huawei)
  • CodeQwen (Alibaba)
  • DeepSeek-V2 (DeepSeek)
  • VideoPoet (Google)
  • Falcon-40B (Technology Innovation Institute (TII))
  • OPT (Meta)

Integrations

PanGu Chat
