Alternatives to Llama

Compare Llama alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to Llama in 2026. Compare features, ratings, user reviews, pricing, and more from Llama competitors and alternatives in order to make an informed decision for your business.

  • 1
    Vertex AI
    Build, deploy, and scale machine learning (ML) models faster, with fully managed ML tools for any use case. Through Vertex AI Workbench, Vertex AI is natively integrated with BigQuery, Dataproc, and Spark. You can use BigQuery ML to create and execute machine learning models in BigQuery using standard SQL queries on existing business intelligence tools and spreadsheets, or you can export datasets from BigQuery directly into Vertex AI Workbench and run your models from there. Use Vertex Data Labeling to generate highly accurate labels for your data collection. Vertex AI Agent Builder enables developers to create and deploy enterprise-grade generative AI applications. It offers both no-code and code-first approaches, allowing users to build AI agents using natural language instructions or by leveraging frameworks like LangChain and LlamaIndex.
    Compare vs. Llama View Software
    Visit Website
  • 2
    Gemini

    Gemini

    Google

    Gemini is Google’s advanced AI assistant designed to help users think, create, learn, and complete tasks with a new level of intelligence. Powered by Google’s most capable models, including Gemini 3, it enables users to ask complex questions, generate content, analyze information, and explore ideas through natural conversation. Gemini can create images, videos, summaries, study plans, and first drafts while also providing feedback on uploaded files and written work. The platform is grounded in Google Search, allowing it to deliver accurate, up-to-date information and support deep follow-up questions. Gemini connects seamlessly with Google apps like Gmail, Docs, Calendar, Maps, YouTube, and Photos to help users complete tasks without switching tools. Features such as Gemini Live, Deep Research, and Gems enhance brainstorming, research, and personalized workflows. Available through flexible free and paid plans, Gemini supports everyday users, students, and professionals across devices.
  • 3
    Mistral AI

    Mistral AI

    Mistral AI

    Mistral AI is a pioneering artificial intelligence startup specializing in open-source generative AI. The company offers a range of customizable, enterprise-grade AI solutions deployable across various platforms, including on-premises, cloud, edge, and devices. Flagship products include "Le Chat," a multilingual AI assistant designed to enhance productivity in both personal and professional contexts, and "La Plateforme," a developer platform that enables the creation and deployment of AI-powered applications. Committed to transparency and innovation, Mistral AI positions itself as a leading independent AI lab, contributing significantly to open-source AI and policy development.
  • 4
    Galactica
    Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large mass of information. Today scientific knowledge is accessed through search engines, but they are unable to organize scientific knowledge alone. Galactica is a large language model that can store, combine and reason about scientific knowledge. We train on a large scientific corpus of papers, reference material, knowledge bases and many other sources. We outperform existing models on a range of scientific tasks. On technical knowledge probes such as LaTeX equations, Galactica outperforms the latest GPT-3 by 68.2% versus 49.0%. Galactica also performs well on reasoning, outperforming Chinchilla on mathematical MMLU by 41.3% to 35.7%, and PaLM 540B on MATH with a score of 20.4% versus 8.8%.
  • 5
    Gemini Nano
    Gemini Nano from Google is a lightweight, energy-efficient AI model designed for high performance in compact, resource-constrained environments. Tailored for edge computing and mobile applications, Gemini Nano combines Google's advanced AI architecture with cutting-edge optimization techniques to deliver seamless performance without compromising speed or accuracy. Despite its compact size, it excels in tasks like voice recognition, natural language processing, real-time translation, and personalized recommendations. With a focus on privacy and efficiency, Gemini Nano processes data locally, minimizing reliance on cloud infrastructure while maintaining robust security. Its adaptability and low power consumption make it an ideal choice for smart devices, IoT ecosystems, and on-the-go AI solutions.
  • 6
    Gemma

    Gemma

    Google

    Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is inspired by Gemini, and the name reflects the Latin gemma, meaning “precious stone.” Accompanying our model weights, we’re also releasing tools to support developer innovation, foster collaboration, and guide the responsible use of Gemma models. Gemma models share technical and infrastructure components with Gemini, our largest and most capable AI model widely available today. This enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models. And Gemma models are capable of running directly on a developer laptop or desktop computer. Notably, Gemma surpasses significantly larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs.
  • 7
    GPT-3

    GPT-3

    OpenAI

    Our GPT-3 models can understand and generate natural language. We offer four main models with different levels of power suitable for different tasks. Davinci is the most capable model, and Ada is the fastest. The main GPT-3 models are meant to be used with the text completion endpoint. We also offer models that are specifically meant to be used with other endpoints. Davinci is the most capable model family and can perform any task the other models can perform and often with less instruction. For applications requiring a lot of understanding of the content, like summarization for a specific audience and creative content generation, Davinci is going to produce the best results. These increased capabilities require more compute resources, so Davinci costs more per API call and is not as fast as the other models.
    Starting Price: $0.0200 per 1000 tokens
  • 8
    GPT-3.5

    GPT-3.5

    OpenAI

    GPT-3.5 is the next evolution of GPT 3 large language model from OpenAI. GPT-3.5 models can understand and generate natural language. We offer four main models with different levels of power suitable for different tasks. The main GPT-3.5 models are meant to be used with the text completion endpoint. We also offer models that are specifically meant to be used with other endpoints. Davinci is the most capable model family and can perform any task the other models can perform and often with less instruction. For applications requiring a lot of understanding of the content, like summarization for a specific audience and creative content generation, Davinci is going to produce the best results. These increased capabilities require more compute resources, so Davinci costs more per API call and is not as fast as the other models.
    Starting Price: $0.0200 per 1000 tokens
  • 9
    GPT-4

    GPT-4

    OpenAI

    GPT-4 (Generative Pre-trained Transformer 4) is a large-scale unsupervised language model, yet to be released by OpenAI. GPT-4 is the successor to GPT-3 and part of the GPT-n series of natural language processing models, and was trained on a dataset of 45TB of text to produce human-like text generation and understanding capabilities. Unlike most other NLP models, GPT-4 does not require additional training data for specific tasks. Instead, it can generate text or answer questions using only its own internally generated context as input. GPT-4 has been shown to be able to perform a wide variety of tasks without any task specific training data such as translation, summarization, question answering, sentiment analysis and more.
    Starting Price: $0.0200 per 1000 tokens
  • 10
    GPT-J

    GPT-J

    EleutherAI

    GPT-J is a cutting-edge language model created by the research organization EleutherAI. In terms of performance, GPT-J exhibits a level of proficiency comparable to that of OpenAI's renowned GPT-3 model in a range of zero-shot tasks. Notably, GPT-J has demonstrated the ability to surpass GPT-3 in tasks related to generating code. The latest iteration of this language model, known as GPT-J-6B, is built upon a linguistic dataset referred to as The Pile. This dataset, which is publicly available, encompasses a substantial volume of 825 gibibytes of language data, organized into 22 distinct subsets. While GPT-J shares certain capabilities with ChatGPT, it is important to note that GPT-J is not designed to operate as a chatbot; rather, its primary function is to predict text. In a significant development in March 2023, Databricks introduced Dolly, a model that follows instructions and is licensed under Apache.
    Starting Price: Free
  • 11
    GPT-NeoX

    GPT-NeoX

    EleutherAI

    An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library. This repository records EleutherAI's library for training large-scale language models on GPUs. Our current framework is based on NVIDIA's Megatron Language Model and has been augmented with techniques from DeepSpeed as well as some novel optimizations. We aim to make this repo a centralized and accessible place to gather techniques for training large-scale autoregressive language models, and accelerate research into large-scale training.
    Starting Price: Free
  • 12
    Falcon-40B

    Falcon-40B

    Technology Innovation Institute (TII)

    Falcon-40B is a 40B parameters causal decoder-only model built by TII and trained on 1,000B tokens of RefinedWeb enhanced with curated corpora. It is made available under the Apache 2.0 license. Why use Falcon-40B? It is the best open-source model currently available. Falcon-40B outperforms LLaMA, StableLM, RedPajama, MPT, etc. See the OpenLLM Leaderboard. It features an architecture optimized for inference, with FlashAttention and multiquery. It is made available under a permissive Apache 2.0 license allowing for commercial use, without any royalties or restrictions. ⚠️ This is a raw, pretrained model, which should be further finetuned for most usecases. If you are looking for a version better suited to taking generic instructions in a chat format, we recommend taking a look at Falcon-40B-Instruct.
    Starting Price: Free
  • 13
    Falcon-7B

    Falcon-7B

    Technology Innovation Institute (TII)

    Falcon-7B is a 7B parameters causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. It is made available under the Apache 2.0 license. Why use Falcon-7B? It outperforms comparable open-source models (e.g., MPT-7B, StableLM, RedPajama etc.), thanks to being trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. See the OpenLLM Leaderboard. It features an architecture optimized for inference, with FlashAttention and multiquery. It is made available under a permissive Apache 2.0 license allowing for commercial use, without any royalties or restrictions.
    Starting Price: Free
  • 14
    FLAN-T5

    FLAN-T5

    Google

    FLAN-T5 was released in the paper Scaling Instruction-Finetuned Language Models - it is an enhanced version of T5 that has been finetuned in a mixture of tasks.
    Starting Price: Free
  • 15
    Hugging Face

    Hugging Face

    Hugging Face

    Hugging Face is a leading platform for AI and machine learning, offering a vast hub for models, datasets, and tools for natural language processing (NLP) and beyond. The platform supports a wide range of applications, from text, image, and audio to 3D data analysis. Hugging Face fosters collaboration among researchers, developers, and companies by providing open-source tools like Transformers, Diffusers, and Tokenizers. It enables users to build, share, and access pre-trained models, accelerating AI development for a variety of industries.
    Starting Price: $9 per month
  • 16
    ERNIE Bot
    ERNIE Bot is an AI-powered conversational assistant developed by Baidu, designed to facilitate seamless and natural interactions with users. Built on the ERNIE (Enhanced Representation through Knowledge Integration) model, ERNIE Bot excels at understanding complex queries and generating human-like responses across various domains. Its capabilities include processing text, generating images, and engaging in multimodal communication, making it suitable for a wide range of applications such as customer support, virtual assistants, and enterprise automation. With its advanced contextual understanding, ERNIE Bot offers an intuitive and efficient solution for businesses seeking to enhance their digital interactions and automate workflows.
    Starting Price: Free
  • 17
    Dolly

    Dolly

    Databricks

    Dolly is a cheap-to-build LLM that exhibits a surprising degree of the instruction following capabilities exhibited by ChatGPT. Whereas the work from the Alpaca team showed that state-of-the-art models could be coaxed into high quality instruction-following behavior, we find that even years-old open source models with much earlier architectures exhibit striking behaviors when fine tuned on a small corpus of instruction training data. Dolly works by taking an existing open source 6 billion parameter model from EleutherAI and modifying it ever so slightly to elicit instruction following capabilities such as brainstorming and text generation not present in the original model, using data from Alpaca.
    Starting Price: Free
  • 18
    Hermes 4

    Hermes 4

    Nous Research

    Hermes 4 is the latest evolution in Nous Research’s line of neutrally aligned, steerable foundational models, featuring novel hybrid reasoners that can dynamically shift between expressive, creative responses and efficient, standard replies based on user prompts. The model is designed to respond to system and user instructions, rather than adhering to any corporate ethics framework, producing interactions that feel more humanistic, less lecturing or sycophantic, and encouraging roleplay and creativity. By incorporating a special tag in prompts, users can trigger deeper, internally token-intensive reasoning when tackling complex problems, while retaining prompt efficiency when such depth isn't required. Trained on a dataset 50 times larger than that of Hermes 3, much of which was synthetically generated using Atropos, Hermes 4 shows significant performance improvements.
    Starting Price: Free
  • 19
    Meta AI
    Meta AI is an intelligent assistant that is capable of complex reasoning, following instructions, visualizing ideas, and solving nuanced problems. Meta AI is an intelligent assistant built on Meta's most advanced model. It is designed to answer any question you might have, help with writing, provide step-by-step advice, and create images to share with friends. It is available within Meta's family of apps, smart glasses, and web platforms.
  • 20
    Llama 2
    The next generation of our open source large language model. This release includes model weights and starting code for pretrained and fine-tuned Llama language models — ranging from 7B to 70B parameters. Llama 2 pretrained models are trained on 2 trillion tokens, and have double the context length than Llama 1. Its fine-tuned models have been trained on over 1 million human annotations. Llama 2 outperforms other open source language models on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests. Llama 2 was pretrained on publicly available online data sources. The fine-tuned model, Llama-2-chat, leverages publicly available instruction datasets and over 1 million human annotations. We have a broad range of supporters around the world who believe in our open approach to today’s AI — companies that have given early feedback and are excited to build with Llama 2.
    Starting Price: Free
  • 21
    MPT-7B

    MPT-7B

    MosaicML

    Introducing MPT-7B, the latest entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Now you can train, finetune, and deploy your own private MPT models, either starting from one of our checkpoints or training from scratch. For inspiration, we are also releasing three finetuned models in addition to the base MPT-7B: MPT-7B-Instruct, MPT-7B-Chat, and MPT-7B-StoryWriter-65k+, the last of which uses a context length of 65k tokens!
    Starting Price: Free
  • 22
    MosaicML

    MosaicML

    MosaicML

    Train and serve large AI models at scale with a single command. Point to your S3 bucket and go. We handle the rest, orchestration, efficiency, node failures, and infrastructure. Simple and scalable. MosaicML enables you to easily train and deploy large AI models on your data, in your secure environment. Stay on the cutting edge with our latest recipes, techniques, and foundation models. Developed and rigorously tested by our research team. With a few simple steps, deploy inside your private cloud. Your data and models never leave your firewalls. Start in one cloud, and continue on another, without skipping a beat. Own the model that's trained on your own data. Introspect and better explain the model decisions. Filter the content and data based on your business needs. Seamlessly integrate with your existing data pipelines, experiment trackers, and other tools. We are fully interoperable, cloud-agnostic, and enterprise proved.
  • 23
    OpenEuroLLM

    OpenEuroLLM

    OpenEuroLLM

    OpenEuroLLM is a collaborative initiative among Europe's leading AI companies and research institutions to develop a series of open-source foundation models for transparent AI in Europe. The project emphasizes transparency by openly sharing data, documentation, training, testing code, and evaluation metrics, fostering community involvement. It ensures compliance with EU regulations, aiming to provide performant large language models that align with European standards. A key focus is on linguistic and cultural diversity, extending multilingual capabilities to encompass all EU official languages and beyond. The initiative seeks to enhance access to foundational models ready for fine-tuning across various applications, expand evaluation results in multiple languages, and increase the availability of training datasets and benchmarks. Transparency is maintained throughout the training processes by sharing tools, methodologies, and intermediate results.
  • 24
    OpenLLaMA

    OpenLLaMA

    OpenLLaMA

    OpenLLaMA is a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset. Our model weights can serve as the drop in replacement of LLaMA 7B in existing implementations. We also provide a smaller 3B variant of LLaMA model.
    Starting Price: Free
  • 25
    Ollama

    Ollama

    Ollama

    Ollama is an innovative platform that focuses on providing AI-powered tools and services, designed to make it easier for users to interact with and build AI-driven applications. Run AI models locally. By offering a range of solutions, including natural language processing models and customizable AI features, Ollama empowers developers, businesses, and organizations to integrate advanced machine learning technologies into their workflows. With an emphasis on usability and accessibility, Ollama strives to simplify the process of working with AI, making it an appealing option for those looking to harness the potential of artificial intelligence in their projects.
    Starting Price: Free
  • 26
    Megatron-Turing
    Megatron-Turing Natural Language Generation model (MT-NLG), is the largest and the most powerful monolithic transformer English language model with 530 billion parameters. This 105-layer, transformer-based MT-NLG improves upon the prior state-of-the-art models in zero-, one-, and few-shot settings. It demonstrates unmatched accuracy in a broad set of natural language tasks such as, Completion prediction, Reading comprehension, Commonsense reasoning, Natural language inferences, Word sense disambiguation, etc. With the intent of accelerating research on the largest English language model till date and enabling customers to experiment, employ and apply such a large language model on downstream language tasks - NVIDIA is pleased to announce an Early Access program for its managed API service to MT-NLG mode.
  • 27
    Pythia

    Pythia

    EleutherAI

    Pythia combines interpretability analysis and scaling laws to understand how knowledge develops and evolves during training in autoregressive transformers.
    Starting Price: Free
  • 28
    SmolLM2

    SmolLM2

    Hugging Face

    SmolLM2 is a collection of state-of-the-art, compact language models developed for on-device applications. The models in this collection range from 1.7B parameters to smaller 360M and 135M versions, designed to perform efficiently even on less powerful hardware. These models excel in text generation tasks and are optimized for real-time, low-latency applications, providing high-quality results across various use cases, including content creation, coding assistance, and natural language processing. SmolLM2's flexibility makes it a suitable choice for developers looking to integrate powerful AI into mobile devices, edge computing, and other resource-constrained environments.
    Starting Price: Free
  • 29
    T5

    T5

    Google

    With T5, we propose reframing all NLP tasks into a unified text-to-text-format where the input and output are always text strings, in contrast to BERT-style models that can only output either a class label or a span of the input. Our text-to-text framework allows us to use the same model, loss function, and hyperparameters on any NLP task, including machine translation, document summarization, question answering, and classification tasks (e.g., sentiment analysis). We can even apply T5 to regression tasks by training it to predict the string representation of a number instead of the number itself.
  • 30
    Stable Beluga

    Stable Beluga

    Stability AI

    Stability AI and its CarperAI lab proudly announce Stable Beluga 1 and its successor Stable Beluga 2 (formerly codenamed FreeWilly), two powerful new, open access, Large Language Models (LLMs). Both models demonstrate exceptional reasoning ability across varied benchmarks. Stable Beluga 1 leverages the original LLaMA 65B foundation model and was carefully fine-tuned with a new synthetically-generated dataset using Supervised Fine-Tune (SFT) in standard Alpaca format. Similarly, Stable Beluga 2 leverages the LLaMA 2 70B foundation model to achieve industry-leading performance.
    Starting Price: Free
  • 31
    Stable LM

    Stable LM

    Stability AI

    Stable LM: Stability AI Language Models. The release of Stable LM builds on our experience in open-sourcing earlier language models with EleutherAI, a nonprofit research hub. These language models include GPT-J, GPT-NeoX, and the Pythia suite, which were trained on The Pile open-source dataset. Many recent open-source language models continue to build on these efforts, including Cerebras-GPT and Dolly-2. Stable LM is trained on a new experimental dataset built on The Pile, but three times larger with 1.5 trillion tokens of content. We will release details on the dataset in due course. The richness of this dataset gives Stable LM surprisingly high performance in conversational and coding tasks, despite its small size of 3 to 7 billion parameters (by comparison, GPT-3 has 175 billion parameters). Stable LM 3B is a compact language model designed to operate on portable digital devices like handhelds and laptops, and we’re excited about its capabilities and portability.
    Starting Price: Free
  • 32
    PaLM

    PaLM

    Google

    PaLM API is an easy and safe way to build on top of our best language models. Today, we’re making an efficient model available, in terms of size and capabilities, and we’ll add other sizes soon. The API also comes with an intuitive tool called MakerSuite, which lets you quickly prototype ideas and, over time, will have features for prompt engineering, synthetic data generation and custom-model tuning — all supported by robust safety tools. Select developers can access the PaLM API and MakerSuite in Private Preview today, and stay tuned for our waitlist soon.
  • 33
    PanGu-α

    PanGu-α

    Huawei

    PanGu-α is developed under the MindSpore and trained on a cluster of 2048 Ascend 910 AI processors. The training parallelism strategy is implemented based on MindSpore Auto-parallel, which composes five parallelism dimensions to scale the training task to 2048 processors efficiently, including data parallelism, op-level model parallelism, pipeline model parallelism, optimizer model parallelism and rematerialization. To enhance the generalization ability of PanGu-α, we collect 1.1TB high-quality Chinese data from a wide range of domains to pretrain the model. We empirically test the generation ability of PanGu-α in various scenarios including text summarization, question answering, dialogue generation, etc. Moreover, we investigate the effect of model scales on the few-shot performances across a broad range of Chinese NLP tasks. The experimental results demonstrate the superior capabilities of PanGu-α in performing various tasks under few-shot or zero-shot settings.
  • 34
    OPT

    OPT

    Meta

    Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning. Given their computational cost, these models are difficult to replicate without significant capital. For the few that are available through APIs, no access is granted to the full model weights, making them difficult to study. We present Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers. We show that OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint to develop. We are also releasing our logbook detailing the infrastructure challenges we faced, along with code for experimenting with all of the released models.
  • 35
    Sonar

    Sonar

    Perplexity

    Perplexity has recently introduced an enhanced version of its AI search engine, named Sonar. Built upon the Llama 3.3 70B model, Sonar has undergone additional training to improve the factual accuracy and readability of responses in Perplexity's default search mode. This advancement aims to deliver users more precise and comprehensible answers while maintaining the platform's characteristic efficiency and speed. Sonar also provides real-time, web-wide research and Q&A capabilities, allowing developers to integrate these features into their products through a lightweight, cost-effective, and user-friendly API. The Sonar API supports advanced models like sonar-reasoning-pro and sonar-pro, designed for complex tasks requiring deep understanding and context retention. These models offer detailed answers with an average of twice as many citations as previous versions, enhancing the transparency and reliability of the information provided.
    Starting Price: Free
  • 36
    Vicuna

    Vicuna

    lmsys.org

    Vicuna-13B is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieves more than 90%* quality of OpenAI ChatGPT and Google Bard while outperforming other models like LLaMA and Stanford Alpaca in more than 90%* of cases. The cost of training Vicuna-13B is around $300. The code and weights, along with an online demo, are publicly available for non-commercial use.
    Starting Price: Free
  • 37
    RedPajama

    RedPajama

    RedPajama

    Foundation models such as GPT-4 have driven rapid improvement in AI. However, the most powerful models are closed commercial models or only partially open. RedPajama is a project to create a set of leading, fully open-source models. Today, we are excited to announce the completion of the first step of this project: the reproduction of the LLaMA training dataset of over 1.2 trillion tokens. The most capable foundation models today are closed behind commercial APIs, which limits research, customization, and their use with sensitive data. Fully open-source models hold the promise of removing these limitations, if the open community can close the quality gap between open and closed models. Recently, there has been much progress along this front. In many ways, AI is having its Linux moment. Stable Diffusion showed that open-source can not only rival the quality of commercial offerings like DALL-E but can also lead to incredible creativity from broad participation by communities.
    Starting Price: Free
  • 38
    Teuken 7B

    Teuken 7B

    OpenGPT-X

    Teuken-7B is a multilingual, open source language model developed under the OpenGPT-X initiative, specifically designed to cater to Europe's diverse linguistic landscape. It has been trained on a dataset comprising over 50% non-English texts, encompassing all 24 official languages of the European Union, ensuring robust performance across these languages. A key innovation in Teuken-7B is its custom multilingual tokenizer, optimized for European languages, which enhances training efficiency and reduces inference costs compared to standard monolingual tokenizers. The model is available in two versions, Teuken-7B-Base, the foundational pre-trained model, and Teuken-7B-Instruct, which has undergone instruction tuning for improved performance in following user prompts. Both versions are accessible on Hugging Face, promoting transparency and collaboration within the AI community. The development of Teuken-7B underscores a commitment to creating AI models that reflect Europe's diversity.
    Starting Price: Free
  • 39
    Qwen

    Qwen

    Alibaba

    Qwen is a powerful, free AI assistant built on the advanced Qwen model series, designed to help anyone with creativity, research, problem-solving, and everyday tasks. While Qwen Chat is the main interface for most users, Qwen itself powers a broad range of intelligent capabilities including image generation, deep research, website creation, advanced reasoning, and context-aware search. Its multimodal intelligence enables Qwen to understand and process text, images, audio, and video simultaneously for richer insights. Qwen is available on web, desktop, and mobile, ensuring seamless access across all devices. For developers, the Qwen API provides OpenAI-compatible endpoints, making integration simple and allowing Qwen’s intelligence to power apps, services, and automation. Whether you're chatting through Qwen Chat or building with the Qwen API, Qwen delivers fast, flexible, and highly capable AI support.
  • 40
    mT5

    mT5

    Google

    Multilingual T5 (mT5) is a massively multilingual pretrained text-to-text transformer model, trained following a similar recipe as T5. This repo can be used to reproduce the experiments in the mT5 paper. mT5 is pretrained on the mC4 corpus, covering 101 languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, and more.
    Starting Price: Free
  • 41
    BitNet

    BitNet

    Microsoft

    The BitNet b1.58 2B4T is a cutting-edge 1-bit Large Language Model (LLM) developed by Microsoft, designed to enhance computational efficiency while maintaining high performance. This model, built with approximately 2 billion parameters and trained on 4 trillion tokens, uses innovative quantization techniques to optimize memory usage, energy consumption, and latency. The platform supports multiple modalities and is particularly valuable for applications in AI-powered text generation, offering substantial efficiency gains compared to full-precision models.
    Starting Price: Free
  • 42
    Alpaca

    Alpaca

    Stanford Center for Research on Foundation Models (CRFM)

    Instruction-following models such as GPT-3.5 (text-DaVinci-003), ChatGPT, Claude, and Bing Chat have become increasingly powerful. Many users now interact with these models regularly and even use them for work. However, despite their widespread deployment, instruction-following models still have many deficiencies: they can generate false information, propagate social stereotypes, and produce toxic language. To make maximum progress on addressing these pressing problems, it is important for the academic community to engage. Unfortunately, doing research on instruction-following models in academia has been difficult, as there is no easily accessible model that comes close in capabilities to closed-source models such as OpenAI’s text-DaVinci-003. We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta’s LLaMA 7B model.
  • 43
    Chinchilla

    Chinchilla

    Google DeepMind

    Chinchilla is a large language model. Chinchilla uses the same compute budget as Gopher but with 70B parameters and 4× more more data. Chinchilla uniformly and significantly outperforms Gopher (280B), GPT-3 (175B), Jurassic-1 (178B), and Megatron-Turing NLG (530B) on a large range of downstream evaluation tasks. This also means that Chinchilla uses substantially less compute for fine-tuning and inference, greatly facilitating downstream usage. As a highlight, Chinchilla reaches a state-of-the-art average accuracy of 67.5% on the MMLU benchmark, greater than a 7% improvement over Gopher.
  • 44
    Cerebras-GPT
    State-of-the-art language models are extremely challenging to train; they require huge compute budgets, complex distributed compute techniques and deep ML expertise. As a result, few organizations train large language models (LLMs) from scratch. And increasingly those that have the resources and expertise are not open sourcing the results, marking a significant change from even a few months back. At Cerebras, we believe in fostering open access to the most advanced models. With this in mind, we are proud to announce the release to the open source community of Cerebras-GPT, a family of seven GPT models ranging from 111 million to 13 billion parameters. Trained using the Chinchilla formula, these models provide the highest accuracy for a given compute budget. Cerebras-GPT has faster training times, lower training costs, and consumes less energy than any publicly available model to date.
    Starting Price: Free
  • 45
    Cohere

    Cohere

    Cohere AI

    Cohere is an enterprise AI platform that enables developers and businesses to build powerful language-based applications. Specializing in large language models (LLMs), Cohere provides solutions for text generation, summarization, and semantic search. Their model offerings include the Command family for high-performance language tasks and Aya Expanse for multilingual applications across 23 languages. Focused on security and customization, Cohere allows flexible deployment across major cloud providers, private cloud environments, or on-premises setups to meet diverse enterprise needs. The company collaborates with industry leaders like Oracle and Salesforce to integrate generative AI into business applications, improving automation and customer engagement. Additionally, Cohere For AI, their research lab, advances machine learning through open-source projects and a global research community.
  • 46
    Code Llama
    Code Llama is a large language model (LLM) that can use text prompts to generate code. Code Llama is state-of-the-art for publicly available LLMs on code tasks, and has the potential to make workflows faster and more efficient for current developers and lower the barrier to entry for people who are learning to code. Code Llama has the potential to be used as a productivity and educational tool to help programmers write more robust, well-documented software. Code Llama is a state-of-the-art LLM capable of generating code, and natural language about code, from both code and natural language prompts. Code Llama is free for research and commercial use. Code Llama is built on top of Llama 2 and is available in three models: Code Llama, the foundational code model; Codel Llama - Python specialized for Python; and Code Llama - Instruct, which is fine-tuned for understanding natural language instructions.
    Starting Price: Free
  • 47
    Llama 3.3
    Llama 3.3 is the latest iteration in the Llama series of language models, developed to push the boundaries of AI-powered understanding and communication. With enhanced contextual reasoning, improved language generation, and advanced fine-tuning capabilities, Llama 3.3 is designed to deliver highly accurate, human-like responses across diverse applications. This version features a larger training dataset, refined algorithms for nuanced comprehension, and reduced biases compared to its predecessors. Llama 3.3 excels in tasks such as natural language understanding, creative writing, technical explanation, and multilingual communication, making it an indispensable tool for businesses, developers, and researchers. Its modular architecture allows for customizable deployment in specialized domains, ensuring versatility and performance at scale.
    Starting Price: Free
  • 48
    LongLLaMA

    LongLLaMA

    LongLLaMA

    This repository contains the research preview of LongLLaMA, a large language model capable of handling long contexts of 256k tokens or even more. LongLLaMA is built upon the foundation of OpenLLaMA and fine-tuned using the Focused Transformer (FoT) method. LongLLaMA code is built upon the foundation of Code Llama. We release a smaller 3B base variant (not instruction tuned) of the LongLLaMA model on a permissive license (Apache 2.0) and inference code supporting longer contexts on hugging face. Our model weights can serve as the drop-in replacement of LLaMA in existing implementations (for short context up to 2048 tokens). Additionally, we provide evaluation results and comparisons against the original OpenLLaMA models.
    Starting Price: Free
  • 49
    Llama 3.2
    The open-source AI model you can fine-tune, distill and deploy anywhere is now available in more versions. Choose from 1B, 3B, 11B or 90B, or continue building with Llama 3.1. Llama 3.2 is a collection of large language models (LLMs) pretrained and fine-tuned in 1B and 3B sizes that are multilingual text only, and 11B and 90B sizes that take both text and image inputs and output text. Develop highly performative and efficient applications from our latest release. Use our 1B or 3B models for on device applications such as summarizing a discussion from your phone or calling on-device tools like calendar. Use our 11B or 90B models for image use cases such as transforming an existing image into something new or getting more information from an image of your surroundings.
    Starting Price: Free
  • 50
    Llama 4 Behemoth
    Llama 4 Behemoth is Meta's most powerful AI model to date, featuring a massive 288 billion active parameters. It excels in multimodal tasks, outperforming previous models like GPT-4.5 and Gemini 2.0 Pro across multiple STEM-focused benchmarks such as MATH-500 and GPQA Diamond. As the teacher model for the Llama 4 series, Behemoth sets the foundation for models like Llama 4 Maverick and Llama 4 Scout. While still in training, Llama 4 Behemoth demonstrates unmatched intelligence, pushing the boundaries of AI in fields like math, multilinguality, and image understanding.
    Starting Price: Free