Compare the Top AI Models as of August 2025 - Page 9

AI Models Clear Filters
  • 1
    Gemma

    Gemma

    Google

    Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is inspired by Gemini, and the name reflects the Latin gemma, meaning “precious stone.” Accompanying our model weights, we’re also releasing tools to support developer innovation, foster collaboration, and guide the responsible use of Gemma models. Gemma models share technical and infrastructure components with Gemini, our largest and most capable AI model widely available today. This enables Gemma 2B and 7B to achieve best-in-class performance for their sizes compared to other open models. And Gemma models are capable of running directly on a developer laptop or desktop computer. Notably, Gemma surpasses significantly larger models on key benchmarks while adhering to our rigorous standards for safe and responsible outputs.
  • 2
    Eternity AI

    Eternity AI

    Eternity AI

    Eternity AI is building an HTLM-7B, a machine learning model that knows what the internet is and how to access it to generate responses. Humans don't make decisions based on 2-year-old data. For a model to think like a human, it needs to get access to real-time knowledge and everything about how humans behave. Members of our team have previously published white papers and articles on topics related to on-chain vulnerability coordination, GPT database retrieval, decentralized dispute resolution, etc.
  • 3
    Adept

    Adept

    Adept

    Adept is an ML research and product lab building general intelligence by enabling humans and computers to work together creatively. Designed and trained specifically for taking actions on computers in response to your natural language commands. ACT-1 is our first step towards a foundation model that can use every software tool, API and website that exists. Adept is building an entirely new way to get things done. It takes your goals, in plain language, and turns them into actions on the software you use every day. We believe that AI systems should be built with users at the center — where machines work together with people in the driver's seat, discovering new solutions, enabling more informed decisions, and giving us more time for the work we love.
  • 4
    DBRX

    DBRX

    Databricks

    Today, we are excited to introduce DBRX, an open, general-purpose LLM created by Databricks. Across a range of standard benchmarks, DBRX sets a new state-of-the-art for established open LLMs. Moreover, it provides the open community and enterprises building their own LLMs with capabilities that were previously limited to closed model APIs; according to our measurements, it surpasses GPT-3.5, and it is competitive with Gemini 1.0 Pro. It is an especially capable code model, surpassing specialized models like CodeLLaMA-70B in programming, in addition to its strength as a general-purpose LLM. This state-of-the-art quality comes with marked improvements in training and inference performance. DBRX advances the state-of-the-art in efficiency among open models thanks to its fine-grained mixture-of-experts (MoE) architecture. Inference is up to 2x faster than LLaMA2-70B, and DBRX is about 40% of the size of Grok-1 in terms of both total and active parameter counts.
  • 5
    Claude Haiku 3
    Claude Haiku 3 is the fastest and most affordable model in its intelligence class. With state-of-the-art vision capabilities and strong performance on industry benchmarks, Haiku is a versatile solution for a wide range of enterprise applications. The model is now available alongside Sonnet and Opus in the Claude API and on claude.ai for our Claude Pro subscribers.
  • 6
    Gemma 2

    Gemma 2

    Google

    A family of state-of-the-art, light-open models created from the same research and technology that were used to create Gemini models. These models incorporate comprehensive security measures and help ensure responsible and reliable AI solutions through selected data sets and rigorous adjustments. Gemma models achieve exceptional comparative results in their 2B, 7B, 9B, and 27B sizes, even outperforming some larger open models. With Keras 3.0, enjoy seamless compatibility with JAX, TensorFlow, and PyTorch, allowing you to effortlessly choose and change frameworks based on task. Redesigned to deliver outstanding performance and unmatched efficiency, Gemma 2 is optimized for incredibly fast inference on various hardware. The Gemma family of models offers different models that are optimized for specific use cases and adapt to your needs. Gemma models are large text-to-text lightweight language models with a decoder, trained in a huge set of text data, code, and mathematical content.
  • 7
    Moshi

    Moshi

    Kyutai

    Moshi is an experimental conversational AI. Moshi thinks and speaks at the same time. Moshi can listen and talk at all time: maximum flow between you and Moshi.
    Starting Price: Free
  • 8
    Phi-3

    Phi-3

    Microsoft

    A family of powerful, small language models (SLMs) with groundbreaking performance at low cost and low latency. Maximize AI capabilities, lower resource use, and ensure cost-effective generative AI deployments across your applications. Accelerate response times in real-time interactions, autonomous systems, apps requiring low latency, and other critical scenarios. Run Phi-3 in the cloud, at the edge, or on device, resulting in greater deployment and operation flexibility. Phi-3 models were developed in accordance with Microsoft AI principles: accountability, transparency, fairness, reliability and safety, privacy and security, and inclusiveness. Operate effectively in offline environments where data privacy is paramount or connectivity is limited. Generate more coherent, accurate, and contextually relevant outputs with an expanded context window. Deploy at the edge to deliver faster responses.
  • 9
    NVIDIA Nemotron
    NVIDIA Nemotron is a family of open-source models developed by NVIDIA, designed to generate synthetic data for training large language models (LLMs) for commercial applications. The Nemotron-4 340B model, in particular, is a significant release by NVIDIA, offering developers a powerful tool to generate high-quality data and filter it based on various attributes using a reward model.
  • 10
    Jamba

    Jamba

    AI21 Labs

    Jamba is the most powerful & efficient long context model, open for builders and built for the enterprise. Jamba's latency outperforms all leading models of comparable sizes. Jamba's 256k context window is the longest openly available. Jamba's Mamba-Transformer MoE architecture is designed for cost & efficiency gains. Jamba offers key features of OOTB including function calls, JSON mode output, document objects, and citation mode. Jamba 1.5 models maintain high performance across the full length of their context window. Jamba 1.5 models achieve top scores across common quality benchmarks. Secure deployment that suits your enterprise. Seamlessly start using Jamba on our production-grade SaaS platform. The Jamba model family is available for deployment across our strategic partners. We offer VPC & on-prem deployments for enterprises that require custom solutions. For enterprises that have unique, bespoke requirements, we offer hands-on management, continuous pre-training, etc.
  • 11
    DataGemma
    DataGemma represents a pioneering effort by Google to enhance the accuracy and reliability of large language models (LLMs) when dealing with statistical and numerical data. Launched as a set of open models, DataGemma leverages Google's Data Commons, a vast repository of public statistical data—to ground its responses in real-world facts. This initiative employs two innovative approaches: Retrieval Interleaved Generation (RIG) and Retrieval Augmented Generation (RAG). The RIG method integrates real-time data checks during the generation process to ensure factual accuracy, while RAG retrieves relevant information before generating responses, thereby reducing the likelihood of AI hallucinations. By doing so, DataGemma aims to provide users with more trustworthy and factually grounded answers, marking a significant step towards mitigating the issue of misinformation in AI-generated content.
  • 12
    LFM-40B

    LFM-40B

    Liquid AI

    LFM-40B offers a new balance between model size and output quality. It leverages 12B activated parameters at use. Its performance is comparable to models larger than itself, while its MoE architecture enables higher throughput and deployment on more cost-effective hardware.
  • 13
    LFM-3B

    LFM-3B

    Liquid AI

    LFM-3B delivers incredible performance for its size. It positions itself as first place among 3B parameter transformers, hybrids, and RNN models, but also outperforms the previous generation of 7B and 13B models. It is also on par with Phi-3.5-mini on multiple benchmarks, while being 18.4% smaller. LFM-3B is the ideal choice for mobile and other edge text-based applications.
  • 14
    OLMo 2
    OLMo 2 is a family of fully open language models developed by the Allen Institute for AI (AI2), designed to provide researchers and developers with transparent access to training data, open-source code, reproducible training recipes, and comprehensive evaluations. These models are trained on up to 5 trillion tokens and are competitive with leading open-weight models like Llama 3.1 on English academic benchmarks. OLMo 2 emphasizes training stability, implementing techniques to prevent loss spikes during long training runs, and utilizes staged training interventions during late pretraining to address capability deficiencies. The models incorporate state-of-the-art post-training methodologies from AI2's Tülu 3, resulting in the creation of OLMo 2-Instruct models. An actionable evaluation framework, the Open Language Modeling Evaluation System (OLMES), was established to guide improvements through development stages, consisting of 20 evaluation benchmarks assessing core capabilities.
  • 15
    Amazon Nova
    Amazon Nova is a new generation of state-of-the-art (SOTA) foundation models (FMs) that deliver frontier intelligence and industry leading price-performance, available exclusively on Amazon Bedrock. Amazon Nova Micro, Amazon Nova Lite, and Amazon Nova Pro are understanding models that accept text, image, or video inputs and generate text output. They provide a broad selection of capability, accuracy, speed, and cost operation points. Amazon Nova Micro is a text only model that delivers the lowest latency responses at very low cost. Amazon Nova Lite is a very low-cost multimodal model that is lightning fast for processing image, video, and text inputs. Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Pro’s capabilities, coupled with its industry-leading speed and cost efficiency, makes it a compelling model for almost any task, including video summarization, Q&A, math & more.
  • 16
    Phi-4

    Phi-4

    Microsoft

    Phi-4 is a 14B parameter state-of-the-art small language model (SLM) that excels at complex reasoning in areas such as math, in addition to conventional language processing. Phi-4 is the latest member of our Phi family of small language models and demonstrates what’s possible as we continue to probe the boundaries of SLMs. Phi-4 is currently available on Azure AI Foundry under a Microsoft Research License Agreement (MSRLA) and will be available on Hugging Face. Phi-4 outperforms comparable and larger models on math related reasoning due to advancements throughout the processes, including the use of high-quality synthetic datasets, curation of high-quality organic data, and post-training innovations. Phi-4 continues to push the frontier of size vs quality.
  • 17
    FLUX1.1 Pro

    FLUX1.1 Pro

    Black Forest Labs

    The FLUX1.1 Pro from Black Forest Labs sets a new benchmark in AI-powered image generation, delivering remarkable improvements in both speed and quality. This next-gen model outperforms its predecessor, FLUX.1 Pro, by being six times faster while enhancing image fidelity, prompt accuracy, and creative diversity. Key innovations include ultra-high-resolution rendering up to 4K and a Raw Mode for more natural, organic visuals. Available via the BFL API and integrated with platforms like Replicate and Freepik, FLUX1.1 Pro is the ultimate solution for professionals seeking advanced, scalable AI-generated imagery.
    Starting Price: Free
  • 18
    Yi-Lightning

    Yi-Lightning

    Yi-Lightning

    Yi-Lightning, developed by 01.AI under the leadership of Kai-Fu Lee, represents the latest advancement in large language models with a focus on high performance and cost-efficiency. It boasts a maximum context length of 16K tokens and is priced at $0.14 per million tokens for both input and output, making it remarkably competitive. Yi-Lightning leverages an enhanced Mixture-of-Experts (MoE) architecture, incorporating fine-grained expert segmentation and advanced routing strategies, which contribute to its efficiency in training and inference. This model has excelled in various domains, achieving top rankings in categories like Chinese, math, coding, and hard prompts on the chatbot arena, where it secured the 6th position overall and 9th in style control. Its development included comprehensive pre-training, supervised fine-tuning, and reinforcement learning from human feedback, ensuring both performance and safety, with optimizations in memory usage and inference speed.
  • 19
    OpenEuroLLM

    OpenEuroLLM

    OpenEuroLLM

    OpenEuroLLM is a collaborative initiative among Europe's leading AI companies and research institutions to develop a series of open-source foundation models for transparent AI in Europe. The project emphasizes transparency by openly sharing data, documentation, training, testing code, and evaluation metrics, fostering community involvement. It ensures compliance with EU regulations, aiming to provide performant large language models that align with European standards. A key focus is on linguistic and cultural diversity, extending multilingual capabilities to encompass all EU official languages and beyond. The initiative seeks to enhance access to foundational models ready for fine-tuning across various applications, expand evaluation results in multiple languages, and increase the availability of training datasets and benchmarks. Transparency is maintained throughout the training processes by sharing tools, methodologies, and intermediate results.
  • 20
    Gemini 2.0 Flash Thinking
    Gemini 2.0 Flash Thinking is an advanced AI model developed by Google DeepMind, designed to enhance reasoning capabilities by explicitly displaying its thought processes. This transparency allows the model to tackle complex problems more effectively and provides users with clear explanations of its decision-making steps. By showcasing its internal reasoning, Gemini 2.0 Flash Thinking not only improves performance but also offers greater explainability, making it a valuable tool for applications requiring deep understanding and trust in AI-driven solutions.
  • 21
    Gemini 2.0 Flash-Lite
    Gemini 2.0 Flash-Lite is Google DeepMind's lighter AI model, designed to offer a cost-effective solution without compromising performance. As the most economical model in the Gemini 2.0 lineup, Flash-Lite is tailored for developers and businesses seeking efficient AI capabilities at a lower cost. It supports multimodal inputs and features a context window of one million tokens, making it suitable for a variety of applications. Flash-Lite is currently available in public preview, allowing users to explore its potential in enhancing their AI-driven projects.
  • 22
    Gemini 2.0 Pro
    Gemini 2.0 Pro is Google DeepMind's most advanced AI model, designed to excel in complex tasks such as coding and intricate problem-solving. Currently in its experimental phase, it features an extensive context window of two million tokens, enabling it to process and analyze vast amounts of information efficiently. A standout feature of Gemini 2.0 Pro is its seamless integration with external tools like Google Search and code execution environments, enhancing its ability to provide accurate and comprehensive responses. This model represents a significant advancement in AI capabilities, offering developers and users a powerful resource for tackling sophisticated challenges.
  • 23
    Muse

    Muse

    Microsoft

    Microsoft has unveiled Muse, a groundbreaking generative AI model designed to revolutionize gameplay ideation. Developed in collaboration with Ninja Theory, Muse is a World and Human Action Model (WHAM) trained on data from the game Bleeding Edge. This AI model possesses a comprehensive understanding of 3D game environments, including physics and player interactions, enabling it to generate consistent and diverse gameplay sequences. Muse can produce game visuals and predict controller actions, facilitating rapid prototyping and creative exploration for game developers. By analyzing over 1 billion images and actions, Muse demonstrates the potential to assist in game preservation by recreating classic titles for modern platforms. While still in the early stages, with current outputs at a resolution of 300×180 pixels, Muse represents a significant advancement in integrating AI into the game development process, aiming to enhance, not replace, human creativity.
  • 24
    PaliGemma 2
    PaliGemma 2, the next evolution in tunable vision-language models, builds upon the performant Gemma 2 models, adding the power of vision and making it easier than ever to fine-tune for exceptional performance. With PaliGemma 2, these models can see, understand, and interact with visual input, opening up a world of new possibilities. It offers scalable performance with multiple model sizes (3B, 10B, 28B parameters) and resolutions (224px, 448px, 896px). PaliGemma 2 generates detailed, contextually relevant captions for images, going beyond simple object identification to describe actions, emotions, and the overall narrative of the scene. Our research demonstrates leading performance in chemical formula recognition, music score recognition, spatial reasoning, and chest X-ray report generation, as detailed in the technical report. Upgrading to PaliGemma 2 is a breeze for existing PaliGemma users.
  • 25
    Evo 2

    Evo 2

    Arc Institute

    Evo 2 is a genomic foundation model capable of generalist prediction and design tasks across DNA, RNA, and proteins. It utilizes a frontier deep learning architecture to model biological sequences at single-nucleotide resolution, achieving near-linear scaling of compute and memory relative to context length. Trained with 40 billion parameters and a 1 megabase context length, Evo 2 processes over 9 trillion nucleotides from diverse eukaryotic and prokaryotic genomes. This extensive training enables Evo 2 to perform zero-shot function prediction across multiple biological modalities, including DNA, RNA, and proteins, and to generate novel sequences with plausible genomic architecture. The model's capabilities have been demonstrated in tasks such as designing functional CRISPR systems and predicting disease-causing mutations in human genes. Evo 2 is publicly accessible via Arc's GitHub repository and is integrated into the NVIDIA BioNeMo framework.
  • 26
    Inception Labs

    Inception Labs

    Inception Labs

    Inception Labs is pioneering the next generation of AI with diffusion-based large language models (dLLMs), a breakthrough in AI that offers 10x faster performance and 5-10x lower cost than traditional autoregressive models. Inspired by the success of diffusion models in image and video generation, Inception’s dLLMs introduce enhanced reasoning, error correction, and multimodal capabilities, allowing for more structured and accurate text generation. With applications spanning enterprise AI, research, and content generation, Inception’s approach sets a new standard for speed, efficiency, and control in AI-driven workflows.
  • 27
    Hunyuan T1

    Hunyuan T1

    Tencent

    ​​Hunyuan T1 is Tencent's deep-thinking AI model, now fully open to all users through the Tencent Yuanbao platform. This model excels in understanding multiple dimensions and potential logical relationships, making it suitable for handling complex tasks. Users can experience various AI models on the platform, including DeepSeek-R1 and Tencent Hunyuan Turbo. The official version of the Tencent Hunyuan T1 model will also be launched soon, providing external API access and other services. Built upon Tencent's Hunyuan large language model, Yuanbao excels in Chinese language understanding, logical reasoning, and task execution. It offers AI-based search, summaries, and writing capabilities, enabling users to analyze documents and engage in prompt-based interactions.
  • 28
    Bria.ai

    Bria.ai

    Bria.ai

    Bria.ai is a powerful generative AI platform that specializes in creating and editing images at scale. It provides developers and enterprises with flexible solutions for AI-driven image generation, editing, and customization. Bria.ai offers APIs, iFrames, and pre-built models that allow users to integrate image creation and editing capabilities into their applications. The platform is designed for businesses seeking to enhance their branding, create marketing content, or automate product shot editing. With fully licensed data and customizable tools, Bria.ai ensures businesses can develop scalable, copyright-safe AI solutions.
  • 29
    ERNIE X1
    ERNIE X1 is an advanced conversational AI model developed by Baidu as part of their ERNIE (Enhanced Representation through Knowledge Integration) series. Unlike previous versions, ERNIE X1 is designed to be more efficient in understanding and generating human-like responses. It incorporates cutting-edge machine learning techniques to handle complex queries, making it capable of not only processing text but also generating images and engaging in multimodal communication. ERNIE X1 is often used in natural language processing applications such as chatbots, virtual assistants, and enterprise automation, offering significant improvements in accuracy, contextual understanding, and response quality.
    Starting Price: $0.28 per 1M tokens
  • 30
    Reka Flash 3
    ​Reka Flash 3 is a 21-billion-parameter multimodal AI model developed by Reka AI, designed to excel in general chat, coding, instruction following, and function calling. It processes and reasons with text, images, video, and audio inputs, offering a compact, general-purpose solution for various applications. Trained from scratch on diverse datasets, including publicly accessible and synthetic data, Reka Flash 3 underwent instruction tuning on curated, high-quality data to optimize performance. The final training stage involved reinforcement learning using REINFORCE Leave One-Out (RLOO) with both model-based and rule-based rewards, enhancing its reasoning capabilities. With a context length of 32,000 tokens, Reka Flash 3 performs competitively with proprietary models like OpenAI's o1-mini, making it suitable for low-latency or on-device deployments. The model's full precision requires 39GB (fp16), but it can be compressed to as small as 11GB using 4-bit quantization.