Alternatives to Llama Guard
Compare Llama Guard alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to Llama Guard in 2026. Compare features, ratings, user reviews, pricing, and more from Llama Guard competitors and alternatives in order to make an informed decision for your business.
-
1
Llama 3
Meta
We’ve integrated Llama 3 into Meta AI, our intelligent assistant, that expands the ways people can get things done, create and connect with Meta AI. You can see first-hand the performance of Llama 3 by using Meta AI for coding tasks and problem solving. Whether you're developing agents, or other AI-powered applications, Llama 3 in both 8B and 70B will offer the capabilities and flexibility you need to develop your ideas. With the release of Llama 3, we’ve updated the Responsible Use Guide (RUG) to provide the most comprehensive information on responsible development with LLMs. Our system-centric approach includes updates to our trust and safety tools with Llama Guard 2, optimized to support the newly announced taxonomy published by MLCommons expanding its coverage to a more comprehensive set of safety categories, code shield, and Cybersec Eval 2.Starting Price: Free -
2
OpenPipe
OpenPipe
OpenPipe provides fine-tuning for developers. Keep your datasets, models, and evaluations all in one place. Train new models with the click of a button. Automatically record LLM requests and responses. Create datasets from your captured data. Train multiple base models on the same dataset. We serve your model on our managed endpoints that scale to millions of requests. Write evaluations and compare model outputs side by side. Change a couple of lines of code, and you're good to go. Simply replace your Python or Javascript OpenAI SDK and add an OpenPipe API key. Make your data searchable with custom tags. Small specialized models cost much less to run than large multipurpose LLMs. Replace prompts with models in minutes, not weeks. Fine-tuned Mistral and Llama 2 models consistently outperform GPT-4-1106-Turbo, at a fraction of the cost. We're open-source, and so are many of the base models we use. Own your own weights when you fine-tune Mistral and Llama 2, and download them at any time.Starting Price: $1.20 per 1M tokens -
3
OpenLLaMA
OpenLLaMA
OpenLLaMA is a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset. Our model weights can serve as the drop in replacement of LLaMA 7B in existing implementations. We also provide a smaller 3B variant of LLaMA model.Starting Price: Free -
4
Llama Stack
Meta
Llama Stack is a modular framework designed to streamline the development of applications powered by Meta's Llama language models. It offers a client-server architecture with flexible configurations, allowing developers to mix and match various providers for components such as inference, memory, agents, telemetry, and evaluations. The framework includes pre-configured distributions tailored for different deployment scenarios, enabling seamless transitions from local development to production environments. Developers can interact with the Llama Stack server using client SDKs available in multiple programming languages, including Python, Node.js, Swift, and Kotlin. Comprehensive documentation and example applications are provided to assist users in building and deploying Llama-based applications efficiently.Starting Price: Free -
5
Llama 2
Meta
The next generation of our open source large language model. This release includes model weights and starting code for pretrained and fine-tuned Llama language models — ranging from 7B to 70B parameters. Llama 2 pretrained models are trained on 2 trillion tokens, and have double the context length than Llama 1. Its fine-tuned models have been trained on over 1 million human annotations. Llama 2 outperforms other open source language models on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests. Llama 2 was pretrained on publicly available online data sources. The fine-tuned model, Llama-2-chat, leverages publicly available instruction datasets and over 1 million human annotations. We have a broad range of supporters around the world who believe in our open approach to today’s AI — companies that have given early feedback and are excited to build with Llama 2.Starting Price: Free -
6
Alpaca
Stanford Center for Research on Foundation Models (CRFM)
Instruction-following models such as GPT-3.5 (text-DaVinci-003), ChatGPT, Claude, and Bing Chat have become increasingly powerful. Many users now interact with these models regularly and even use them for work. However, despite their widespread deployment, instruction-following models still have many deficiencies: they can generate false information, propagate social stereotypes, and produce toxic language. To make maximum progress on addressing these pressing problems, it is important for the academic community to engage. Unfortunately, doing research on instruction-following models in academia has been difficult, as there is no easily accessible model that comes close in capabilities to closed-source models such as OpenAI’s text-DaVinci-003. We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta’s LLaMA 7B model. -
7
PygmalionAI
PygmalionAI
PygmalionAI is a community dedicated to creating open-source projects based on EleutherAI's GPT-J 6B and Meta's LLaMA models. In simple terms, Pygmalion makes AI fine-tuned for chatting and roleplaying purposes. The current actively supported Pygmalion AI model is the 7B variant, based on Meta AI's LLaMA model. With only 18GB (or less) VRAM required, Pygmalion offers better chat capability than much larger language models with relatively minimal resources. Our curated dataset of high-quality roleplaying data ensures that your bot will be the optimal RP partner. Both the model weights and the code used to train it are completely open-source, and you can modify/re-distribute it for whatever purpose you want. Language models, including Pygmalion, generally run on GPUs since they need access to fast memory and massive processing power in order to output coherent text at an acceptable speed.Starting Price: Free -
8
Llama 4 Behemoth
Meta
Llama 4 Behemoth is Meta's most powerful AI model to date, featuring a massive 288 billion active parameters. It excels in multimodal tasks, outperforming previous models like GPT-4.5 and Gemini 2.0 Pro across multiple STEM-focused benchmarks such as MATH-500 and GPQA Diamond. As the teacher model for the Llama 4 series, Behemoth sets the foundation for models like Llama 4 Maverick and Llama 4 Scout. While still in training, Llama 4 Behemoth demonstrates unmatched intelligence, pushing the boundaries of AI in fields like math, multilinguality, and image understanding.Starting Price: Free -
9
Vicuna
lmsys.org
Vicuna-13B is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieves more than 90%* quality of OpenAI ChatGPT and Google Bard while outperforming other models like LLaMA and Stanford Alpaca in more than 90%* of cases. The cost of training Vicuna-13B is around $300. The code and weights, along with an online demo, are publicly available for non-commercial use.Starting Price: Free -
10
LlamaIndex
LlamaIndex
LlamaIndex is a “data framework” to help you build LLM apps. Connect semi-structured data from API's like Slack, Salesforce, Notion, etc. LlamaIndex is a simple, flexible data framework for connecting custom data sources to large language models. LlamaIndex provides the key tools to augment your LLM applications with data. Connect your existing data sources and data formats (API's, PDF's, documents, SQL, etc.) to use with a large language model application. Store and index your data for different use cases. Integrate with downstream vector store and database providers. LlamaIndex provides a query interface that accepts any input prompt over your data and returns a knowledge-augmented response. Connect unstructured sources such as documents, raw text files, PDF's, videos, images, etc. Easily integrate structured data sources from Excel, SQL, etc. Provides ways to structure your data (indices, graphs) so that this data can be easily used with LLMs. -
11
Defense Llama
Scale AI
Scale AI is proud to announce Defense Llama, the Large Language Model (LLM) built on Meta’s Llama 3 that is specifically customized and fine-tuned to support American national security missions. Defense Llama, available exclusively in controlled U.S. government environments within Scale Donovan, empowers our service members and national security professionals to apply the power of generative AI to their unique use cases, such as planning military or intelligence operations and understanding adversary vulnerabilities. Defense Llama was trained on a vast dataset, including military doctrine, international humanitarian law, and relevant policies designed to align with the Department of Defense (DoD) guidelines for armed conflict as well as the DoD’s Ethical Principles for Artificial Intelligence. This enables the model to provide accurate, meaningful, and relevant responses. Scale is proud to enable U.S. national security personnel to use generative AI safely and securely for defense. -
12
Llama
Meta
Llama (Large Language Model Meta AI) is a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI. Smaller, more performant models such as Llama enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field. Training smaller foundation models like Llama is desirable in the large language model space because it requires far less computing power and resources to test new approaches, validate others’ work, and explore new use cases. Foundation models train on a large set of unlabeled data, which makes them ideal for fine-tuning for a variety of tasks. We are making Llama available at several sizes (7B, 13B, 33B, and 65B parameters) and also sharing a Llama model card that details how we built the model in keeping with our approach to Responsible AI practices. -
13
Code Llama
Meta
Code Llama is a large language model (LLM) that can use text prompts to generate code. Code Llama is state-of-the-art for publicly available LLMs on code tasks, and has the potential to make workflows faster and more efficient for current developers and lower the barrier to entry for people who are learning to code. Code Llama has the potential to be used as a productivity and educational tool to help programmers write more robust, well-documented software. Code Llama is a state-of-the-art LLM capable of generating code, and natural language about code, from both code and natural language prompts. Code Llama is free for research and commercial use. Code Llama is built on top of Llama 2 and is available in three models: Code Llama, the foundational code model; Codel Llama - Python specialized for Python; and Code Llama - Instruct, which is fine-tuned for understanding natural language instructions.Starting Price: Free -
14
DeepEval
Confident AI
DeepEval is a simple-to-use, open source LLM evaluation framework, for evaluating and testing large-language model systems. It is similar to Pytest but specialized for unit testing LLM outputs. DeepEval incorporates the latest research to evaluate LLM outputs based on metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., which uses LLMs and various other NLP models that run locally on your machine for evaluation. Whether your application is implemented via RAG or fine-tuning, LangChain, or LlamaIndex, DeepEval has you covered. With it, you can easily determine the optimal hyperparameters to improve your RAG pipeline, prevent prompt drifting, or even transition from OpenAI to hosting your own Llama2 with confidence. The framework supports synthetic dataset generation with advanced evolution techniques and integrates seamlessly with popular frameworks, allowing for efficient benchmarking and optimization of LLM systems.Starting Price: Free -
15
Llama 3.1
Meta
The open source AI model you can fine-tune, distill and deploy anywhere. Our latest instruction-tuned model is available in 8B, 70B and 405B versions. Using our open ecosystem, build faster with a selection of differentiated product offerings to support your use cases. Choose from real-time inference or batch inference services. Download model weights to further optimize cost per token. Adapt for your application, improve with synthetic data and deploy on-prem or in the cloud. Use Llama system components and extend the model using zero shot tool use and RAG to build agentic behaviors. Leverage 405B high quality data to improve specialized models for specific use cases.Starting Price: Free -
16
Llama 4 Maverick
Meta
Llama 4 Maverick is one of the most advanced multimodal AI models from Meta, featuring 17 billion active parameters and 128 experts. It surpasses its competitors like GPT-4o and Gemini 2.0 Flash in a broad range of benchmarks, especially in tasks related to coding, reasoning, and multilingual capabilities. Llama 4 Maverick combines image and text understanding, enabling it to deliver industry-leading results in image-grounding tasks and precise, high-quality output. With its efficient performance at a reduced parameter size, Maverick offers exceptional value, especially in general assistant and chat applications.Starting Price: Free -
17
NVIDIA NeMo Guardrails
NVIDIA
NVIDIA NeMo Guardrails is an open-source toolkit designed to enhance the safety, security, and compliance of large language model-based conversational applications. It enables developers to define, orchestrate, and enforce multiple AI guardrails, ensuring that generative AI interactions remain accurate, appropriate, and on-topic. The toolkit leverages Colang, a specialized language for designing flexible dialogue flows, and integrates seamlessly with popular AI development frameworks like LangChain and LlamaIndex. NeMo Guardrails offers features such as content safety, topic control, personal identifiable information detection, retrieval-augmented generation enforcement, and jailbreak prevention. Additionally, the recently introduced NeMo Guardrails microservice simplifies rail orchestration with API-based interaction and tools for enhanced guardrail management and maintenance. -
18
Dify
Dify
Dify is an open-source platform designed to streamline the development and operation of generative AI applications. It offers a comprehensive suite of tools, including an intuitive orchestration studio for visual workflow design, a Prompt IDE for prompt testing and refinement, and enterprise-level LLMOps capabilities for monitoring and optimizing large language models. Dify supports integration with various LLMs, such as OpenAI's GPT series and open-source models like Llama, providing flexibility for developers to select models that best fit their needs. Additionally, its Backend-as-a-Service (BaaS) features enable seamless incorporation of AI functionalities into existing enterprise systems, facilitating the creation of AI-powered chatbots, document summarization tools, and virtual assistants. -
19
TinyLlama
TinyLlama
The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. With some proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs. We adopted exactly the same architecture and tokenizer as Llama 2. This means TinyLlama can be plugged and played in many open-source projects built upon Llama. Besides, TinyLlama is compact with only 1.1B parameters. This compactness allows it to cater to a multitude of applications demanding a restricted computation and memory footprint.Starting Price: Free -
20
Tülu 3
Ai2
Tülu 3 is an advanced instruction-following language model developed by the Allen Institute for AI (Ai2), designed to enhance capabilities in areas such as knowledge, reasoning, mathematics, coding, and safety. Built upon the Llama 3 Base, Tülu 3 employs a comprehensive four-stage post-training process: meticulous prompt curation and synthesis, supervised fine-tuning on a diverse set of prompts and completions, preference tuning using both off- and on-policy data, and a novel reinforcement learning approach to bolster specific skills with verifiable rewards. This open-source model distinguishes itself by providing full transparency, including access to training data, code, and evaluation tools, thereby closing the performance gap between open and proprietary fine-tuning methods. Evaluations indicate that Tülu 3 outperforms other open-weight models of similar size, such as Llama 3.1-Instruct and Qwen2.5-Instruct, across various benchmarks.Starting Price: Free -
21
Solar Mini
Upstage AI
Solar Mini is a pre‑trained large language model that delivers GPT‑3.5‑comparable responses with 2.5× faster inference while staying under 30 billion parameters. It achieved first place on the Hugging Face Open LLM Leaderboard in December 2023 by combining a 32‑layer Llama 2 architecture, initialized with high‑quality Mistral 7B weights, with an innovative “depth up‑scaling” (DUS) approach that deepens the model efficiently without adding complex modules. After DUS, continued pretraining restores and enhances performance, and instruction tuning in a QA format, especially for Korean, refines its ability to follow user prompts, while alignment tuning ensures its outputs meet human or advanced AI preferences. Solar Mini outperforms competitors such as Llama 2, Mistral 7B, Ko‑Alpaca, and KULLM across a variety of benchmarks, proving that compact size need not sacrifice capability.Starting Price: $0.1 per 1M tokens -
22
Falcon 2
Technology Innovation Institute (TII)
Falcon 2 11B is an open-source, multilingual, and multimodal AI model, uniquely equipped with vision-to-language capabilities. It surpasses Meta’s Llama 3 8B and delivers performance on par with Google’s Gemma 7B, as independently confirmed by the Hugging Face Leaderboard. Looking ahead, the next phase of development will integrate a 'Mixture of Experts' approach to further enhance Falcon 2’s capabilities, pushing the boundaries of AI innovation.Starting Price: Free -
23
kluster.ai
kluster.ai
Kluster.ai is a developer-centric AI cloud platform designed to deploy, scale, and fine-tune large language models (LLMs) with speed and efficiency. Built for developers by developers, it offers Adaptive Inference, a flexible and scalable service that adjusts seamlessly to workload demands, ensuring high-performance processing and consistent turnaround times. Adaptive Inference provides three distinct processing options: real-time inference for ultra-low latency needs, asynchronous inference for cost-effective handling of flexible timing tasks, and batch inference for efficient processing of high-volume, bulk tasks. It supports a range of open-weight, cutting-edge multimodal models for chat, vision, code, and more, including Meta's Llama 4 Maverick and Scout, Qwen3-235B-A22B, DeepSeek-R1, and Gemma 3 . Kluster.ai's OpenAI-compatible API allows developers to integrate these models into their applications seamlessly.Starting Price: $0.15per input -
24
Llama 3.3
Meta
Llama 3.3 is the latest iteration in the Llama series of language models, developed to push the boundaries of AI-powered understanding and communication. With enhanced contextual reasoning, improved language generation, and advanced fine-tuning capabilities, Llama 3.3 is designed to deliver highly accurate, human-like responses across diverse applications. This version features a larger training dataset, refined algorithms for nuanced comprehension, and reduced biases compared to its predecessors. Llama 3.3 excels in tasks such as natural language understanding, creative writing, technical explanation, and multilingual communication, making it an indispensable tool for businesses, developers, and researchers. Its modular architecture allows for customizable deployment in specialized domains, ensuring versatility and performance at scale.Starting Price: Free -
25
Featherless
Featherless
Featherless is an AI model provider that offers our subscribers access to a continually expanding library of Hugging Face models. With hundreds of new models daily, you need dedicated tools to keep up with the hype. No matter your use case, find and use the state-of-the-art AI model with Featherless. At present, we support LLaMA-3-based models, including LLaMA-3 and QWEN-2. Note that QWEN-2 models are only supported up to 16,000 context length. We plan to add more architectures to our supported list soon. We continuously onboard new models as they become available on Hugging Face. As we grow, we aim to automate this process to encompass all publicly available Hugging Face models with compatible architecture. To ensure fair individual account use, concurrent requests are limited according to the plan you've selected. Output is delivered at a speed of 10-40 tokens per second, depending on the model and prompt size.Starting Price: $10 per month -
26
Hermes 3
Nous Research
Experiment, and push the boundaries of individual alignment, artificial consciousness, open-source software, and decentralization, in ways that monolithic companies and governments are too afraid to try. Hermes 3 contains advanced long-term context retention and multi-turn conversation capability, complex roleplaying and internal monologue abilities, and enhanced agentic function-calling. Our training data aggressively encourages the model to follow the system and instruction prompts exactly and in an adaptive manner. Hermes 3 was created by fine-tuning Llama 3.1 8B, 70B, and 405B, and training on a dataset of primarily synthetically generated responses. The model boasts comparable and superior performance to Llama 3.1 while unlocking deeper capabilities in reasoning and creativity. Hermes 3 is a series of instruct and tool-use models with strong reasoning and creative abilities.Starting Price: Free -
27
Sup AI
Sup AI
Sup AI is a multi-LLM platform that merges outputs from several top large language models, such as GPT, Claude, Llama, and more, to generate richer, more accurate, and better-validated answers than any single model could provide. It applies real-time “logprob confidence scoring,” analyzing each token’s probability to detect uncertainty or hallucination; when a model’s confidence falls below a threshold, the response is halted, helping ensure that delivered answers remain high-quality and trustworthy. Sup’s “multi-model fusion” then compares, contrasts, and consolidates outputs from different models, cross-verifying and synthesizing the best parts into a final result. Sup also supports “multimodal RAG” (retrieval-augmented generation) to incorporate external data (text, PDFs, images) into context-aware responses, giving the AI access to factual sources and helping it “never forget” relevant information.Starting Price: $20 per month -
28
Falcon Mamba 7B
Technology Innovation Institute (TII)
Falcon Mamba 7B is the first open-source State Space Language Model (SSLM), introducing a groundbreaking architecture for Falcon models. Recognized as the top-performing open-source SSLM worldwide by Hugging Face, it sets a new benchmark in AI efficiency. Unlike traditional transformers, SSLMs operate with minimal memory requirements and can generate extended text sequences without additional overhead. Falcon Mamba 7B surpasses leading transformer-based models, including Meta’s Llama 3.1 8B and Mistral’s 7B, showcasing superior performance. This innovation underscores Abu Dhabi’s commitment to advancing AI research and development on a global scale.Starting Price: Free -
29
LongLLaMA
LongLLaMA
This repository contains the research preview of LongLLaMA, a large language model capable of handling long contexts of 256k tokens or even more. LongLLaMA is built upon the foundation of OpenLLaMA and fine-tuned using the Focused Transformer (FoT) method. LongLLaMA code is built upon the foundation of Code Llama. We release a smaller 3B base variant (not instruction tuned) of the LongLLaMA model on a permissive license (Apache 2.0) and inference code supporting longer contexts on hugging face. Our model weights can serve as the drop-in replacement of LLaMA in existing implementations (for short context up to 2048 tokens). Additionally, we provide evaluation results and comparisons against the original OpenLLaMA models.Starting Price: Free -
30
NVIDIA Llama Nemotron
NVIDIA
NVIDIA Llama Nemotron is a family of advanced language models optimized for reasoning and a diverse set of agentic AI tasks. These models excel in graduate-level scientific reasoning, advanced mathematics, coding, instruction following, and tool calls. Designed for deployment across various platforms, from data centers to PCs, they offer the flexibility to toggle reasoning capabilities on or off, reducing inference costs when deep reasoning isn't required. The Llama Nemotron family includes models tailored for different deployment needs. Built upon Llama models and enhanced by NVIDIA through post-training, these models demonstrate improved accuracy, up to 20% over base models, and optimized inference speeds, achieving up to five times the performance of other leading open reasoning models. This efficiency enables handling more complex reasoning tasks, enhances decision-making capabilities, and reduces operational costs for enterprises. -
31
DefiLlama
DefiLlama
DefiLlama is committed to accurate data without ads or sponsored content and transparency. We list DeFi projects from all chains. The majority of adapters on DefiLlama are contributed and maintained by their respective communities, with all changes being coordinated through the DefiLlama/DefiLlama-Adapters github repo. Collects data on a protocol by calling some endpoints or making some blockchain calls. Computes the TVL of a protocol and returns it. Right now our SDK only supports EVM chains, so if your project is in any of these chains you should develop a SDK-based adapter, while if your project is on another chain a fetch adapter is the way to go. The adapter is just a function that takes a timestamp and block-height (on Ethereum) and returns the balances of tokens locked in your protocol's smart contracts at that point in time. -
32
ChainForge
ChainForge
ChainForge is an open-source visual programming environment designed for prompt engineering and large language model evaluation. It enables users to assess the robustness of prompts and text-generation models beyond anecdotal evidence. Simultaneously test prompt ideas and variations across multiple LLMs to identify the most effective combinations. Evaluate response quality across different prompts, models, and settings to select the optimal configuration for specific use cases. Set up evaluation metrics and visualize results across prompts, parameters, models, and settings, facilitating data-driven decision-making. Manage multiple conversations simultaneously, template follow-up messages, and inspect outputs at each turn to refine interactions. ChainForge supports various model providers, including OpenAI, HuggingFace, Anthropic, Google PaLM2, Azure OpenAI endpoints, and locally hosted models like Alpaca and Llama. Users can adjust model settings and utilize visualization nodes. -
33
Mixtral 8x7B
Mistral AI
Mixtral 8x7B is a high-quality sparse mixture of experts model (SMoE) with open weights. Licensed under Apache 2.0. Mixtral outperforms Llama 2 70B on most benchmarks with 6x faster inference. It is the strongest open-weight model with a permissive license and the best model overall regarding cost/performance trade-offs. In particular, it matches or outperforms GPT-3.5 on most standard benchmarks.Starting Price: Free -
34
LFM2.5
Liquid AI
Liquid AI’s LFM2.5 is the next generation of on-device AI foundation models designed to deliver high-performance, efficient AI inference on edge devices such as phones, laptops, vehicles, IoT systems, and embedded hardware without relying on cloud compute. It extends the previous LFM2 architecture by significantly increasing the pretraining scale and reinforcement learning stages, yielding a family of hybrid models around 1.2 billion parameters that balance instruction following, reasoning, and multimodal capabilities for real-world agentic use cases. The LFM2.5 family includes Base (for fine-tuning and customization), Instruct (general-purpose instruction-tuned), Japanese-optimized, Vision-Language, and Audio-Language variants, all optimized for fast, on-device inference under tight memory constraints and available as open-weight models deployable via frameworks like llama.cpp, MLX, vLLM, and ONNX.Starting Price: Free -
35
Overseer AI
Overseer AI
Overseer AI is a platform designed to ensure AI-generated content is safe, accurate, and aligned with user-defined policies. It offers compliance enforcement by automating adherence to regulatory standards through custom policy rules, real-time content moderation to block harmful, toxic, or biased outputs from AI, debugging AI outputs by testing and monitoring responses against custom safety policies, policy-driven AI governance by applying centralized safety rules across all AI interactions, and trust-building for AI by guaranteeing safe, accurate, and brand-compliant outputs. The platform caters to various industries, including healthcare, finance, legal technology, customer support, education technology, and ecommerce & retail, providing tailored solutions to ensure AI responses align with industry-specific regulations and standards. Developers can access comprehensive guides and API references to integrate Overseer AI into their applications.Starting Price: $99 per month -
36
Oumi
Oumi
Oumi is a fully open source platform that streamlines the entire lifecycle of foundation models, from data preparation and training to evaluation and deployment. It supports training and fine-tuning models ranging from 10 million to 405 billion parameters using state-of-the-art techniques such as SFT, LoRA, QLoRA, and DPO. The platform accommodates both text and multimodal models, including architectures like Llama, DeepSeek, Qwen, and Phi. Oumi offers tools for data synthesis and curation, enabling users to generate and manage training datasets effectively. For deployment, it integrates with popular inference engines like vLLM and SGLang, ensuring efficient model serving. The platform also provides comprehensive evaluation capabilities across standard benchmarks to assess model performance. Designed for flexibility, Oumi can run on various environments, from local laptops to cloud infrastructures such as AWS, Azure, GCP, and Lambda.Starting Price: Free -
37
Llama 3.2
Meta
The open-source AI model you can fine-tune, distill and deploy anywhere is now available in more versions. Choose from 1B, 3B, 11B or 90B, or continue building with Llama 3.1. Llama 3.2 is a collection of large language models (LLMs) pretrained and fine-tuned in 1B and 3B sizes that are multilingual text only, and 11B and 90B sizes that take both text and image inputs and output text. Develop highly performative and efficient applications from our latest release. Use our 1B or 3B models for on device applications such as summarizing a discussion from your phone or calling on-device tools like calendar. Use our 11B or 90B models for image use cases such as transforming an existing image into something new or getting more information from an image of your surroundings.Starting Price: Free -
38
HumanLayer
HumanLayer
HumanLayer is an API and SDK that enables AI agents to contact humans for feedback, input, and approvals. It guarantees human oversight of high-stakes function calls with approval workflows across Slack, email, and more. By integrating with your preferred Large Language Model (LLM) and framework, HumanLayer empowers AI agents with safe access to the world. The platform supports various frameworks and LLMs, including LangChain, CrewAI, ControlFlow, LlamaIndex, Haystack, OpenAI, Claude, Llama3.1, Mistral, Gemini, and Cohere. HumanLayer offers features such as approval workflows, human-as-tool integration, and custom responses with escalations. Pre-fill response prompts for seamless human-agent interactions. Route to specific individuals or teams, and control which users can approve or respond to LLM requests. Invert the flow of control, from human-initiated to agent-initiated. Add a variety of human contact channels to your agent toolchain.Starting Price: $500 per month -
39
EXAONE Deep
LG
EXAONE Deep is a series of reasoning-enhanced language models developed by LG AI Research, featuring parameter sizes of 2.4 billion, 7.8 billion, and 32 billion. These models demonstrate superior capabilities in various reasoning tasks, including math and coding benchmarks. Notably, EXAONE Deep 2.4B outperforms other models of comparable size, EXAONE Deep 7.8B surpasses both open-weight models of similar scale and the proprietary reasoning model OpenAI o1-mini, and EXAONE Deep 32B shows competitive performance against leading open-weight models. The repository provides comprehensive documentation covering performance evaluations, quickstart guides for using EXAONE Deep models with Transformers, explanations of quantized EXAONE Deep weights in AWQ and GGUF formats, and instructions for running EXAONE Deep models locally using frameworks like llama.cpp and Ollama.Starting Price: Free -
40
Chainlit
Chainlit
Chainlit is an open-source Python package designed to expedite the development of production-ready conversational AI applications. With Chainlit, developers can build and deploy chat-based interfaces in minutes, not weeks. The platform offers seamless integration with popular AI tools and frameworks, including OpenAI, LangChain, and LlamaIndex, allowing for versatile application development. Key features of Chainlit include multimodal capabilities, enabling the processing of images, PDFs, and other media types to enhance productivity. It also provides robust authentication options, supporting integration with providers like Okta, Azure AD, and Google. The Prompt Playground feature allows developers to iterate on prompts in context, adjusting templates, variables, and LLM settings for optimal results. For observability, Chainlit offers real-time visualization of prompts, completions, and usage metrics, ensuring efficient and trustworthy LLM operations. -
41
Property Llama
Property Llama
Property Llama transforms real estate portfolio management, replacing cumbersome spreadsheets with an intuitive, user-friendly app. Created by real estate investors FOR real estate investors, it streamlines portfolio management with advanced financial modeling and personalized insights. With Property Llama, managing your real estate investments has never been easier or more efficient!Starting Price: Free -
42
Llama 4 Scout
Meta
Llama 4 Scout is a powerful 17 billion active parameter multimodal AI model that excels in both text and image processing. With an industry-leading context length of 10 million tokens, it outperforms its predecessors, including Llama 3, in tasks such as multi-document summarization and parsing large codebases. Llama 4 Scout is designed to handle complex reasoning tasks while maintaining high efficiency, making it perfect for use cases requiring long-context comprehension and image grounding. It offers cutting-edge performance in image-related tasks and is particularly well-suited for applications requiring both text and visual understanding.Starting Price: Free -
43
Bodyguard
Bodyguard
Bodyguard protects your online communities and platforms against toxic content, cyberharassment and hate speech. Leverage the power of positive interactions, shield your communities from the negative ones. Different toxic content categories and severity levels. Contextual analysis. Internet language “decoding”. From a few blog comments, to thousands of social media comments, all the way up to live streaming. Powerful data bank to inform content decisions and uncover new ways to engage with your followers. Choose what toxic content categories you want to moderate. Safe platforms are 3x more likely to retain users and to attract new users towards the community. A lack of toxic content means visitors will spend around 60% more time on your platforms. Protect your brand image, users and employees. Don’t attach your business or brand to toxic content. Smooth, quick integration via API. Works with any platform. Pricing based on your needs. -
44
Stable Beluga
Stability AI
Stability AI and its CarperAI lab proudly announce Stable Beluga 1 and its successor Stable Beluga 2 (formerly codenamed FreeWilly), two powerful new, open access, Large Language Models (LLMs). Both models demonstrate exceptional reasoning ability across varied benchmarks. Stable Beluga 1 leverages the original LLaMA 65B foundation model and was carefully fine-tuned with a new synthetically-generated dataset using Supervised Fine-Tune (SFT) in standard Alpaca format. Similarly, Stable Beluga 2 leverages the LLaMA 2 70B foundation model to achieve industry-leading performance.Starting Price: Free -
45
Discuro
Discuro
Discuro is the all-in-one platform for developers looking to easily build, test & consume complex AI workflows. Define your workflow in our easy-to-use UI, and when you're ready to execute, simply make one API call to us, with your inputs, any meta-data, and we'll do the rest. Use an Orchestrator to feed generated data back into GPT-3. Reliably integrate with OpenAI and extract the data you need with ease. Create & consume your own flows in minutes. We've built everything you need to integrate with OpenAI, at scale, so you can focus on the product. The first challenge in integrating with OpenAI is extracting the data you need, we'll handle this for you by collecting input/output definitions. Easily chain completions together to build large data sets. Use our iterative input feature to feed GPT-3 output back in, and have us make consecutive calls to expand your data set, and much more. Easily build & test complex self-transforming AI workflows & datasets.Starting Price: $34 per month -
46
Alumnium
Alumnium
Alumnium is an open source AI-powered test automation tool that bridges the gap between human and automated testing by translating plain-language test instructions into executable browser commands. It integrates seamlessly with popular web automation tools like Selenium and Playwright, allowing software and test engineers to accelerate browser test creation without sacrificing precision or control. Alumnium supports any Python test framework and leverages large language models (LLMs) from providers such as Anthropic, Google Gemini, OpenAI, and Meta Llama to interpret instructions and generate browser interactions. Users can write test cases using simple commands: do to describe steps, check to verify results, and get to extract data from the page. Alumnium utilizes the web page's accessibility tree and, if needed, screenshots to execute tests, ensuring compatibility with various web applications.Starting Price: Free -
47
AI-FLOW
AI-Flow
AI-FLOW is an innovative open-source platform designed to simplify how creators and innovators harness the power of artificial intelligence. With its user-friendly drag-and-drop interface, AI-FLOW enables you to effortlessly connect and combine leading AI models, crafting custom AI tools tailored to your unique needs. Key Features: 1. Diverse AI Model Integration: Gain access to a suite of top-tier AI models, including GPT-4, DALL-E 3, Stable Diffusion, Mistral, LLaMA, and more—all in one convenient location. 2. Drag-and-Drop Interface: Build complex AI workflows with ease—no coding required—thanks to our intuitive design. 3. Custom AI Tool Creation: Design bespoke AI solutions quickly, from image generation to language processing. 4. Local Data Storage: Maintain full control over your data with options for local storage and the ability to export as JSON files.Starting Price: $9/500 credits -
48
fullmoon
fullmoon
Fullmoon is a free, open source application that enables users to interact with large language models directly on their devices, ensuring privacy and offline accessibility. Optimized for Apple silicon, it operates seamlessly across iOS, iPadOS, macOS, and visionOS platforms. Users can personalize the app by adjusting themes, fonts, and system prompts, and it integrates with Apple's Shortcuts for enhanced functionality. Fullmoon supports models like Llama-3.2-1B-Instruct-4bit and Llama-3.2-3B-Instruct-4bit, facilitating efficient on-device AI interactions without the need for an internet connection.Starting Price: Free -
49
WebLLM
WebLLM
WebLLM is a high-performance, in-browser language model inference engine that leverages WebGPU for hardware acceleration, enabling powerful LLM operations directly within web browsers without server-side processing. It offers full OpenAI API compatibility, allowing seamless integration with functionalities such as JSON mode, function-calling, and streaming. WebLLM natively supports a range of models, including Llama, Phi, Gemma, RedPajama, Mistral, and Qwen, making it versatile for various AI tasks. Users can easily integrate and deploy custom models in MLC format, adapting WebLLM to specific needs and scenarios. The platform facilitates plug-and-play integration through package managers like NPM and Yarn, or directly via CDN, complemented by comprehensive examples and a modular design for connecting with UI components. It supports streaming chat completions for real-time output generation, enhancing interactive applications like chatbots and virtual assistants.Starting Price: Free -
50
Supernovas AI LLM
Supernovas AI LLM
Supernovas AI is a unified, team‑focused AI workspace that provides seamless access to all leading LLMs—including GPT‑4.1/4.5 Turbo, Claude Haiku/Sonnet/Opus, Gemini 2.5 Pro/Pro, Azure OpenAI, AWS Bedrock, Mistral, Meta LLaMA, Deepseek, Qwen, and more—through a single, secure interface. It offers essential chat tools like model access, prompt templates, bookmarks, static artifacts, and integrated web search, along with advanced features such as Model Context Protocol (MCP), a talk-to-your data knowledge base, built-in image generation and editing, memory‑enabled agents, and code execution. Supernovas AI simplifies AI tool management by eliminating multiple subscriptions and API keys, enabling fast onboarding and enterprise-grade privacy and collaboration—all from one streamlined platform.Starting Price: $19/month