Open Source C++ Large Language Models (LLM)

C++ Large Language Models (LLM)

View 344 business solutions

Browse free open source C++ Large Language Models (LLM) and projects below. Use the toggles on the left to filter open source C++ Large Language Models (LLM) by OS, license, language, programming language, and project status.

  • Build AI Apps with Gemini 3 on Vertex AI Icon
    Build AI Apps with Gemini 3 on Vertex AI

    Access Google’s most capable multimodal models. Train, test, and deploy AI with 200+ foundation models on one platform.

    Vertex AI gives developers access to Gemini 3—Google’s most advanced reasoning and coding model—plus 200+ foundation models including Claude, Llama, and Gemma. Build generative AI apps with Vertex AI Studio, customize with fine-tuning, and deploy to production with enterprise-grade MLOps. New customers get $300 in free credits.
    Try Vertex AI Free
  • $300 in Free Credit for Your Google Cloud Projects Icon
    $300 in Free Credit for Your Google Cloud Projects

    Build, test, and explore on Google Cloud with $300 in free credit. No hidden charges. No surprise bills.

    Launch your next project with $300 in free Google Cloud credit—no hidden charges. Test, build, and deploy without risk. Use your credit across the Google Cloud platform to find what works best for your needs. After your credits are used, continue building with free monthly usage products. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • 1
    GPT4All

    GPT4All

    Run Local LLMs on Any Device. Open-source

    GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This project also supports Python integrations for easy automation and customization. GPT4All is ideal for individuals and businesses seeking private, offline access to powerful LLMs.
    Downloads: 187 This Week
    Last Update:
    See Project
  • 2
    llama.cpp

    llama.cpp

    Port of Facebook's LLaMA model in C/C++

    The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.
    Downloads: 87 This Week
    Last Update:
    See Project
  • 3
    llamafile

    llamafile

    Distribute and run LLMs with a single file

    llamafile lets you distribute and run LLMs with a single file. (announcement blog post). Our goal is to make open LLMs much more accessible to both developers and end users. We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most computers, with no installation. The easiest way to try it for yourself is to download our example llamafile for the LLaVA model (license: LLaMA 2, OpenAI). LLaVA is a new LLM that can do more than just chat; you can also upload images and ask it questions about them. With llamafile, this all happens locally; no data ever leaves your computer.
    Downloads: 31 This Week
    Last Update:
    See Project
  • 4
    Alpaca.cpp

    Alpaca.cpp

    Locally run an Instruction-Tuned Chat-Style LLM

    Run a fast ChatGPT-like model locally on your device. This combines the LLaMA foundation model with an open reproduction of Stanford Alpaca a fine-tuning of the base model to obey instructions (akin to the RLHF used to train ChatGPT) and a set of modifications to llama.cpp to add a chat interface. Download the zip file corresponding to your operating system from the latest release. The weights are based on the published fine-tunes from alpaca-lora, converted back into a PyTorch checkpoint with a modified script and then quantized with llama.cpp the regular way.
    Downloads: 11 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 5
    CodeGeeX

    CodeGeeX

    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

    CodeGeeX is a large-scale multilingual code generation model with 13 billion parameters, trained on 850B tokens across more than 20 programming languages. Developed with MindSpore and later made PyTorch-compatible, it is capable of multilingual code generation, cross-lingual code translation, code completion, summarization, and explanation. It has been benchmarked on HumanEval-X, a multilingual program synthesis benchmark introduced alongside the model, and achieves state-of-the-art performance compared to other open models like InCoder and CodeGen. CodeGeeX also powers IDE plugins for VS Code and JetBrains, offering features like code completion, translation, debugging, and annotation. The model supports Ascend 910 and NVIDIA GPUs, with optimizations like quantization and FasterTransformer acceleration for faster inference.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 6
    ChatGLM.cpp

    ChatGLM.cpp

    C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

    ChatGLM.cpp is a C++ implementation of the ChatGLM-6B model, enabling efficient local inference without requiring a Python environment. It is optimized for running on consumer hardware.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    Emscripten

    Emscripten

    Emscripten: An LLVM-to-WebAssembly Compiler

    Emscripten is a complete open-source compiler toolchain that transforms C, C++, and other LLVM-based source code into WebAssembly (and JavaScript), enabling native‑like applications to run in web browsers, Node.js, and other Wasm environments. While Emscripten mostly focuses on compiling C and C++ using Clang, it can be integrated with other LLVM-using compilers (for example, Rust has Emscripten integration, with the wasm32-unknown-emscripten and asmjs-unknown-emscripten targets). Emscripten provides Web support for popular portable APIs such as OpenGL and SDL2, allowing complex graphical native applications to be ported, such as the Unity game engine and Google Earth. It can probably port your codebase, too.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    rwkv.cpp

    rwkv.cpp

    INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

    Besides the usual FP32, it supports FP16, quantized INT4, INT5 and INT8 inference. This project is focused on CPU, but cuBLAS is also supported. RWKV is a novel large language model architecture, with the largest model in the family having 14B parameters. In contrast to Transformer with O(n^2) attention, RWKV requires only state from the previous step to calculate logits. This makes RWKV very CPU-friendly on large context lengths.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    DomE

    DomE

    Implements a reference architecture for creating information systems

    DomE Experiment is an implementation of a reference architecture for creating information systems from the automated evolution of the domain model. The architecture comprises elements that guarantee user access through automatically generated interfaces for various devices, integration with external information sources, data and operations security, automatic generation of analytical information, and automatic control of business processes. All these features are generated from the domain model, which is, in turn, continuously evolved from interactions with the user or autonomously by the system itself. Thus, an alternative to the traditional software production processes is proposed, which involves several stages and different actors, sometimes demanding a lot of time and money without obtaining the expected result. With software engineering techniques, self-adaptive systems, and artificial intelligence, it is possible, the integration between design time and execution time.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Deploy Apps in Seconds with Cloud Run Icon
    Deploy Apps in Seconds with Cloud Run

    Host and run your applications without the need to manage infrastructure. Scales up from and down to zero automatically.

    Cloud Run is the fastest way to deploy containerized apps. Push your code in Go, Python, Node.js, Java, or any language and Cloud Run builds and deploys it automatically. Get fast autoscaling, pay only when your code runs, and skip the infrastructure headaches. Two million requests free per month. And new customers get $300 in free credit.
    Try Cloud Run Free
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB