Port of Facebook's LLaMA model in C/C++
Python bindings for llama.cpp
Run Local LLMs on Any Device. Open-source
Maid is a cross-platform Flutter app for interfacing with GGUF models
Qwen3 is the large language model series developed by the Qwen team
Distribute and run LLMs with a single file
The simplest way to run Alpaca on your own computer
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
A gradio web UI for running Large Language Models like LLaMA
React and Electron-based app that executes the FreedomGPT LLM locally
DevoxxGenie is a plugin for IntelliJ IDEA that uses local LLMs
Inference Llama 2 in one file of pure C
An easy-to-understand framework for LLM samplers
GLM-4 series: Open Multilingual Multimodal Chat LMs
Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere
Open source large-language-model based code completion engine
Chinese LLaMA & Alpaca large language models + local CPU/GPU training
llama.go is like llama.cpp in pure Golang
Chat with your favourite LLaMA models in a native macOS app
Run GGUF models easily with a UI or API. One File. Zero Install.
Locally run an Instruction-Tuned Chat-Style LLM
JetBrains’ 4B-parameter code model for completions
Jan-v1-edge: efficient 1.7B reasoning model optimized for edge devices