Browse free open source LLM Inference tools and projects for Linux below.
Port of OpenAI's Whisper model in C/C++
Run local LLMs on any device; open source
Port of Facebook's LLaMA model in C/C++
ONNX Runtime: cross-platform, high performance ML inferencing
User-friendly AI Interface
High-performance neural network inference framework for mobile
Self-hosted, community-driven, local OpenAI-compatible API (see the request sketch after this list)
A high-throughput and memory-efficient inference and serving engine
OpenVINO™ Toolkit repository
C++ library for high performance inference on NVIDIA GPUs
Replace OpenAI GPT with another LLM in your app
An RWKV management and startup tool; fully automated and only 8 MB
State-of-the-art diffusion models for image and audio generation
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
A Pythonic framework to simplify AI service building
Everything you need to build state-of-the-art foundation models
Library for serving Transformers models on Amazon SageMaker
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models
Unified Model Serving Framework
PyTorch library of curated Transformer models and their components
On-device AI across mobile, embedded and edge for PyTorch
LLM.swift is a simple and readable library for interacting with large language models locally on Apple platforms
FlashInfer: Kernel Library for LLM Serving
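
Several of the entries above (LocalAI, vLLM, and llama.cpp's server, among others) expose an OpenAI-compatible HTTP API, so a client written against OpenAI's chat-completions schema can be pointed at a local server instead of a hosted one. Below is a minimal sketch of such a request using only the Python standard library; the endpoint URL and model name are illustrative placeholders, not values any particular project guarantees, so check your server's own documentation for the ones it actually uses.

    # Minimal sketch: query a local OpenAI-compatible server.
    # The host, port, and model id below are assumptions for illustration.
    import json
    import urllib.request

    url = "http://localhost:8080/v1/chat/completions"  # hypothetical local endpoint
    payload = {
        "model": "llama-3-8b-instruct",  # hypothetical model id registered with the server
        "messages": [
            {"role": "user", "content": "Summarize what an inference engine does."}
        ],
        "temperature": 0.7,
    }

    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)

    # OpenAI-compatible servers return the reply under choices[0].message.content
    print(body["choices"][0]["message"]["content"])

Because the request shape is the same across these servers, swapping one inference backend for another is usually just a matter of changing the base URL and model id.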