SGLang

SGLang is a fast serving framework for large language models and vision-language models. It makes interaction with models faster and more controllable by co-designing the backend runtime and the frontend language.

Features

Provides efficient serving with RadixAttention for prefix caching, jump-forward constrained decoding, an overhead-free CPU scheduler, continuous batching, token attention (paged attention), tensor parallelism, FlashInfer kernels, chunked prefill, and quantization (FP8/INT4/AWQ/GPTQ)
Offers an intuitive interface for programming LLM applications, including chained generation calls, advanced prompting, control flow, multi-modal inputs, parallelism, and external interactions
Supports a wide range of generative models (Llama, Gemma, Mistral, Qwen, DeepSeek, LLaVA, etc.), embedding models (e5-mistral, gte), and reward models (Skywork), with easy extensibility for integrating new models
SGLang is open-source and backed by an active community with industry adoption
Documentation available
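
To make the prefix caching item in the feature list more concrete, here is a minimal toy sketch in plain Python. It is not SGLang's code and does not use any SGLang API; the names ToyPrefixCache, match_prefix, insert, and tokens_to_compute are hypothetical and exist only for this illustration. It assumes requests are lists of integer token IDs and shows the general idea that a request sharing a prefix with an earlier request only needs new work for its unshared suffix.

```python
class _Node:
    """One node of a toy token trie; each edge is a single token ID."""

    def __init__(self):
        self.children = {}


class ToyPrefixCache:
    """Hypothetical illustration of prefix reuse, not SGLang's implementation."""

    def __init__(self):
        self.root = _Node()

    def match_prefix(self, tokens):
        """Return how many leading tokens are already stored in the cache."""
        node = self.root
        matched = 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            node = child
            matched += 1
        return matched

    def insert(self, tokens):
        """Store the full token sequence so later requests can reuse it."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, _Node())


def tokens_to_compute(cache, tokens):
    """Count tokens that still need processing, then cache the request."""
    reused = cache.match_prefix(tokens)
    cache.insert(tokens)
    return len(tokens) - reused


# Two requests that share the same four-token "system prompt".
system_prompt = [1, 2, 3, 4]
request_a = system_prompt + [10, 11]
request_b = system_prompt + [20]

cache = ToyPrefixCache()
print(tokens_to_compute(cache, request_a))  # 6: nothing cached yet
print(tokens_to_compute(cache, request_b))  # 1: the shared prefix is reused
```

A real serving system caches attention key/value tensors rather than bare token IDs and must also evict entries under memory pressure; this sketch leaves both out.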

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2025-02-18
