neural-speed is an open-source library developed by Intel to improve the efficiency of Large Language Model (LLM) inference through low-bit quantization. By reducing the precision of model weights and activations, neural-speed shrinks memory footprint and accelerates inference while largely preserving model accuracy, making it well suited to deployment in resource-constrained environments.

Features

  • Low-bit quantization for LLMs
  • Accelerates inference performance
  • Maintains model accuracy
  • Suitable for resource-constrained environments
  • Developed by Intel
  • Supports various LLM architectures
  • Open-source library
  • Comprehensive documentation
  • Facilitates efficient AI deployments

Categories

LLM Inference

License

Apache License V2.0


Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

C++

Related Categories

C++ LLM Inference Tool

Registered

2025-03-18