| Name | Modified | Size |
| --- | --- | --- |
| osx-arm64-torchserve-0.12.0-py39_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| osx-arm64-torchserve-0.12.0-py311_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| linux-aarch64-torchserve-0.12.0-py310_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| osx-arm64-torchserve-0.12.0-py38_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| linux-aarch64-torchserve-0.12.0-py38_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| osx-arm64-torchserve-0.12.0-py310_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| linux-aarch64-torchserve-0.12.0-py311_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| linux-64-torchserve-0.12.0-py311_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| linux-aarch64-torchserve-0.12.0-py39_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| linux-64-torchserve-0.12.0-py310_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| linux-64-torchserve-0.12.0-py39_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| torchserve-0.12.0-py3-none-any.whl | 2024-09-30 | 42.2 MB |
| linux-64-torchserve-0.12.0-py38_0.tar.bz2 | 2024-09-30 | 42.3 MB |
| README.md | 2024-09-24 | 7.5 kB |
| TorchServe v0.12.0 Release Notes source code.tar.gz | 2024-09-24 | 63.4 MB |
| TorchServe v0.12.0 Release Notes source code.zip | 2024-09-24 | 64.1 MB |

Totals: 16 items, 676.9 MB

Highlights Include

  • GenAI updates
    • No-code LLM deployments with TorchServe + vLLM and TensorRT-LLM using the ts.llm_launcher script
    • OpenAI API support for TorchServe + vLLM
    • Integration of TensorRT-LLM engine
    • Stateful inference on AWS SageMaker (see blog)
  • Support for linux-aarch64
    • CI & nightly regression added
    • Publish docker & KServe images
  • PyTorch updates
    • Support for PyTorch 2.4
    • Deprecation of TorchText
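
To make the OpenAI API highlight concrete, here is a minimal Python sketch of a completion request against a TorchServe + vLLM deployment. Several details are assumptions rather than facts from these notes: the inference port `8080`, the registered model name `model`, the endpoint path `/predictions/<model_name>/1.0/v1/completions`, and the helper names `build_completion_request` and `complete`, which are hypothetical. Check them against the TorchServe LLM deployment documentation before relying on them.

```python
import json
import urllib.request


def build_completion_request(prompt, model_id,
                             base_url="http://localhost:8080",  # assumed default inference address
                             model_name="model",                # assumed registered model name
                             max_tokens=50):
    """Build (but do not send) an OpenAI-style completion request."""
    # Assumed endpoint layout; verify against your TorchServe version.
    url = f"{base_url}/predictions/{model_name}/1.0/v1/completions"
    body = json.dumps({
        "model": model_id,
        "prompt": prompt,
        "max_tokens": max_tokens,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, model_id, **kwargs):
    """Send the request and return the decoded JSON response."""
    request = build_completion_request(prompt, model_id, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Building the request separately from sending it keeps the payload inspectable without a running server. `complete("Hello, my name is", "<your-model-id>")` only works once a model has been deployed, for example through the ts.llm_launcher script mentioned above.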

PyTorch Updates

GenAI

Support for linux-aarch64

Documentation

Improvements and Bug Fixing

New Contributors

Platform Support

Ubuntu 20.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe requires Python >= 3.8 and JDK 17.
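
The Python and JDK requirements can be checked before installing. The following is a minimal sketch, not part of TorchServe: the function names `parse_java_major` and `check_environment` are hypothetical, and it assumes `java -version` prints a line containing `version "<number>..."`, as OpenJDK and Oracle builds commonly do. It reads "JDK 17" strictly and reports any other major version.

```python
import re
import subprocess
import sys

MIN_PYTHON = (3, 8)  # "Python >= 3.8" from the platform note
REQUIRED_JDK = 17    # "JDK 17" from the platform note, read strictly


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.8.0_292" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(
            "Python %d.%d found, need >= %d.%d"
            % (sys.version_info[0], sys.version_info[1], *MIN_PYTHON)
        )
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        problems.append("java executable not found on PATH")
    else:
        # `java -version` usually writes to stderr, so inspect both streams.
        major = parse_java_major(result.stderr + result.stdout)
        if major != REQUIRED_JDK:
            problems.append("JDK %s found, expected JDK %d" % (major, REQUIRED_JDK))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Keeping the version parsing in its own function lets it be exercised without a JDK installed.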

GPU Support Matrix

| TorchServe version | PyTorch version | Python | Stable CUDA | Experimental CUDA |
| --- | --- | --- | --- | --- |
| 0.12.0 | 2.4.0 | >=3.8, <=3.11 | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 0.11.1 | 2.3.0 | >=3.8, <=3.11 | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 0.11.0 | 2.3.0 | >=3.8, <=3.11 | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 0.10.0 | 2.2.1 | >=3.8, <=3.11 | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 0.9.0 | 2.1 | >=3.8, <=3.11 | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 0.8.0 | 2.0 | >=3.8, <=3.11 | CUDA 11.7, CUDNN 8.5.0.96 | CUDA 11.8, CUDNN 8.7.0.84 |
| 0.7.0 | 1.13 | >=3.7, <=3.10 | CUDA 11.6, CUDNN 8.3.2.44 | CUDA 11.7, CUDNN 8.5.0.96 |
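
The GPU matrix lends itself to a programmatic lookup. This sketch transcribes the rows above into a dictionary and classifies a Python and CUDA combination for a given TorchServe release. The names `GPU_SUPPORT` and `cuda_support_level` are hypothetical, and CUDNN versions are left out for brevity.

```python
# TorchServe version -> (PyTorch, min Python, max Python, stable CUDA, experimental CUDA),
# transcribed from the GPU support matrix.
GPU_SUPPORT = {
    "0.12.0": ("2.4.0", (3, 8), (3, 11), "11.8", "12.1"),
    "0.11.1": ("2.3.0", (3, 8), (3, 11), "11.8", "12.1"),
    "0.11.0": ("2.3.0", (3, 8), (3, 11), "11.8", "12.1"),
    "0.10.0": ("2.2.1", (3, 8), (3, 11), "11.8", "12.1"),
    "0.9.0": ("2.1", (3, 8), (3, 11), "11.8", "12.1"),
    "0.8.0": ("2.0", (3, 8), (3, 11), "11.7", "11.8"),
    "0.7.0": ("1.13", (3, 7), (3, 10), "11.6", "11.7"),
}


def cuda_support_level(torchserve_version, python_version, cuda_version):
    """Return "stable", "experimental", or None for an unlisted combination.

    python_version is a (major, minor) tuple such as (3, 10);
    cuda_version is a string such as "11.8".
    """
    row = GPU_SUPPORT.get(torchserve_version)
    if row is None:
        return None
    _pytorch, py_min, py_max, stable, experimental = row
    if not (py_min <= tuple(python_version) <= py_max):
        return None
    if cuda_version == stable:
        return "stable"
    if cuda_version == experimental:
        return "experimental"
    return None
```

For instance, `cuda_support_level("0.12.0", (3, 11), "12.1")` returns `"experimental"`, while the same CUDA version paired with Python 3.12 returns `None` because the matrix stops at 3.11.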

Inferentia2 Support Matrix

| TorchServe version | PyTorch version | Python | Neuron SDK |
| --- | --- | --- | --- |
| 0.12.0 | 2.1 | >=3.8, <=3.11 | 2.18.2+ |
| 0.11.1 | 2.1 | >=3.8, <=3.11 | 2.18.2+ |
| 0.11.0 | 2.1 | >=3.8, <=3.11 | 2.18.2+ |
| 0.10.0 | 1.13 | >=3.8, <=3.11 | 2.16+ |
| 0.9.0 | 1.13 | >=3.8, <=3.11 | 2.13.2+ |

Source: README.md, updated 2024-09-24