Name | Modified | Size | Downloads / Week |
---|---|---|---|
README.md | 2025-08-03 | 3.9 kB | |
v1.8.1 source code.tar.gz | 2025-08-03 | 33.1 MB | |
v1.8.1 source code.zip | 2025-08-03 | 33.9 MB | |
Totals: 3 items | | 67.1 MB | 5 |
## What's new in 1.8.1 (2025-08-03)
These are the changes in inference v1.8.1.
### New features
- FEAT: kokoro mlx support by @qinxuye in https://github.com/xorbitsai/inference/pull/3823
- FEAT: Qwen3-Instruct by @Jun-Howie in https://github.com/xorbitsai/inference/pull/3840
- FEAT: [UI] integrate user favorites into feature model output. by @yiboyasss in https://github.com/xorbitsai/inference/pull/3859
- FEAT: support enabling virtualenv and specifying extra packages when launching a model (see the first sketch after this list) by @qinxuye in https://github.com/xorbitsai/inference/pull/3854
- FEAT: [UI] support enabling virtualenv and specifying extra packages when launching a model by @yiboyasss in https://github.com/xorbitsai/inference/pull/3867
- FEAT: set max_tokens to the model maximum when not specified (see the second sketch after this list) by @qinxuye in https://github.com/xorbitsai/inference/pull/3872
- FEAT: [model] support GLM-4.5 series by @qinxuye in https://github.com/xorbitsai/inference/pull/3882
- FEAT: Qwen3-30B-A3B-it by @Jun-Howie in https://github.com/xorbitsai/inference/pull/3886
- FEAT: Support Qwen3-Thinking by @Jun-Howie in https://github.com/xorbitsai/inference/pull/3888
- FEAT: Support Qwen3-Coder by @Jun-Howie in https://github.com/xorbitsai/inference/pull/3889
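For the virtual-env launch feature above, here is a minimal sketch using the Xinference Python client. The `enable_virtual_env` and `virtual_env_packages` keyword names, the model choice, and the version pin are assumptions for illustration only; check the virtual env documentation added in #3818 and #3885 for the exact parameters.

```python
# Minimal sketch: launch a model in an isolated virtual env with extra packages
# pinned at launch time (the feature from #3854 / #3867).
# NOTE: enable_virtual_env / virtual_env_packages are ASSUMED keyword names;
# consult the v1.8.1 docs for the exact spelling.
from xinference.client import Client

client = Client("http://localhost:9997")  # default local Xinference endpoint

model_uid = client.launch_model(
    model_name="qwen3",
    model_size_in_billions=8,
    model_engine="transformers",
    enable_virtual_env=True,                        # assumed kwarg: isolate model deps
    virtual_env_packages=["transformers==4.53.2"],  # assumed kwarg: extra package pins
)
print(f"launched {model_uid}")
```

And for the new `max_tokens` default: Xinference exposes an OpenAI-compatible API, so a request can now simply omit `max_tokens` and the server fills in the model's maximum. The endpoint, model UID, and disabled auth below are assumptions about a local setup.

```python
# Sketch: with v1.8.1, omitting max_tokens lets the server default it to the
# model's maximum rather than a small fixed value.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-used")  # auth disabled
resp = client.chat.completions.create(
    model="qwen3",  # the model UID returned by launch_model above
    messages=[{"role": "user", "content": "Summarize the v1.8.1 changes."}],
    # max_tokens intentionally omitted
)
print(resp.choices[0].message.content)
```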
### Enhancements
- ENH: add mlu device check by @nan9126 in https://github.com/xorbitsai/inference/pull/3844
- ENH: Support for the bge-m3 llama.cpp backend (see the sketch after this list) by @codingl2k1 in https://github.com/xorbitsai/inference/pull/3861
- ENH: Added mlx support for deepseek-v3-0324 by @uebber in https://github.com/xorbitsai/inference/pull/3864
- ENH: Add context length limits and automatic truncation features to vLLM embedding models. by @amumu96 in https://github.com/xorbitsai/inference/pull/3887
- BLD: remove sglang from `pip install xinference[all]` due to dependency conflicts with vllm by @qinxuye in https://github.com/xorbitsai/inference/pull/3865
- BLD: upgrade base image for dockerfile by @zwt-1234 in https://github.com/xorbitsai/inference/pull/3318
- BLD: increase the docker build timeout to 240 minutes so the CUDA 12.8 image build passes by @qinxuye in https://github.com/xorbitsai/inference/pull/3892
- REF: add ui module that includes web and gradio UIs. by @qinxuye in https://github.com/xorbitsai/inference/pull/3819
- REF: move continuous batching scheduler into model by @qinxuye in https://github.com/xorbitsai/inference/pull/3824
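For the bge-m3 llama.cpp item above, here is a minimal launch-and-embed sketch with the Xinference Python client. The `model_engine="llama.cpp"` argument is an assumption about how the new backend is selected at launch time; verify it against #3861 and the embedding model docs.

```python
# Sketch: serve bge-m3 on the new llama.cpp backend and request an embedding.
# NOTE: model_engine="llama.cpp" is an ASSUMED way to pick the backend.
from xinference.client import Client

client = Client("http://localhost:9997")

model_uid = client.launch_model(
    model_name="bge-m3",
    model_type="embedding",
    model_engine="llama.cpp",   # assumed engine selector for the new backend
)

model = client.get_model(model_uid)
result = model.create_embedding("Xorbits Inference v1.8.1 release notes")
print(len(result["data"][0]["embedding"]))   # bge-m3 dense embeddings are 1024-dim
```

The vLLM-side change in #3887 is complementary: inputs to vLLM embedding models are now bounded by a context-length limit and, per the PR title, truncated automatically on the server.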
### Bug fixes
- BUG: Fixed an error when using structured output in sglang [#3825] by @aniya105 in https://github.com/xorbitsai/inference/pull/3826
- BUG: fix compatibility for old vllm by @qinxuye in https://github.com/xorbitsai/inference/pull/3838
- BUG: Fix abnormal GPU memory usage in Qwen3 Reranker by @JDanielWu in https://github.com/xorbitsai/inference/pull/3846
- BUG: fix compatibility with vllm 0.10.0 by @qinxuye in https://github.com/xorbitsai/inference/pull/3875
- BUG: fix version checks for vllm by @qinxuye in https://github.com/xorbitsai/inference/pull/3891
### Documentation
- DOC: add experimental feature for virtualenv by @qinxuye in https://github.com/xorbitsai/inference/pull/3818
- DOC: add docs about model virtual env settings when launching a model by @qinxuye in https://github.com/xorbitsai/inference/pull/3885
### Others
- FIX: GLM4.1V Repository URL by @Jun-Howie in https://github.com/xorbitsai/inference/pull/3839
- BLD: fix docker build for cu128 by @zwt-1234 in https://github.com/xorbitsai/inference/pull/3893
- BLD: fix cu128 build by @zwt-1234 in https://github.com/xorbitsai/inference/pull/3895
- CHORE: THUDM has been renamed to zai-org by @Jun-Howie in https://github.com/xorbitsai/inference/pull/3870
### New Contributors
- @JDanielWu made their first contribution in https://github.com/xorbitsai/inference/pull/3846
- @uebber made their first contribution in https://github.com/xorbitsai/inference/pull/3864
- @zwt-1234 made their first contribution in https://github.com/xorbitsai/inference/pull/3318
**Full Changelog**: https://github.com/xorbitsai/inference/compare/v1.8.0...v1.8.1