| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2023-11-02 | 4.3 kB | |
| v0.5.0_ Exllama v2 GPTQ kernels, RoCm 5.6_5.7 support, many bugfixes source code.tar.gz | 2023-11-02 | 7.4 MB | |
| v0.5.0_ Exllama v2 GPTQ kernels, RoCm 5.6_5.7 support, many bugfixes source code.zip | 2023-11-02 | 7.5 MB | |
| Totals: 3 Items | | 14.9 MB | 1 |
Exllama v2 GPTQ kernel support
The more performant GPTQ kernels from @turboderp's exllamav2 library are now available directly in AutoGPTQ, and are the default backend choice.
A comprehensive benchmark is available here.
- exllamav2 integration by @SunMarc in https://github.com/PanQiWei/AutoGPTQ/pull/349
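As a minimal sketch of how the backend choice might be expressed when loading a model: the helper names `build_load_kwargs` and `load_quantized` are hypothetical, and the `disable_exllamav2` keyword is assumed to be the switch that `from_quantized` exposes in this release, so check the signature of your installed version before relying on it.

```python
from typing import Any, Dict


def build_load_kwargs(use_exllamav2: bool = True, device: str = "cuda:0") -> Dict[str, Any]:
    """Collect keyword arguments intended for AutoGPTQForCausalLM.from_quantized."""
    return {
        "device": device,
        "use_safetensors": True,
        # Assumption: exllamav2 is on by default and is switched off with this flag.
        "disable_exllamav2": not use_exllamav2,
    }


def load_quantized(model_id: str, **kwargs: Any):
    """Load a GPTQ checkpoint; requires auto-gptq and a supported GPU."""
    # Imported lazily so that build_load_kwargs works without the library installed.
    from auto_gptq import AutoGPTQForCausalLM

    return AutoGPTQForCausalLM.from_quantized(model_id, **kwargs)


if __name__ == "__main__":
    print(build_load_kwargs())                     # default: exllamav2 kernels
    print(build_load_kwargs(use_exllamav2=False))  # opt out of exllamav2
```

Keeping the keyword arguments in a plain dictionary makes it easy to log or compare the chosen backend before any GPU work starts.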
CPU inference support
This is experimental.
- Add AutoGPTQ's cpu kernel. by @qwopqwop200 in https://github.com/PanQiWei/AutoGPTQ/pull/245
Loading from safetensors is now the default
- Allow using a model with basename `model`, use_safetensors defaults to True by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/383
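The effect of this change on checkpoint discovery can be sketched roughly as follows. This is an illustration, not the library's code: the function names are hypothetical, and the default name pattern `gptq_model-{bits}bit-{group_size}g` and the `.bin` fallback extension are assumptions made for the example.

```python
from pathlib import Path
from typing import List, Optional


def candidate_checkpoint_names(
    bits: int,
    group_size: int,
    model_basename: Optional[str] = None,
    use_safetensors: bool = True,
) -> List[str]:
    """Return the file names a loader might try, in order of preference."""
    extension = ".safetensors" if use_safetensors else ".bin"
    if model_basename is not None:
        basenames = [model_basename]
    else:
        # Assumed default pattern first, then the plain "model" basename.
        basenames = [f"gptq_model-{bits}bit-{group_size}g", "model"]
    return [name + extension for name in basenames]


def find_checkpoint(model_dir: str, candidates: List[str]) -> Optional[Path]:
    """Return the first candidate that exists in model_dir, or None."""
    for name in candidates:
        path = Path(model_dir) / name
        if path.is_file():
            return path
    return None
```

With `use_safetensors` left at its new default, a directory that contains only `model.safetensors` would be matched by the second candidate.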
Falcon, Mistral support
- Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B by @TheBloke in https://github.com/PanQiWei/AutoGPTQ/pull/326
- Add support for Mistral models. by @LaaZa in https://github.com/PanQiWei/AutoGPTQ/pull/362
Other changes and bugfixes
- Fix setuptools classifier by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/285
- Update install instructions by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/286
- Install skip qigen(windows) by @qwopqwop200 in https://github.com/PanQiWei/AutoGPTQ/pull/309
- fix model type changed after calling .to() method by @PanQiWei in https://github.com/PanQiWei/AutoGPTQ/pull/310
- Update qwen.py for Qwen-VL by @JustinLin610 in https://github.com/PanQiWei/AutoGPTQ/pull/303
- fix typo in max_input_length by @SunMarc in https://github.com/PanQiWei/AutoGPTQ/pull/311
- Use `adapter_name` for `get_gptq_peft_model` with `train_mode=True` by @alex4321 in https://github.com/PanQiWei/AutoGPTQ/pull/347
- Ignore unknown parameters in quantize_config.json by @z80maniac in https://github.com/PanQiWei/AutoGPTQ/pull/335
- fix bug(breaking change) remove (zeors -= 1) by @qwopqwop200 in https://github.com/PanQiWei/AutoGPTQ/pull/325
- Revert "fix bug(breaking change) remove (zeors -= 1)" by @PanQiWei in https://github.com/PanQiWei/AutoGPTQ/pull/354
- import exllama QuantLinear instead of exllamav2's in `pack_model` by @PanQiWei in https://github.com/PanQiWei/AutoGPTQ/pull/355
- Modify qlinear_cuda for tracing the GPTQ model by @vivekkhandelwal1 in https://github.com/PanQiWei/AutoGPTQ/pull/367
- Fix QiGen kernel generation by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/379
- Improve RoCm support by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/382
- PEFT initialization fix by @alex4321 in https://github.com/PanQiWei/AutoGPTQ/pull/361
- Pin to accelerate>=0.22 by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/384
- Fix overflow in exllama with act-order by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/386
- Default to exllama kernel when exllama v2 is disabled by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/387
- Error out on exllama_set_max_input_length call without exllama backend by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/389
- Add fix for CPU Inference by @vivekkhandelwal1 in https://github.com/PanQiWei/AutoGPTQ/pull/385
- Fix dtype issues and add relevant tests by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/393
- Patch accelerate to use correct dtype by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/394
- Fixed missing cstdint include by @kodai2199 in https://github.com/PanQiWei/AutoGPTQ/pull/388
- Update RoCm workflow to build for RoCm 5.7 by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/395
- Fix Windows build by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/396
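One of the fixes listed above, ignoring unknown parameters in quantize_config.json, can be illustrated with a small sketch. `QuantizeConfigSketch` is a hypothetical stand-in with only three fields, not the library's configuration class, and the filtering shown here is one plausible way to get the behaviour rather than the actual patch.

```python
import json
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass
class QuantizeConfigSketch:
    """Hypothetical, reduced stand-in for a quantization config."""
    bits: int = 4
    group_size: int = -1
    desc_act: bool = False


def load_config_ignoring_unknown(raw_json: str) -> Tuple[QuantizeConfigSketch, List[str]]:
    """Build the config from JSON text, dropping keys the dataclass does not define."""
    data = json.loads(raw_json)
    known = {f.name for f in fields(QuantizeConfigSketch)}
    ignored = sorted(key for key in data if key not in known)
    kept = {key: value for key, value in data.items() if key in known}
    return QuantizeConfigSketch(**kept), ignored


if __name__ == "__main__":
    text = '{"bits": 4, "group_size": 128, "some_future_option": true}'
    config, ignored = load_config_ignoring_unknown(text)
    print(config, ignored)
```

Returning the ignored keys alongside the config lets the caller log a warning instead of failing, which keeps files written by newer tools loadable.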
New Contributors
- @JustinLin610 made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/303
- @SunMarc made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/311
- @alex4321 made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/347
- @vivekkhandelwal1 made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/367
- @kodai2199 made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/388
Full Changelog: https://github.com/PanQiWei/AutoGPTQ/compare/v0.4.2...v0.5.0