AutoGPTQ - Browse /v0.5.0 at SourceForge.net


Name	Modified	Size	Downloads / Week
README.md	2023-11-02	4.3 kB	0
v0.5.0_ Exllama v2 GPTQ kernels, RoCm 5.6_5.7 support, many bugfixes source code.tar.gz	2023-11-02	7.4 MB	1
v0.5.0_ Exllama v2 GPTQ kernels, RoCm 5.6_5.7 support, many bugfixes source code.zip	2023-11-02	7.5 MB	0
Totals: 3 Items		14.9 MB	1

Exllama v2 GPTQ kernel support

The more performant GPTQ kernels from @turboderp's exllamav2 library are now available directly in AutoGPTQ, and are the default backend choice.

A comprehensive benchmark is available here.

exllamav2 integration by @SunMarc in https://github.com/PanQiWei/AutoGPTQ/pull/349
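
A minimal sketch of how the kernel choice might be expressed when loading a quantized model. It assumes that `AutoGPTQForCausalLM.from_quantized` accepts the `disable_exllama` and `disable_exllamav2` keyword flags in this release, and that disabling only exllama v2 falls back to the exllama kernel (as the later entry "Default to exllama kernel when exllama v2 is disabled" suggests). The helper names `kernel_kwargs` and `load_quantized`, and the backend labels, are hypothetical and not part of AutoGPTQ.

```python
def kernel_kwargs(backend: str = "exllamav2") -> dict:
    """Map a (hypothetical) backend label to from_quantized keyword flags.

    Assumption: the flag names disable_exllama / disable_exllamav2 match
    the v0.5.0 signature of AutoGPTQForCausalLM.from_quantized.
    """
    flags = {
        # Defaults: exllama v2 is the default backend, so pass nothing.
        "exllamav2": {},
        # Disable v2 only; the library is expected to pick exllama v1.
        "exllama": {"disable_exllamav2": True},
        # Disable both exllama kernels to use the remaining fallback kernels.
        "fallback": {"disable_exllama": True, "disable_exllamav2": True},
    }
    if backend not in flags:
        raise ValueError(f"unknown backend label: {backend!r}")
    return dict(flags[backend])


def load_quantized(model_id: str, backend: str = "exllamav2", device: str = "cuda:0"):
    """Load a GPTQ model with the requested kernel (requires auto-gptq and a GPU)."""
    # Imported lazily so the helper above can be used without auto-gptq installed.
    from auto_gptq import AutoGPTQForCausalLM

    return AutoGPTQForCausalLM.from_quantized(
        model_id, device=device, **kernel_kwargs(backend)
    )
```

For example, `load_quantized(model_id, backend="exllama")` would request the older exllama kernel, which can be useful when comparing against the new default.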

CPU inference support

CPU inference is experimental in this release.

Add AutoGPTQ's cpu kernel. by @qwopqwop200 in https://github.com/PanQiWei/AutoGPTQ/pull/245

Loading from safetensors is now the default

Allow using a model with basename model, use_safetensors defaults to True by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/383
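
Since `use_safetensors` now defaults to True, older checkpoints that only ship `.bin` weights may need the flag set explicitly. The sketch below shows one way to decide this from the files in a local model directory. The helper name `safetensors_kwargs` is hypothetical, and passing the result to `from_quantized` assumes the `use_safetensors` keyword named in the entry above.

```python
from pathlib import Path


def safetensors_kwargs(model_dir: str) -> dict:
    """Return use_safetensors=True only if the directory holds .safetensors files.

    Hypothetical helper: it inspects local files only and does not
    handle models that are referenced by a hub id.
    """
    has_safetensors = any(Path(model_dir).glob("*.safetensors"))
    return {"use_safetensors": has_safetensors}


def load_local(model_dir: str, device: str = "cuda:0"):
    """Load a local GPTQ checkpoint (requires auto-gptq to be installed)."""
    from auto_gptq import AutoGPTQForCausalLM  # lazy import

    return AutoGPTQForCausalLM.from_quantized(
        model_dir, device=device, **safetensors_kwargs(model_dir)
    )
```

With this, a directory containing only `.bin` weights yields `{"use_safetensors": False}`, so the new default is overridden only when it has to be.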

Falcon, Mistral support

Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B by @TheBloke in https://github.com/PanQiWei/AutoGPTQ/pull/326
Add support for Mistral models. by @LaaZa in https://github.com/PanQiWei/AutoGPTQ/pull/362

Other changes and bugfixes

Fix setuptools classifier by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/285
Update install instructions by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/286
Install skip qigen(windows) by @qwopqwop200 in https://github.com/PanQiWei/AutoGPTQ/pull/309
fix model type changed after calling .to() method by @PanQiWei in https://github.com/PanQiWei/AutoGPTQ/pull/310
Update qwen.py for Qwen-VL by @JustinLin610 in https://github.com/PanQiWei/AutoGPTQ/pull/303
fix typo in max_input_length by @SunMarc in https://github.com/PanQiWei/AutoGPTQ/pull/311
Use adapter_name for get_gptq_peft_model with train_mode=True by @alex4321 in https://github.com/PanQiWei/AutoGPTQ/pull/347
Ignore unknown parameters in quantize_config.json by @z80maniac in https://github.com/PanQiWei/AutoGPTQ/pull/335
fix bug(breaking change) remove (zeros -= 1) by @qwopqwop200 in https://github.com/PanQiWei/AutoGPTQ/pull/325
Revert "fix bug(breaking change) remove (zeros -= 1)" by @PanQiWei in https://github.com/PanQiWei/AutoGPTQ/pull/354
import exllama QuantLinear instead of exllamav2's in pack_model by @PanQiWei in https://github.com/PanQiWei/AutoGPTQ/pull/355
Modify qlinear_cuda for tracing the GPTQ model by @vivekkhandelwal1 in https://github.com/PanQiWei/AutoGPTQ/pull/367
Fix QiGen kernel generation by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/379
Improve RoCm support by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/382
PEFT initialization fix by @alex4321 in https://github.com/PanQiWei/AutoGPTQ/pull/361
Pin to accelerate>=0.22 by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/384
Fix overflow in exllama with act-order by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/386
Default to exllama kernel when exllama v2 is disabled by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/387
Error out on exllama_set_max_input_length call without exllama backend by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/389
Add fix for CPU Inference by @vivekkhandelwal1 in https://github.com/PanQiWei/AutoGPTQ/pull/385
Fix dtype issues and add relevant tests by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/393
Patch accelerate to use correct dtype by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/394
Fixed missing cstdint include by @kodai2199 in https://github.com/PanQiWei/AutoGPTQ/pull/388
Update RoCm workflow to build for RoCm 5.7 by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/395
Fix Windows build by @fxmarty in https://github.com/PanQiWei/AutoGPTQ/pull/396
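
Several entries above concern `exllama_set_max_input_length`, which enlarges the exllama kernel's input buffer. A rough sketch of guarding that call follows. It assumes the buffer limit is counted as batch size times sequence length, that the default limit is 2048, and that the function is importable from `auto_gptq` and returns the resized model. The helper `maybe_resize_exllama_buffer` is hypothetical. Per the entry above, the call is expected to raise an error if the model was not loaded with the exllama backend.

```python
# Assumption: default buffer size of the exllama backend, in tokens.
ASSUMED_DEFAULT_MAX_INPUT_LENGTH = 2048


def maybe_resize_exllama_buffer(
    model,
    batch_size: int,
    seq_len: int,
    current_limit: int = ASSUMED_DEFAULT_MAX_INPUT_LENGTH,
):
    """Enlarge the exllama buffer only when the planned input would not fit."""
    needed = batch_size * seq_len
    if needed <= current_limit:
        # Input fits in the existing buffer; leave the model untouched.
        return model

    # Imported lazily; requires auto-gptq and a model using the exllama backend.
    from auto_gptq import exllama_set_max_input_length

    return exllama_set_max_input_length(model, needed)
```

Calling the helper before generation with large batches avoids resizing when the default buffer is already sufficient.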

New Contributors

@JustinLin610 made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/303
@SunMarc made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/311
@alex4321 made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/347
@vivekkhandelwal1 made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/367
@kodai2199 made their first contribution in https://github.com/PanQiWei/AutoGPTQ/pull/388

Full Changelog: https://github.com/PanQiWei/AutoGPTQ/compare/v0.4.2...v0.5.0

Source: README.md, updated 2023-11-02

AutoGPTQ Files

An easy-to-use LLMs quantization package with user-friendly apis
