| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2024-03-01 | 1.0 kB | |
| v0.7.1_ patch release source code.tar.gz | 2024-03-01 | 7.5 MB | |
| v0.7.1_ patch release source code.zip | 2024-03-01 | 7.6 MB | |
| Totals: 3 Items | | 15.0 MB | 1 |
Support loading sharded quantized checkpoints
Sharded quantized checkpoints can now be loaded with the from_quantized method.
- Support loading sharded quantized checkpoints. by @LaaZa in https://github.com/AutoGPTQ/AutoGPTQ/pull/425
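As a rough illustration, loading a sharded quantized checkpoint might look like the sketch below. The `from_quantized` call and its `device` and `use_safetensors` arguments are the library's usual loading entry point; the `shard_files` helper, the `model_dir` argument, and the assumption that shards are described by a Hugging Face-style index with a `weight_map` are illustrative and not taken from the release notes.

```python
def shard_files(index: dict) -> list[str]:
    """Return the sorted, de-duplicated shard file names named in an index.

    Assumption: the index follows the Hugging Face convention, where
    "weight_map" maps each tensor name to the shard file that stores it.
    """
    return sorted(set(index["weight_map"].values()))


def load_sharded_quantized(model_dir: str, device: str = "cuda:0"):
    """Load a quantized checkpoint (sharded or not) from a local directory."""
    # Imported lazily so shard_files() can be used without auto_gptq installed.
    from auto_gptq import AutoGPTQForCausalLM

    # model_dir is a hypothetical path to a directory holding the shards.
    return AutoGPTQForCausalLM.from_quantized(
        model_dir,
        device=device,
        use_safetensors=True,
    )
```

The caller passes the checkpoint directory rather than individual shard files; `shard_files` is only a convenience for inspecting which files an index refers to.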
Gemma GPTQ quantization
Gemma models can now be quantized with AutoGPTQ.
- Add support for Gemma models. by @LaaZa in https://github.com/AutoGPTQ/AutoGPTQ/pull/561
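A minimal sketch of quantizing a Gemma model, following the library's basic quantization flow (`from_pretrained`, `quantize`, `save_quantized`). The model id `google/gemma-2b`, the output directory, the single calibration sentence, and the `gptq_config_kwargs` helper are assumptions chosen for illustration; a real run would need access to the model weights and a larger calibration set.

```python
def gptq_config_kwargs(bits: int = 4, group_size: int = 128, desc_act: bool = False) -> dict:
    """Build keyword arguments for BaseQuantizeConfig (hypothetical helper)."""
    # GPTQ bit widths commonly supported by the library.
    if bits not in (2, 3, 4, 8):
        raise ValueError(f"unsupported bit width: {bits}")
    return {"bits": bits, "group_size": group_size, "desc_act": desc_act}


def quantize_gemma(model_id: str = "google/gemma-2b", out_dir: str = "gemma-2b-gptq-4bit"):
    """Quantize a Gemma checkpoint to 4-bit GPTQ and save it to out_dir."""
    # Imported lazily so the helper above works without these packages installed.
    from transformers import AutoTokenizer
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # A single calibration example keeps the sketch short; use many in practice.
    examples = [tokenizer("AutoGPTQ is a model quantization library based on the GPTQ algorithm.")]

    quantize_config = BaseQuantizeConfig(**gptq_config_kwargs())
    model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
    model.quantize(examples)
    model.save_quantized(out_dir, use_safetensors=True)
    return out_dir
```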
Other changes and fixes
- Add back missing import by @fxmarty in https://github.com/AutoGPTQ/AutoGPTQ/pull/553
- Fix bias materialization for Marlin by @fxmarty in https://github.com/AutoGPTQ/AutoGPTQ/pull/554
- Fix shape check marlin by @fxmarty in https://github.com/AutoGPTQ/AutoGPTQ/pull/557
- Explicitly check compute capability in marlin's QLinear by @fxmarty in https://github.com/AutoGPTQ/AutoGPTQ/pull/567
- Compatibility with latest transformers by @fxmarty in https://github.com/AutoGPTQ/AutoGPTQ/pull/573
Full Changelog: https://github.com/AutoGPTQ/AutoGPTQ/compare/v0.7.0...v0.7.1