Michael Goin | 5f6d10c14c | [CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722) | 2024-05-22 07:18:41 +00:00 |
Alexander Matveev | da5a0b539d | Remove marlin warning (#4918) | 2024-05-20 14:55:34 +00:00 |
Jinzhen Lin | 99caa49106 | [Kernel] add bfloat16 support for gptq marlin kernel (#4788) | 2024-05-16 09:55:29 -04:00 |
alexm-nm | e288df0632 | [Bugfix] Fine-tune gptq_marlin configs to be more similar to marlin (#4626) | 2024-05-08 17:14:31 -07:00 |
alexm-nm | 7038e8b803 | [Kernel] Support running GPTQ 8-bit models in Marlin (#4533) | 2024-05-02 12:56:22 -04:00 |
Robert Shaw | 73c8d677e5 | [Kernel] Marlin Expansion: Support AutoGPTQ Models with Marlin (#3922) Co-authored-by: alexm <alexm@neuralmagic.com> Co-authored-by: mgoin <michael@neuralmagic.com> | 2024-04-29 09:35:34 -07:00 |