Dipika Sikka
|
5884c2b454
|
[Misc] Update to comply with the new compressed-tensors config (#5350)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2024-06-10 03:49:46 +00:00 |
|
youkaichao
|
8ea5e44a43
|
[CI/Test] improve robustness of test (vllm_runner) (#5357)
[CI/Test] improve robustness of test by replacing del with context manager (vllm_runner) (#5357)
|
2024-06-08 08:59:20 +00:00 |
|
Dipika Sikka
|
ca3ea51bde
|
[Kernel] Dynamic Per-Token Activation Quantization (#5037)
Co-authored-by: Varun Sundar Rabindranath <varunsundar08@gmail.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2024-06-07 09:36:26 -07:00 |
|
Dipika Sikka
|
a1242324c9
|
[Kernel] Initial Activation Quantization Support (#4525)
Co-authored-by: Varun Sundar Rabindranath <varunsundar08@gmail.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2024-05-23 21:29:18 +00:00 |
|