jon-chuang
|
50b8d08dbd
|
[Misc/Testing] Use torch.testing.assert_close (#7324)
|
2024-08-16 04:24:04 +00:00 |
|
Lucas Wilkinson
|
a8d604ca2a
|
[Misc] Disambiguate quantized types via a new ScalarType (#6396)
|
2024-08-02 13:51:58 -07:00 |
|
Varun Sundar Rabindranath
|
b5241e41d9
|
[ Kernel ] FP8 Dynamic-Per-Token Quant Kernel (#6511)
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2024-07-18 01:38:35 +00:00 |
|
bnellnm
|
5467ac3196
|
[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)
|
2024-06-09 16:23:30 -04:00 |
|
Dipika Sikka
|
ca3ea51bde
|
[Kernel] Dynamic Per-Token Activation Quantization (#5037)
Co-authored-by: Varun Sundar Rabindranath <varunsundar08@gmail.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2024-06-07 09:36:26 -07:00 |
|
Tyler Michael Smith
|
cbb2f59cc8
|
[Kernel] Pass a device pointer into the quantize kernel for the scales (#5159)
|
2024-06-03 09:52:30 -07:00 |
|
Dipika Sikka
|
a1242324c9
|
[Kernel] Initial Activation Quantization Support (#4525)
Co-authored-by: Varun Sundar Rabindranath <varunsundar08@gmail.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2024-05-23 21:29:18 +00:00 |
|