Cyrus Leung
|
6ffa3f314c
|
[CI/Build] Avoid CUDA initialization (#8534)
|
2024-09-18 10:38:11 +00:00 |
|
Charlie Fu
|
e837b624f2
|
[Feature][Hardware][Amd] Add fp8 Linear Layer for Rocm (#7210)
|
2024-08-16 10:06:30 -07:00 |
|
jon-chuang
|
50b8d08dbd
|
[Misc/Testing] Use torch.testing.assert_close (#7324)
|
2024-08-16 04:24:04 +00:00 |
|
Tyler Michael Smith
|
fea59c7712
|
[Bugfix][Kernel] Use int64_t for indices in fp8 quant kernels (#6649)
|
2024-07-22 14:08:30 -06:00 |
|
Varun Sundar Rabindranath
|
2e26564259
|
[ Kernel ] FP8 Dynamic Per Token Quant - Add scale_ub (#6593)
Co-authored-by: Varun Sundar Rabindranth <varun@neuralmagic.com>
|
2024-07-19 18:15:26 -07:00 |
|
Robert Shaw
|
4cc24f01b1
|
[ Kernel ] Enable Dynamic Per Token fp8 (#6547)
|
2024-07-19 23:08:15 +00:00 |
|
Varun Sundar Rabindranath
|
b5241e41d9
|
[ Kernel ] FP8 Dynamic-Per-Token Quant Kernel (#6511)
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2024-07-18 01:38:35 +00:00 |
|