| ElizaWszola | 9239bf718e | [Kernel] CUTLASS grouped gemm fp8 MoE kernel (#13972); Signed-off-by: ElizaWszola <eliza@neuralmagic.com>; Signed-off-by: ElizaWszola <ewszola@redhat.com>; Co-authored-by: Lucas Wilkinson <wilkinson.lucas@gmail.com> | 2025-03-27 00:54:44 +00:00 |
| Lucas Wilkinson | 9798b2fb00 | [Kernel] Update cutlass_scaled_mm to support 2d group (blockwise) scaling (#11868) | 2025-01-30 18:33:00 -08:00 |
| Harry Mellor | 3ea7b94523 | Move linting to pre-commit (#11975); Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> | 2025-01-20 14:58:01 +08:00 |
| Dipika Sikka | 60508ffda9 | [Kernel]: Cutlass 2:4 Sparsity + FP8/Int8 Quant Support (#10995); Co-authored-by: Faraz Shahsavan <faraz.shahsavan@gmail.com>; Co-authored-by: ilmarkov <markovilya197@gmail.com>; Co-authored-by: Rahul Tuli <rahul@neuralmagic.com>; Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com> | 2024-12-18 09:57:16 -05:00 |