Yang Chen
|
95460fc513
|
[Kernel] port sgl moe_align_block_size kernels (#12574)
sgl_moe_align_block_size is based on:
ded9fcd09a
moe_align_block_size is based on:
ba5112ff69
Signed-off-by: Yang Chen <yangche@fb.com>
|
2025-02-03 13:09:50 +08:00 |
|
Charlie Fu
|
59449095ab
|
[Performance][Kernel] Fused_moe Performance Improvement (#9384)
Signed-off-by: charlifu <charlifu@amd.com>
|
2024-10-24 15:37:52 -07:00 |
|
bnellnm
|
5467ac3196
|
[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)
|
2024-06-09 16:23:30 -04:00 |
|
Michael Goin
|
5f6d10c14c
|
[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)
|
2024-05-22 07:18:41 +00:00 |
|
Woosuk Kwon
|
f0d4e14557
|
Add fused top-K softmax kernel for MoE (#2769)
|
2024-02-05 17:38:02 -08:00 |
|