5 Commits

Author SHA1 Message Date
Michael Goin
5f6d10c14c
[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722) 2024-05-22 07:18:41 +00:00
wangding zeng
5d60def02c
DeepseekMoE support with Fused MoE kernel (#2453)
Co-authored-by: roy <jasonailu87@gmail.com>
2024-01-29 21:19:48 -08:00
zhaoyang-star
9090bf02e7
Support FP8-E5M2 KV Cache (#2279)
Co-authored-by: zhaoyang <zhao.yang16@zte.com.cn>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
2024-01-28 16:43:54 -08:00
Mingcan Xiang
614856da25
Avoid multiple redefinition (#1817) 2023-12-14 09:35:58 -08:00
Woosuk Kwon
8ce9c50d40
Avoid compiling kernels for double data type (#933) 2023-09-02 14:59:47 +09:00