| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Divakar Verma | 2acba47d9b | [bugfix] moe tuning. rm is_navi() (#12273) (Signed-off-by: Divakar Verma <divakar.verma@amd.com>) | 2025-01-21 22:47:32 +00:00 |
| Divakar Verma | 8027a72461 | [ROCm][MoE] moe tuning support for rocm (#12049) (Signed-off-by: Divakar Verma <divakar.verma@amd.com>) | 2025-01-17 14:49:16 +08:00 |
| wangshuai09 | 622b7ab955 | [Hardware] using current_platform.seed_everything (#9785) (Signed-off-by: wangshuai09 <391746016@qq.com>) | 2024-10-29 14:47:44 +00:00 |
| youkaichao | 32176fee73 | [torch.compile] support moe models (#9632) (Signed-off-by: youkaichao <youkaichao@gmail.com>) | 2024-10-27 21:58:04 -07:00 |
| Cyrus Leung | 6ffa3f314c | [CI/Build] Avoid CUDA initialization (#8534) | 2024-09-18 10:38:11 +00:00 |
| Mor Zusman | 7fc23be81c | [Kernel] W8A16 Int8 inside FusedMoE (#7415) | 2024-08-16 10:06:51 -07:00 |
| Michael Goin | 8065a7e220 | [Frontend] Add FlexibleArgumentParser to support both underscore and dash in names (#5718) | 2024-06-20 17:00:13 -06:00 |
| Cyrus Leung | 0e9164b40a | [mypy] Enable type checking for test directory (#5017) | 2024-06-15 04:45:31 +00:00 |
| Philipp Moritz | 51a08e7d8f | [Kernel] Re-tune Mixtral MoE configurations for FP8 on H100 (#5238) | 2024-06-05 10:59:14 -07:00 |
| Woosuk Kwon | 27208be66e | [Kernel] Add back batch size 1536 and 3072 to MoE tuning (#5242) | 2024-06-04 09:58:47 -07:00 |
| Woosuk Kwon | 3a434b07ed | [Kernel] Enhance MoE benchmarking & tuning script (#4921) | 2024-06-03 20:06:59 -07:00 |