Tyler Michael Smith
cfa134d247
[Bugfix/CI] Fixup benchmark_moe.py ( #12562 )
...
Fixes `is_marlin` not being passed into `get_default_config`
Also allow `--tensor-parallel-size` in addition to `-tp` and `--tp-size`
Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-02-01 13:41:35 +08:00
Divakar Verma
1c1bb0bbf2
[Misc][MoE] add Deepseek-V3 moe tuning support ( #12558 )
...
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
2025-01-30 00:47:30 +00:00
Divakar Verma
2acba47d9b
[bugfix] moe tuning. rm is_navi() ( #12273 )
...
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
2025-01-21 22:47:32 +00:00
Divakar Verma
8027a72461
[ROCm][MoE] moe tuning support for rocm ( #12049 )
...
Signed-off-by: Divakar Verma <divakar.verma@amd.com>
2025-01-17 14:49:16 +08:00
wangshuai09
622b7ab955
[Hardware] using current_platform.seed_everything ( #9785 )
...
Signed-off-by: wangshuai09 <391746016@qq.com>
2024-10-29 14:47:44 +00:00
youkaichao
32176fee73
[torch.compile] support moe models ( #9632 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-10-27 21:58:04 -07:00
Cyrus Leung
6ffa3f314c
[CI/Build] Avoid CUDA initialization ( #8534 )
2024-09-18 10:38:11 +00:00
Mor Zusman
7fc23be81c
[Kernel] W8A16 Int8 inside FusedMoE ( #7415 )
2024-08-16 10:06:51 -07:00
Michael Goin
8065a7e220
[Frontend] Add FlexibleArgumentParser to support both underscore and dash in names ( #5718 )
2024-06-20 17:00:13 -06:00
Cyrus Leung
0e9164b40a
[mypy] Enable type checking for test directory ( #5017 )
2024-06-15 04:45:31 +00:00
Philipp Moritz
51a08e7d8f
[Kernel] Re-tune Mixtral MoE configurations for FP8 on H100 ( #5238 )
2024-06-05 10:59:14 -07:00
Woosuk Kwon
27208be66e
[Kernel] Add back batch size 1536 and 3072 to MoE tuning ( #5242 )
2024-06-04 09:58:47 -07:00
Woosuk Kwon
3a434b07ed
[Kernel] Enhance MoE benchmarking & tuning script ( #4921 )
2024-06-03 20:06:59 -07:00