youkaichao
|
cc466a3290
|
[Core][Distributed] support cpu&device in broadcast tensor dict (#4660)
[Core][Distributed] support both cpu and device tensor in broadcast tensor dict (#4660)
|
2024-05-07 19:34:47 -07:00 |
|
youkaichao
|
63e7176f26
|
[Core][Refactor] move parallel_utils into vllm/distributed (#3950)
[WIP][Core][Refactor] move vllm/model_executor/parallel_utils into vllm/distributed and vllm/device_communicators (#3950)
|
2024-04-10 15:33:30 -07:00 |
|
youkaichao
|
756b30a5f3
|
[Core][Test] move local_rank to the last arg with default value(#3711)
[Core][Test] move local_rank to the last arg with default value to keep api compatible (#3711)
|
2024-03-28 21:19:45 -07:00 |
|
SangBin Cho
|
26422e477b
|
[Test] Make model tests run again and remove --forked from pytest (#3631)
Co-authored-by: Simon Mo <simon.mo@hey.com>
|
2024-03-28 21:06:40 -07:00 |
|
Roy
|
515386ef3c
|
[Core] Support multi-node inference(eager and cuda graph) (#3686)
|
2024-03-28 15:01:55 -07:00 |
|
youkaichao
|
8f44facddd
|
[Core] remove cupy dependency (#3625)
|
2024-03-27 00:33:26 -07:00 |
|
SangBin Cho
|
01bfb22b41
|
[CI] Try introducing isort. (#3495)
|
2024-03-25 07:59:47 -07:00 |
|
Hanzhi Zhou
|
380170038e
|
Implement custom all reduce kernels (#2192)
|
2024-01-27 12:46:35 -08:00 |
|
Zhuohan Li
|
ef9b636e2d
|
Simplify broadcast logic for control messages (#2501)
|
2024-01-19 11:23:30 -08:00 |
|
Simon Mo
|
6e01e8c1c8
|
[CI] Add Buildkite (#2355)
|
2024-01-14 12:37:58 -08:00 |
|
Zhuohan Li
|
358c328d69
|
[BUGFIX] Fix communication test (#2285)
|
2023-12-27 17:18:11 -05:00 |
|
Zhuohan Li
|
20d0699d49
|
[Fix] Fix comm test (#1691)
|
2023-11-16 16:28:39 -08:00 |
|
Zhuohan Li
|
ba0bfd40e2
|
TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181)
|
2023-10-02 15:36:09 -07:00 |
|