20231088/vllm - vllm - Luminance Code Repo

Author	SHA1	Message	Date
Tyler Michael Smith	28b3a1c7e5	[V1] Multiprocessing Tensor Parallel Support for v1 (#9856) Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>	2024-12-10 06:28:14 +00:00
youkaichao	1b62745b1d	[core][executor] simplify instance id (#10976) Signed-off-by: youkaichao <youkaichao@gmail.com>	2024-12-07 09:33:45 -08:00
youkaichao	a111d0151f	[platforms] absorb worker cls difference into platforms folder (#10555) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: Nick Hill <nhill@redhat.com>	2024-11-21 21:00:32 -08:00
Mengqing Cao	7371749d54	[Misc] Fix ImportError causing by triton (#9493)	2024-11-08 05:08:51 +00:00
Russell Bryant	d1537039ce	[Core] Improve choice of Python multiprocessing method (#8823) Signed-off-by: Russell Bryant <rbryant@redhat.com> Co-authored-by: youkaichao <youkaichao@126.com>	2024-09-29 09:17:07 +08:00
Nick Hill	acd5511b6d	[BugFix] Fix clean shutdown issues (#8492)	2024-09-16 09:33:46 -07:00
afeldman-nm	428dd1445e	[Core] Logprobs support in Multi-step (#7652)	2024-08-29 19:19:08 -07:00
youkaichao	f52a43a8b9	[ci][test] fix pp test failure (#7945)	2024-08-28 01:27:07 -07:00
Kunshang Ji	076169f603	[Hardware][Intel GPU] Add intel GPU pipeline parallel support. (#7810)	2024-08-27 10:07:02 -07:00
youkaichao	660dea1235	[cuda][misc] remove error_on_invalid_device_count_status (#7069)	2024-08-02 00:14:21 -07:00
Travis Johnson	593e79e733	[Bugfix] torch.set_num_threads() in multiproc_gpu_executor (#6802) [Bugfix] Use torch.set_num_threads() to configure parallelism in multiproc_gpu_executor (#6802) Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>	2024-07-26 22:15:20 -07:00
Anthony Platanios	084a01fd35	[Bugfix] [Easy] Fixed a bug in the multiprocessing GPU executor. (#6770)	2024-07-25 21:25:35 -07:00
Antoni Baum	7bd82002ae	[Core] Allow specifying custom Executor (#6557)	2024-07-20 01:25:06 +00:00
Nick Hill	b5672a112c	[Core] Multiprocessing Pipeline Parallel support (#6130) Co-authored-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai>	2024-07-18 19:15:52 -07:00
youkaichao	09c2eb85dd	[ci][distributed] add pipeline parallel correctness test (#6410)	2024-07-16 15:44:22 -07:00
Thomas Parnell	eaec4b9153	[Bugfix] Add custom Triton cache manager to resolve MoE MP issue (#6140) Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com> Co-authored-by: Chih-Chieh-Yang <chih.chieh.yang@ibm.com>	2024-07-15 10:12:47 -07:00
Travis Johnson	1dab9bc8a9	[Bugfix] set OMP_NUM_THREADS to 1 by default for multiprocessing (#6109) Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com> Co-authored-by: Nick Hill <nickhill@us.ibm.com>	2024-07-03 16:56:59 -07:00
youkaichao	f666207161	[misc][distributed] error on invalid state (#6092)	2024-07-02 23:37:29 -07:00
Murali Andoorveedu	c5832d2ae9	[Core] Pipeline Parallel Support (#4412) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>	2024-07-02 10:58:08 -07:00
Stephanie Wang	dda4811591	[Core] Refactor Worker and ModelRunner to consolidate control plane communication (#5408) Signed-off-by: Stephanie Wang <swang@cs.berkeley.edu> Signed-off-by: Stephanie <swang@anyscale.com> Co-authored-by: Stephanie <swang@anyscale.com>	2024-06-25 20:30:03 -07:00
Matt Wong	dd793d1de5	[Hardware][AMD][CI/Build][Doc] Upgrade to ROCm 6.1, Dockerfile improvements, test fixes (#5422)	2024-06-25 15:56:15 -07:00
youkaichao	3eea74889f	[misc][distributed] use 127.0.0.1 for single-node (#5619)	2024-06-19 08:05:00 +00:00
Antoni Baum	50eed24d25	Add `cuda_device_count_stateless` (#5473)	2024-06-13 16:06:49 -07:00
Nick Hill	99dac099ab	[Core][Doc] Default to multiprocessing for single-node distributed case (#5230) Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>	2024-06-11 11:10:41 -07:00
Junichi Sato	2e02311a1b	[Bugfix] Fix `MultiprocessingGPUExecutor.check_health` when world_size == 1 (#5254)	2024-06-11 10:38:07 -07:00
zifeitong	a58f24e590	[Bugfix] Fix torch.compile() error when using MultiprocessingGPUExecutor (#5229)	2024-06-03 20:55:50 -07:00
Nick Hill	eb6d3c264d	[Core] Eliminate parallel worker per-step task scheduling overhead (#4894)	2024-05-23 06:17:27 +09:00
Nick Hill	676a99982f	[Core] Add MultiprocessingGPUExecutor (#4539) Co-authored-by: SAHIL SUNEJA <suneja@us.ibm.com>	2024-05-14 10:38:59 -07:00

28 Commits