7 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Cyrus Leung | 5bf35a91e4 | [Doc][CI/Build] Update docs and tests to use vllm serve (#6431) | 2024-07-17 07:43:21 +00:00 |
| youkaichao | 94b82e8c18 | [doc][distributed] add suggestion for distributed inference (#6418) | 2024-07-15 09:45:51 -07:00 |
| Murali Andoorveedu | 673dd4cae9 | [Docs] Docs update for Pipeline Parallel (#6222); Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>; Co-authored-by: Simon Mo <simon.mo@hey.com> | 2024-07-09 16:24:58 -07:00 |
| youkaichao | 4050d646e5 | [doc][misc] remove deprecated api server in doc (#6037) | 2024-07-01 12:52:43 -04:00 |
| youkaichao | c246212952 | [doc][faq] add warning to download models for every nodes (#5783) | 2024-06-24 15:37:42 +08:00 |
| Nick Hill | 99dac099ab | [Core][Doc] Default to multiprocessing for single-node distributed case (#5230); Co-authored-by: Antoni Baum <antoni.baum@protonmail.com> | 2024-06-11 11:10:41 -07:00 |
| Zhuohan Li | 2cf1a333b6 | [Doc] Documentation for distributed inference (#261) | 2023-06-26 11:34:23 -07:00 |