20231088/vllm - vllm - Luminance Code Repo


Author	SHA1	Message	Date
Ilya Lavrenov	57f09a419c	[Hardware][Intel] OpenVINO vLLM backend (#5379)	2024-06-28 13:50:16 +00:00
Cyrus Leung	5cbe8d155c	[Core] Registry for processing model inputs (#5214) Co-authored-by: ywang96 <ywang@roblox.com>	2024-06-28 12:09:56 +00:00
Woosuk Kwon	79c92c7c8a	[Model] Add Gemma 2 (#5908)	2024-06-27 13:33:56 -07:00
youkaichao	3fd02bda51	[doc][misc] add note for Kubernetes users (#5916)	2024-06-27 10:07:07 -07:00
Cyrus Leung	96354d6a29	[Model] Add base class for LoRA-supported models (#5018)	2024-06-27 16:03:04 +08:00
youkaichao	294104c3f9	[doc] update usage of env var to avoid conflict (#5873)	2024-06-26 17:57:12 -04:00
Roger Wang	3aa7b6cf66	[Misc][Doc] Add Example of using OpenAI Server with VLM (#5832)	2024-06-25 20:34:25 -07:00
Matt Wong	dd793d1de5	[Hardware][AMD][CI/Build][Doc] Upgrade to ROCm 6.1, Dockerfile improvements, test fixes (#5422)	2024-06-25 15:56:15 -07:00
youkaichao	c18ebfdd71	[doc][distributed] add both gloo and nccl tests (#5834)	2024-06-25 15:10:28 -04:00
Cyrus Leung	f23871e9ee	[Doc] Add notice about breaking changes to VLMs (#5818)	2024-06-25 01:25:03 -07:00
Michael Goin	1744cc99ba	[Doc] Add Phi-3-medium to list of supported models (#5788)	2024-06-24 10:48:55 -07:00
Michael Goin	e72dc6cb35	[Doc] Add "Suggest edit" button to doc pages (#5789)	2024-06-24 10:26:17 -07:00
youkaichao	c246212952	[doc][faq] add warning to download models for every nodes (#5783)	2024-06-24 15:37:42 +08:00
Woosuk Kwon	8c00f9c15d	[Docs][TPU] Add installation tip for TPU (#5761)	2024-06-21 23:09:40 -07:00
Michael Goin	5b15bde539	[Doc] Documentation on supported hardware for quantization methods (#5745)	2024-06-21 12:44:29 -04:00
Roger Wang	1b2eaac316	[Bugfix][Doc] FIx Duplicate Explicit Target Name Errors (#5703)	2024-06-19 23:10:47 -07:00
Rafael Vasquez	e83db9e7e3	[Doc] Update docker references (#5614) Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>	2024-06-19 15:01:45 -07:00
milo157	2bd231a7b7	[Doc] Added cerebrium as Integration option (#5553)	2024-06-18 15:56:59 -07:00
Isotr0py	daef218b55	[Model] Initialize Phi-3-vision support (#4986)	2024-06-17 19:34:33 -07:00
Kunshang Ji	728c4c8a06	[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (#3814) Co-authored-by: Jiang Li <jiang1.li@intel.com> Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com> Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>	2024-06-17 11:01:25 -07:00
youkaichao	845a3f26f9	[Doc] add debugging tips for crash and multi-node debugging (#5581)	2024-06-17 10:08:01 +08:00
Sanger Steel	6e2527a7cb	[Doc] Update documentation on Tensorizer (#5471)	2024-06-14 11:27:57 -07:00
Simon Mo	cdab68dcdb	[Docs] Add ZhenFund as a Sponsor (#5548)	2024-06-14 11:17:21 -07:00
Cyrus Leung	0ce7b952f8	[Doc] Update LLaVA docs (#5437) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-06-13 11:22:07 -07:00
Woosuk Kwon	a65634d3ae	[Docs] Add 4th meetup slides (#5509)	2024-06-13 10:18:26 -07:00
Li, Jiang	80aa7e91fc	[Hardware][Intel] Optimize CPU backend and add more performance tips (#4971) Co-authored-by: Jianan Gu <jianan.gu@intel.com>	2024-06-13 09:33:14 -07:00
Cyrus Leung	b8d4dfff9c	[Doc] Update debug docs (#5438)	2024-06-12 14:49:31 -07:00
Woosuk Kwon	1a8bfd92d5	[Hardware] Initial TPU integration (#5292)	2024-06-12 11:53:03 -07:00
youkaichao	8f89d72090	[Doc] add common case for long waiting time (#5430)	2024-06-11 11:12:13 -07:00
Nick Hill	99dac099ab	[Core][Doc] Default to multiprocessing for single-node distributed case (#5230) Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>	2024-06-11 11:10:41 -07:00
Cade Daniel	89ec06c33b	[Docs] [Spec decode] Fix docs error in code example (#5427)	2024-06-11 10:31:56 -07:00
Kuntai Du	9fde251bf0	[Doc] Add an automatic prefix caching section in vllm documentation (#5324) Co-authored-by: simon-mo <simon.mo@hey.com>	2024-06-11 10:24:59 -07:00
Cade Daniel	4c2ffb28ff	[Speculative decoding] Initial spec decode docs (#5400)	2024-06-11 10:15:40 -07:00
SangBin Cho	246598a6b1	[CI] docfix (#5410) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: ywang96 <ywang@roblox.com>	2024-06-11 01:28:50 -07:00
Roger Wang	3c4cebf751	[Doc][Typo] Fixing Missing Comma (#5403)	2024-06-11 00:20:28 -07:00
youkaichao	d8f31f2f8b	[Doc] add debugging tips (#5409)	2024-06-10 23:21:43 -07:00
Michael Goin	77c87beb06	[Doc] Add documentation for FP8 W8A8 (#5388)	2024-06-10 18:55:12 -06:00
Woosuk Kwon	cb77ad836f	[Docs] Alphabetically sort sponsors (#5386)	2024-06-10 15:17:19 -05:00
Roger Wang	856c990041	[Docs] Add Docs on Limitations of VLM Support (#5383)	2024-06-10 09:53:50 -07:00
Cyrus Leung	6b29d6fe70	[Model] Initial support for LLaVA-NeXT (#4199) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-06-10 12:47:15 +00:00
Roger Wang	7a9cb294ae	[Frontend] Add OpenAI Vision API Support (#5237) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-06-07 11:23:32 -07:00
Simon Mo	f270a39537	[Docs] Add Sequoia as sponsors (#5287)	2024-06-05 18:02:56 +00:00
Jie Fu (傅杰)	87d5abef75	[Bugfix] Fix a bug caused by pip install setuptools>=49.4.0 for CPU backend (#5249)	2024-06-04 09:57:51 -07:00
Breno Faria	f775a07e30	[FRONTEND] OpenAI `tools` support named functions (#5032)	2024-06-03 18:25:29 -05:00
Cyrus Leung	7a64d24aad	[Core] Support image processor (#4197)	2024-06-02 22:56:41 -07:00
Nick Hill	657579113f	[Doc] Add checkmark for GPTBigCodeForCausalLM LoRA support (#5171)	2024-05-31 17:20:19 -07:00
Chansung Park	429d89720e	add doc about serving option on dstack (#3074) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-05-30 10:11:07 -07:00
Cyrus Leung	a9bcc7afb2	[Doc] Use intersphinx and update entrypoints docs (#5125)	2024-05-30 09:59:23 -07:00
youkaichao	4fbcb0f27e	[Doc][Build] update after removing vllm-nccl (#5103) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-05-29 23:51:18 +00:00
Cyrus Leung	5ae5ed1e60	[Core] Consolidate prompt arguments to LLM engines (#4328) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-05-28 13:29:31 -07:00


215 Commits