20231088/vllm - vllm - Luminance Code Repo

20231088/vllm

Author	SHA1	Message	Date
Roger Wang	6206dcb29e	[Model] Add PaliGemma (#5189 ) Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-07-07 09:25:50 +08:00
Cyrus Leung	9389380015	[Doc] Move guide for multimodal model and other improvements (#6168 )	2024-07-06 17:18:59 +08:00
Roger Wang	175c43eca4	[Doc] Reorganize Supported Models by Type (#6167 )	2024-07-06 05:59:36 +00:00
Cyrus Leung	ae96ef8fbd	[VLM] Calculate maximum number of multi-modal tokens by model (#6121 )	2024-07-04 16:37:23 -07:00
xwjiang2010	d9e98f42e4	[vlm] Remove vision language config. (#6089 ) Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-07-03 22:14:16 +00:00
Cyrus Leung	9831aec49f	[Core] Dynamic image size support for VLMs (#5276 ) Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com> Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com> Co-authored-by: ywang96 <ywang@roblox.com> Co-authored-by: xwjiang2010 <87673679+xwjiang2010@users.noreply.github.com> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-07-02 20:34:00 -07:00
Mor Zusman	9d6a8daa87	[Model] Jamba support (#4115 ) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> Co-authored-by: Erez Schwartz <erezs@ai21.com> Co-authored-by: Mor Zusman <morz@ai21.com> Co-authored-by: tomeras91 <57313761+tomeras91@users.noreply.github.com> Co-authored-by: Tomer Asida <tomera@ai21.com> Co-authored-by: Zhuohan Li <zhuohan123@gmail.com> Co-authored-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>	2024-07-02 23:11:29 +00:00
xwjiang2010	98d6682cd1	[VLM] Remove `image_input_type` from VLM config (#5852 ) Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Roger Wang <ywang@roblox.com>	2024-07-02 07:57:09 +00:00
Cyrus Leung	5cbe8d155c	[Core] Registry for processing model inputs (#5214 ) Co-authored-by: ywang96 <ywang@roblox.com>	2024-06-28 12:09:56 +00:00
Woosuk Kwon	79c92c7c8a	[Model] Add Gemma 2 (#5908 )	2024-06-27 13:33:56 -07:00
Cyrus Leung	96354d6a29	[Model] Add base class for LoRA-supported models (#5018 )	2024-06-27 16:03:04 +08:00
Roger Wang	3aa7b6cf66	[Misc][Doc] Add Example of using OpenAI Server with VLM (#5832 )	2024-06-25 20:34:25 -07:00
Cyrus Leung	f23871e9ee	[Doc] Add notice about breaking changes to VLMs (#5818 )	2024-06-25 01:25:03 -07:00
Michael Goin	1744cc99ba	[Doc] Add Phi-3-medium to list of supported models (#5788 )	2024-06-24 10:48:55 -07:00
Isotr0py	daef218b55	[Model] Initialize Phi-3-vision support (#4986 )	2024-06-17 19:34:33 -07:00
Cyrus Leung	0ce7b952f8	[Doc] Update LLaVA docs (#5437 ) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-06-13 11:22:07 -07:00
Cade Daniel	89ec06c33b	[Docs] [Spec decode] Fix docs error in code example (#5427 )	2024-06-11 10:31:56 -07:00
Cade Daniel	4c2ffb28ff	[Speculative decoding] Initial spec decode docs (#5400 )	2024-06-11 10:15:40 -07:00
SangBin Cho	246598a6b1	[CI] docfix (#5410 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: ywang96 <ywang@roblox.com>	2024-06-11 01:28:50 -07:00
Roger Wang	856c990041	[Docs] Add Docs on Limitations of VLM Support (#5383 )	2024-06-10 09:53:50 -07:00
Cyrus Leung	6b29d6fe70	[Model] Initial support for LLaVA-NeXT (#4199 ) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-06-10 12:47:15 +00:00
Roger Wang	7a9cb294ae	[Frontend] Add OpenAI Vision API Support (#5237 ) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-06-07 11:23:32 -07:00
Cyrus Leung	7a64d24aad	[Core] Support image processor (#4197 )	2024-06-02 22:56:41 -07:00
Nick Hill	657579113f	[Doc] Add checkmark for GPTBigCodeForCausalLM LoRA support (#5171 )	2024-05-31 17:20:19 -07:00
Eric Xihui Lin	8e192ff967	[Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799 ) Co-authored-by: beagleski <yunanzhang@microsoft.com> Co-authored-by: bapatra <bapatra@microsoft.com> Co-authored-by: Barun Patra <codedecde@users.noreply.github.com> Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-05-24 22:00:52 -07:00
Isotr0py	f12c3b5b3d	[Model] Add Phi-2 LoRA support (#4886 )	2024-05-21 14:24:17 +09:00
Zhuohan Li	ac1fbf7fd2	[Doc] Shorten README by removing supported model list (#4796 )	2024-05-13 16:23:54 -07:00
SangBin Cho	e7c46b9527	[Scheduler] Warning upon preemption and Swapping (#4647 ) Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>	2024-05-13 23:50:44 +09:00
Simon Mo	51d4094fda	chunked-prefill-doc-syntax (#4603 ) Fix the docs: https://docs.vllm.ai/en/latest/models/performance.html Co-authored-by: sang <rkooo567@gmail.com>	2024-05-10 14:13:23 +09:00
SangBin Cho	36fb68f947	[Doc] Chunked Prefill Documentation (#4580 )	2024-05-04 00:18:00 -07:00
Isotr0py	fbf152d976	[Bugfix][Model] Refactor OLMo model to support new HF format in transformers 4.40.0 (#4324 ) Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2024-04-25 09:35:56 -07:00
Caio Mendes	96e90fdeb3	[Model] Adds Phi-3 support (#4298 )	2024-04-25 03:06:57 +00:00
xiaoji	7f2593b164	[Doc]: Update the doc of adding new models (#4236 )	2024-04-21 09:57:08 -07:00
Harry Mellor	fe7d648fe5	Don't show default value for flags in `EngineArgs` (#4223 ) Co-authored-by: Harry Mellor <hmellor@oxts.com>	2024-04-21 09:15:28 -07:00
Harry Mellor	682789d402	Fix missing docs and out of sync `EngineArgs` (#4219 ) Co-authored-by: Harry Mellor <hmellor@oxts.com>	2024-04-19 20:51:33 -07:00
Simon Mo	705578ae14	[Docs] document that Meta Llama 3 is supported (#4175 )	2024-04-18 10:55:48 -07:00
Sanger Steel	d619ae2d19	[Doc] Add better clarity for tensorizer usage (#4090 ) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-04-15 13:28:25 -07:00
Simon Mo	aceb17cf2d	[Docs] document that mixtral 8x22b is supported (#4073 )	2024-04-14 14:35:55 -07:00
Sanger Steel	711a000255	[Frontend] [Core] feat: Add model loading using `tensorizer` (#3476 )	2024-04-13 17:13:01 -07:00
youkaichao	e35397468f	[Doc] Add doc to state our model support policy (#3948 ) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-04-10 17:03:02 +00:00
ywfang	b4543c8f6b	[Model] add minicpm (#3893 )	2024-04-08 18:28:36 +08:00
youkaichao	95baec828f	[Core] enable out-of-tree model register (#3871 )	2024-04-06 17:11:41 -07:00
Sean Gallen	78107fa091	[Doc]Add asynchronous engine arguments to documentation. (#3810 ) Co-authored-by: Simon Mo <simon.mo@hey.com> Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-04-04 21:52:01 -07:00
wenyujin333	d6ea427f04	[Model] Add support for Qwen2MoeModel (#3346 )	2024-03-28 15:19:59 +00:00
Woosuk Kwon	6d9aa00fc4	[Docs] Add Command-R to supported models (#3669 )	2024-03-27 15:20:00 -07:00
Megha Agarwal	e24336b5a7	[Model] Add support for DBRX (#3660 )	2024-03-27 13:01:46 -07:00
Woosuk Kwon	e66b629c04	[Misc] Minor fix in KVCache type (#3652 )	2024-03-26 23:14:06 -07:00
Jee Li	76879342a3	[Doc]add lora support (#3649 )	2024-03-27 02:06:46 +00:00
Lalit Pradhan	4c07dd28c0	[🚀 Ready to be merged] Added support for Jais models (#3183 )	2024-03-21 09:45:24 +00:00
Simon Mo	ef65dcfa6f	[Doc] Add docs about OpenAI compatible server (#3288 )	2024-03-18 22:05:34 -07:00

... 3 4 5 6 7

302 Commits