Kuntai Du
81ede99ca4
[Core] Deprecating block manager v1 and make block manager v2 default ( #8704 )
...
Removing the block manager v1. This is the initial piece of prefix-caching-centric design. In order to achieve prefix-caching-centric design, we need to simplify the code path so that we only use v2 block manager (which has much higher performance on prefix caching).
2024-10-17 11:38:15 -05:00
Junhao Li
5b8a1fde84
[Model][Bugfix] Add FATReLU activation and support for openbmb/MiniCPM-S-1B-sft ( #9396 )
2024-10-16 16:40:24 +00:00
Roger Wang
59230ef32b
[Misc] Consolidate example usage of OpenAI client for multimodal models ( #9412 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-10-16 11:20:51 +00:00
Cyrus Leung
7abba39ee6
[Model] VLM2Vec, the first multimodal embedding model in vLLM ( #9303 )
2024-10-16 14:31:00 +08:00
Michael Goin
8e836d982a
[Doc] Fix code formatting in spec_decode.rst ( #9348 )
2024-10-14 21:29:11 -07:00
Tyler Michael Smith
169b530607
[Bugfix] Clean up some cruft in mamba.py ( #9343 )
2024-10-15 00:24:25 +00:00
Reza Salehi
dfe43a2071
[Model] Molmo vLLM Integration ( #9016 )
...
Co-authored-by: sanghol <sanghol@allenai.org>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-10-14 07:56:24 -07:00
Wallas Henrique
8baf85e4e9
[Doc] Compatibility matrix for mutual exclusive features ( #8512 )
...
Signed-off-by: Wallas Santos <wallashss@ibm.com>
2024-10-11 11:18:50 -07:00
sixgod
6cf1167c1a
[Model] Add GLM-4v support and meet vllm==0.6.2 ( #9242 )
2024-10-11 17:36:13 +00:00
Tyler Michael Smith
7342a7d7f8
[Model] Support Mamba ( #6484 )
2024-10-11 15:40:06 +00:00
Cyrus Leung
e808156f30
[Misc] Collect model support info in a single process per model ( #9233 )
2024-10-11 11:08:11 +00:00
whyiug
04de9057ab
[Model] support input image embedding for minicpmv ( #9237 )
2024-10-10 15:00:47 +00:00
Jiangtao Hu
dc4aea677a
[Doc] Fix VLM prompt placeholder sample bug ( #9170 )
2024-10-09 08:59:42 +00:00
Sayak Paul
1874c6a1b0
[Doc] Update vlm.rst to include an example on videos ( #9155 )
...
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-10-08 18:12:29 +00:00
Cyrus Leung
151ef4efd2
[Model] Support NVLM-D and fix QK Norm in InternViT ( #9045 )
...
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2024-10-07 11:55:12 +00:00
Cyrus Leung
b22b798471
[Model] PP support for embedding models and update docs ( #9090 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
2024-10-06 16:35:27 +08:00
Cyrus Leung
f22619fe96
[Misc] Remove user-facing error for removed VLM args ( #9104 )
2024-10-06 01:33:52 -07:00
Roger Wang
26aa325f4f
[Core][VLM] Test registration for OOT multimodal models ( #8717 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-10-04 10:38:25 -07:00
Cyrus Leung
0e36fd4909
[Misc] Move registry to its own file ( #9064 )
2024-10-04 10:01:37 +00:00
Murali Andoorveedu
0f6d7a9a34
[Models] Add remaining model PP support ( #7168 )
...
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-10-04 10:56:58 +08:00
Nick Hill
18c2e30c57
[Doc] Update Granite model docs ( #9025 )
2024-10-03 02:42:24 +00:00
Cyrus Leung
4f341bd4bf
[Doc] Update list of supported models ( #8987 )
2024-10-02 00:35:39 +08:00
whyiug
e01ab595d8
[Model] support input embeddings for qwen2vl ( #8856 )
2024-09-30 03:16:10 +00:00
Cyrus Leung
3b00b9c26c
[Core] renamePromptInputs
and inputs
( #8876 )
2024-09-26 20:35:15 -07:00
Roger Wang
4bb98f2190
[Misc] Update config loading for Qwen2-VL and remove Granite ( #8837 )
2024-09-26 07:45:30 -07:00
Roger Wang
e2c6e0a829
[Doc] Update doc for Transformers 4.45 ( #8817 )
2024-09-25 13:29:48 -07:00
Chen Zhang
770ec6024f
[Model] Add support for the multi-modal Llama 3.2 model ( #8811 )
...
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Chang Su <chang.s.su@oracle.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-09-25 13:29:32 -07:00
Simon Mo
4f1ba0844b
Revert "rename PromptInputs and inputs with backward compatibility ( #8760 ) ( #8810 )
2024-09-25 10:36:26 -07:00
Cyrus Leung
28e1299e60
rename PromptInputs and inputs with backward compatibility ( #8760 )
2024-09-25 09:36:47 -07:00
Simon Mo
3185fb0cca
Revert "[Core] Rename PromptInputs
to PromptType
, and inputs
to prompt
" ( #8750 )
2024-09-24 05:45:20 +00:00
litianjian
5b59532760
[Model][VLM] Add LLaVA-Onevision model support ( #8486 )
...
Co-authored-by: litianjian <litianjian@bytedance.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-09-22 10:51:44 -07:00
Cyrus Leung
0057894ef7
[Core] Rename PromptInputs
and inputs
( #8673 )
2024-09-20 19:00:54 -07:00
Niklas Muennighoff
3b63de9353
[Model] Add OLMoE ( #7922 )
2024-09-20 09:31:41 -07:00
Jiaxin Shan
260d40b5ea
[Core] Support Lora lineage and base model metadata management ( #6315 )
2024-09-20 06:20:56 +00:00
Geun, Lim
e18749ff09
[Model] Support Solar Model ( #8386 )
...
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-09-18 11:04:00 -06:00
ywfang
8a0cf1ddc3
[Model] support minicpm3 ( #8297 )
...
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-09-14 14:50:26 +00:00
Cyrus Leung
a84e598e21
[CI/Build] Reorganize models tests ( #7820 )
2024-09-13 10:20:06 -07:00
Alex Brooks
c6202daeed
[Model] Support multiple images for qwen-vl ( #8247 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-09-12 10:10:54 -07:00
Patrick von Platen
d394787e52
Pixtral ( #8377 )
...
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-09-11 14:41:55 -07:00
Yang Fan
3b7fea770f
[Model][VLM] Add Qwen2-VL model support ( #7905 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-09-11 09:31:19 -07:00
Yangshen⚡Deng
6a512a00df
[model] Support for Llava-Next-Video model ( #7559 )
...
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-09-10 22:21:36 -07:00
Isotr0py
e807125936
[Model][VLM] Support multi-images inputs for InternVL2 models ( #8201 )
2024-09-07 16:38:23 +08:00
Cyrus Leung
2f707fcb35
[Model] Multi-input support for LLaVA ( #8238 )
2024-09-07 02:57:24 +00:00
Jiaxin Shan
db3bf7c991
[Core] Support load and unload LoRA in api server ( #6566 )
...
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
2024-09-05 18:10:33 -07:00
sroy745
2febcf2777
[Documentation][Spec Decode] Add documentation about lossless guarantees in Speculative Decoding in vLLM ( #7962 )
2024-09-05 16:25:29 -04:00
Alex Brooks
9da25a88aa
[MODEL] Qwen Multimodal Support (Qwen-VL / Qwen-VL-Chat) ( #8029 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-09-05 12:48:10 +00:00
Cyrus Leung
288a938872
[Doc] Indicate more information about supported modalities ( #8181 )
2024-09-05 10:51:53 +00:00
Wenxiang
1248e8506a
[Model] Adding support for MSFT Phi-3.5-MoE ( #7729 )
...
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Zeqi Lin <zelin@microsoft.com>
Co-authored-by: Zeqi Lin <Zeqi.Lin@microsoft.com>
2024-08-30 13:42:57 -06:00
Yohan Na
dc13e99348
[MODEL] add Exaone model support ( #7819 )
2024-08-29 23:34:20 -07:00
Peter Salas
57792ed469
[Doc] Fix incorrect docs from #7615 ( #7788 )
2024-08-22 10:02:06 -07:00