Guillaume Calmettes
36c513a076
[BugFix] Do not raise a ValueError
when tool_choice
is set to the supported none
option and tools
are not defined. ( #10000 )
...
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
2024-11-12 11:13:46 +00:00
youkaichao
3a28f18b0b
[doc] explain the class hierarchy in vLLM ( #10240 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-11 22:56:44 -08:00
youkaichao
d1c6799b88
[doc] update debugging guide ( #10236 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-11 15:21:12 -08:00
Yuan Tang
4800339c62
Add docs on serving with Llama Stack ( #10183 )
...
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: Russell Bryant <rbryant@redhat.com>
2024-11-11 11:28:55 -08:00
youkaichao
f0f2e5638e
[doc] improve debugging code ( #10206 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-10 17:49:40 -08:00
Shawn Du
20cf2f553c
[Misc] small fixes to function tracing file path ( #9543 )
...
Signed-off-by: Shawn Du <shawnd200@outlook.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2024-11-10 15:21:06 -08:00
Yongzao
bfb7d61a7c
[doc] Polish the integration with huggingface doc ( #10195 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2024-11-10 10:22:04 -08:00
youkaichao
9fa4bdde9d
[ci][build] limit cmake version ( #10188 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-09 16:27:26 -08:00
cjackal
d88bff1b96
[Frontend] add add_request_id
middleware ( #9594 )
...
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
2024-11-09 10:18:29 +00:00
youkaichao
8a4358ecb5
[doc] explaining the integration with huggingface ( #10173 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-09 01:02:54 -08:00
Cyrus Leung
49d2a41a86
[Doc] Adjust RunLLM location ( #10176 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-08 20:07:10 -08:00
Cyrus Leung
e0191a95d8
[0/N] Rename MultiModalInputs
to MultiModalKwargs
( #10040 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-09 11:31:02 +08:00
Rafael Vasquez
6b30471586
[Misc] Improve Web UI ( #10090 )
...
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
2024-11-08 09:51:04 -08:00
Russell Bryant
3a7f15a398
[Doc] Move CONTRIBUTING to docs site ( #9924 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-11-08 05:15:12 +00:00
whyiug
40d0e7411d
[Doc] Update FAQ links in spec_decode.rst ( #9662 )
...
Signed-off-by: whyiug <whyiug@hotmail.com>
2024-11-08 04:44:58 +00:00
litianjian
28b2877d30
Online video support for VLMs ( #10020 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: litianjian <litianjian@bytedance.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-07 20:25:59 +00:00
Maximilien de Bayser
ae62fd17c0
[Frontend] Tool calling parser for Granite 3.0 models ( #9027 )
...
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2024-11-07 07:09:02 -08:00
Rafael Vasquez
d7263a1bb8
Doc: Improve benchmark documentation ( #9927 )
...
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
2024-11-06 23:50:35 -08:00
Cyrus Leung
db7db4aab9
[Misc] Consolidate ModelConfig code related to HF config ( #10104 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-07 06:00:21 +00:00
youkaichao
e7b84c394d
[doc] add back Python 3.8 ABI ( #10100 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-06 21:06:41 -08:00
Li, Jiang
a4b3e0c1e9
[Hardware][CPU] Update torch 2.5 ( #9911 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2024-11-07 04:43:08 +00:00
Russell Bryant
098f94de42
[CI/Build] Drop Python 3.8 support ( #10038 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-06 14:31:01 +00:00
Eric
406d4cc480
[Model][LoRA]LoRA support added for Qwen2VLForConditionalGeneration ( #10022 )
...
Signed-off-by: ericperfect <ericperfectttt@gmail.com>
2024-11-06 14:13:15 +00:00
Jee Jee Li
a5bba7d234
[Model] Add Idefics3 support ( #9767 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Signed-off-by: B-201 <Joy25810@foxmail.com>
Co-authored-by: B-201 <Joy25810@foxmail.com>
2024-11-06 11:41:17 +00:00
Jee Jee Li
2003cc3513
[Model][LoRA]LoRA support added for LlamaEmbeddingModel ( #10071 )
...
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2024-11-06 09:49:19 +00:00
Konrad Zawora
a02a50e6e5
[Hardware][Intel-Gaudi] Add Intel Gaudi (HPU) inference backend ( #6143 )
...
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Bob Zhu <bob.zhu@intel.com>
Signed-off-by: zehao-intel <zehao.huang@intel.com>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>
Co-authored-by: Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by: Marceli Fylcek <mfylcek@habana.ai>
Co-authored-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Co-authored-by: Vivek Goel <vgoel@habana.ai>
Co-authored-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Dominika Olszewska <dolszewska@habana.ai>
Co-authored-by: barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com>
Co-authored-by: Michal Szutenberg <37601244+szutenberg@users.noreply.github.com>
Co-authored-by: Jan Kaniecki <jkaniecki@habana.ai>
Co-authored-by: Agata Dobrzyniewicz <160237065+adobrzyniewicz-habana@users.noreply.github.com>
Co-authored-by: Krzysztof Wisniewski <kwisniewski@habana.ai>
Co-authored-by: Dudi Lester <160421192+dudilester@users.noreply.github.com>
Co-authored-by: Ilia Taraban <tarabanil@gmail.com>
Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Michał Kuligowski <mkuligowski@habana.ai>
Co-authored-by: Jakub Maksymczuk <jmaksymczuk@habana.ai>
Co-authored-by: Tomasz Zielinski <85164140+tzielinski-habana@users.noreply.github.com>
Co-authored-by: Sun Choi <schoi@habana.ai>
Co-authored-by: Iryna Boiko <iboiko@habana.ai>
Co-authored-by: Bob Zhu <41610754+czhu15@users.noreply.github.com>
Co-authored-by: hlin99 <73271530+hlin99@users.noreply.github.com>
Co-authored-by: Zehao Huang <zehao.huang@intel.com>
Co-authored-by: Andrzej Kotłowski <Andrzej.Kotlowski@intel.com>
Co-authored-by: Yan Tomsinsky <73292515+Yantom1@users.noreply.github.com>
Co-authored-by: Nir David <ndavid@habana.ai>
Co-authored-by: Yu-Zhou <yu.zhou@intel.com>
Co-authored-by: Ruheena Suhani Shaik <rsshaik@habana.ai>
Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>
Co-authored-by: Marcin Swiniarski <mswiniarski@habana.ai>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
Co-authored-by: Jacek Czaja <jczaja@habana.ai>
Co-authored-by: Yuan <yuan.zhou@outlook.com>
2024-11-06 01:09:10 -08:00
Aaron Pham
21063c11c7
[CI/Build] drop support for Python 3.8 EOL ( #8464 )
...
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2024-11-06 07:11:55 +00:00
Richard Liu
cd34029e91
Refactor TPU requirements file and pin build dependencies ( #10010 )
...
Signed-off-by: Richard Liu <ricliu@google.com>
2024-11-05 16:48:44 +00:00
Roger Wang
6e056bcf04
[Doc] Update VLM doc about loading from local files ( #9999 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
2024-11-04 19:47:11 +00:00
shanshan wang
54597724f4
[Model] Add support for H2OVL-Mississippi models ( #9747 )
...
Signed-off-by: Shanshan Wang <shanshan.wang@h2o.ai>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-11-04 00:15:36 +00:00
Michael Green
1d4cfe2be1
[Doc] Updated tpu-installation.rst with more details ( #9926 )
...
Signed-off-by: Michael Green <mikegre@google.com>
2024-11-02 10:06:45 -04:00
Nick Hill
eed92f12fc
[Docs] Update Granite 3.0 models in supported models table ( #9930 )
...
Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-11-02 09:02:18 +00:00
Cyrus Leung
ba0d892074
[Frontend] Use a proper chat template for VLM2Vec ( #9912 )
2024-11-01 14:09:07 +00:00
Cyrus Leung
06386a64dd
[Frontend] Chat-based Embeddings API ( #9759 )
2024-11-01 08:13:35 +00:00
Cyrus Leung
d3aa2a8b2f
[Doc] Update multi-input support ( #9906 )
2024-11-01 07:34:49 +00:00
Yongzao
2b5bf20988
[torch.compile] Adding torch compile annotations to some models ( #9876 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2024-11-01 00:25:47 -07:00
Joe Runde
031a7995f3
[Bugfix][Frontend] Reject guided decoding in multistep mode ( #9892 )
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-11-01 01:09:46 +00:00
Jee Jee Li
5608e611c2
[Doc] Update Qwen documentation ( #9869 )
2024-10-31 08:54:18 +00:00
Guillaume Calmettes
abbfb6134d
[Misc][OpenAI] deprecate max_tokens in favor of new max_completion_tokens field for chat completion endpoint ( #9837 )
2024-10-30 18:15:56 -07:00
youkaichao
c2cd1a2142
[doc] update pp support ( #9853 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-10-30 13:36:51 -07:00
Joe Runde
33d257735f
[Doc] link bug for multistep guided decoding ( #9843 )
...
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-10-30 17:28:29 +00:00
Woosuk Kwon
211fe91aa8
[TPU] Correctly profile peak memory usage & Upgrade PyTorch XLA ( #9438 )
2024-10-30 09:41:38 +00:00
Yan Ma
04a3ae0aca
[Bugfix] Fix multi nodes TP+PP for XPU ( #8884 )
...
Signed-off-by: YiSheng5 <syhm@mail.ustc.edu.cn>
Signed-off-by: yan ma <yan.ma@intel.com>
Co-authored-by: YiSheng5 <syhm@mail.ustc.edu.cn>
2024-10-29 21:34:45 -07:00
Will Eaton
882a1ad0de
[Model] tool calling support for ibm-granite/granite-20b-functioncalling ( #8339 )
...
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Maximilien de Bayser <maxdebayser@gmail.com>
2024-10-29 15:07:37 -07:00
Russell Bryant
c5d7fb9ddc
[Doc] fix third-party model example ( #9771 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2024-10-28 19:39:21 -07:00
kakao-kevin-us
6650e6a930
[Model] Add classification Task with Qwen2ForSequenceClassification ( #9704 )
...
Signed-off-by: Kevin-Yang <ykcha9@gmail.com>
Co-authored-by: Kevin-Yang <ykcha9@gmail.com>
2024-10-26 17:53:35 +00:00
Rafael Vasquez
228cfbd03f
[Doc] Improve quickstart documentation ( #9256 )
...
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
2024-10-25 14:32:10 -07:00
Cyrus Leung
b979143d5b
[Doc] Move additional tips/notes to the top ( #9647 )
2024-10-24 09:43:59 +00:00
Yongzao
8a02cd045a
[torch.compile] Adding torch compile annotations to some models ( #9639 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2024-10-24 00:54:57 -07:00
Cyrus Leung
836e8ef6ee
[Bugfix] Fix PP for ChatGLM and Molmo ( #9422 )
2024-10-24 06:12:05 +00:00