814 Commits

Author SHA1 Message Date
Wei Zeng
30d6a015e0
[Feature] specify model in config.yaml (#15798)
Signed-off-by: weizeng <weizeng@roblox.com>
2025-04-01 01:20:06 -07:00
Harry Mellor
a76f547e11
Rename fallback model and refactor supported models section (#15829)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-31 22:49:41 -07:00
Harry Mellor
e6e3c55ef2
Move dockerfiles into their own directory (#14549)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-31 13:47:32 -07:00
shangmingc
239b7befdd
[V1][Spec Decode] Remove deprecated spec decode config params (#15466)
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
2025-03-31 09:19:35 -07:00
Harry Mellor
e5ef4fa99a
Upgrade transformers to v4.50.3 (#13905)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-31 08:59:37 -07:00
Naveassaf
3aa2b6a637
[Model] Update support for NemotronNAS models (#15008)
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
2025-03-31 20:35:14 +08:00
Harry Mellor
b932c048ac
Recommend developing with Python 3.12 in developer guide (#15811)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-03-31 11:54:49 +00:00
Reid
44c3a5abc3
[doc] update conda to usage link in installation (#15761)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-03-30 08:12:13 +00:00
Ce Gao
762b424a52
[Docs] Document v0 engine support in reasoning outputs (#15739)
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
2025-03-29 03:46:57 +00:00
pengyuange
de1cb38769
[Model] Support Skywork-R1V (#15397)
Signed-off-by: jiacai.liu <932997367@qq.com>
Co-authored-by: jiacai.liu <932997367@qq.com>
2025-03-28 20:39:21 -07:00
Gregory Shtrasberg
c802f5430d
[ROCm][AMD][Build] Update AMD supported arch list (#15632)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-03-28 20:39:18 -07:00
simpx
cff8991a50
[Docs][V1] Optimize diagrams in prefix caching design (#15716) 2025-03-29 03:33:58 +00:00
Reid
2914006fe0
[doc] add missing imports (#15699)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-03-28 15:56:48 +00:00
Harry Mellor
0b4167526d
[Docs] Add "Generation quality changed" section to troubleshooting (#15701)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-28 13:03:21 +00:00
Li, Jiang
280d074103
[CPU][CI] Improve CPU Dockerfile (#15690)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-03-28 01:36:31 -07:00
wwl2755
b4245a48df
[Doc] Fix dead links in Job Board (#15637)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
2025-03-28 02:43:40 +00:00
Kebe
4e0f6076be
[Bugfix] Fix failure to launch in Tensor Parallel TP mode on macOS. (#14948)
Signed-off-by: Kebe <mail@kebe7jun.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2025-03-28 10:13:41 +08:00
cnorman
32d669275b
Correct PowerPC to modern IBM Power (#15635)
Signed-off-by: Christy Norman <christy@linux.vnet.ibm.com>
2025-03-27 15:04:32 -07:00
Harry Mellor
46450b8d33
Use absolute placement for Ask AI button (#15628)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-27 18:52:18 +00:00
Cyrus Leung
07bf813fb5
[Doc] Link to onboarding tasks (#15629)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-27 16:30:53 +00:00
Cyrus Leung
ac5bc615b0
[Model] MiniCPM-V/O supports V1 (#15487)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-27 06:07:29 -07:00
Reid
8063dfc61a
[Doc] update --system for transformers installation in docker doc (#15616)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
2025-03-27 20:38:46 +08:00
Wei Zeng
7f301dd8ef
[Doc] Update V1 user guide for fp8 kv cache support (#15585)
Signed-off-by: weizeng <weizeng@roblox.com>
2025-03-26 19:39:03 -07:00
marko
27df5199d9
Support SHA256 as hash function in prefix caching (#15297)
Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>
2025-03-26 11:11:28 -07:00
Alex Brooks
1711b929b6
[Model] Add Reasoning Parser for Granite Models (#14202)
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com>
Co-authored-by: Joe Runde <joe@joerun.de>
2025-03-26 14:28:07 +00:00
Harry Mellor
cf5c8f1686
Separate base model from TransformersModel (#15467)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-03-26 18:13:38 +08:00
Cyrus Leung
997c8811d6
[Model] Support multi-image for Molmo (#15438)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-26 11:26:33 +08:00
Cyrus Leung
3f04a7fbf2
[Doc] Update V1 user guide for multi-modality (#15460)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-25 11:01:58 +00:00
Harry Mellor
97cfa65df7
Add pipeline parallel support to TransformersModel (#12832)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-03-25 10:41:45 +08:00
Cyrus Leung
6dd55af6c9
[Doc] Update docs on handling OOM (#15357)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-03-24 14:29:34 -07:00
Yuan Tang
3eb08ed9b1
[DOC] Add Kubernetes deployment guide with CPUs (#14865) 2025-03-24 10:48:43 -07:00
Manish Sethi
761702fd19
[Core] Integrate fastsafetensors loader for loading model weights (#10647)
Signed-off-by: Manish Sethi <Manish.sethi1@ibm.com>
2025-03-24 08:08:02 -07:00
Jee Jee Li
3892e58ad7
[Misc] Upgrade BNB version (#15183) 2025-03-24 05:51:42 +00:00
Roger Wang
9c5c81b0da
[Misc][Doc] Add note regarding loading generation_config by default (#15281)
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-03-23 14:00:55 -07:00
Robin
d6cd59f122
[Frontend] Support tool calling and reasoning parser (#14511)
Signed-off-by: WangErXiao <863579016@qq.com>
2025-03-23 14:00:07 -07:00
shangmingc
50c9636d87
[V1][Usage] Refactor speculative decoding configuration and tests (#14434)
Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com>
2025-03-22 19:28:10 -10:00
Russell Bryant
b877031d80
Remove openvino support in favor of external plugin (#15339)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-22 14:06:39 -07:00
Naitong Yu
2f4bd358f1
[Model] Support Tele-FLM Model (#15023)
Signed-off-by: Naitong Yu <ntyu@baai.ac.cn>
Signed-off-by: jiangxin <horizon94@outlook.com>
Co-authored-by: Jason Fang <jasonfang3900@gmail.com>
Co-authored-by: jiangxin <horizon94@outlook.com>
2025-03-22 02:04:44 -07:00
Cyrus Leung
baec0d4de9
Revert "[Feature] specify model in config.yaml (#14855)" (#15293)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-21 08:30:23 -07:00
Russell Bryant
61e8c18350
[Misc] Add cProfile helpers (#15074)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-21 04:56:09 -07:00
Wei Zeng
0fa3970deb
[Feature] specify model in config.yaml (#14855)
Signed-off-by: weizeng <weizeng@roblox.com>
2025-03-21 00:26:03 -07:00
Edwin Hernandez
7297941b38
[Doc] Update LWS docs (#15163)
Signed-off-by: Edwinhr716 <Edandres249@gmail.com>
2025-03-20 21:18:47 -07:00
Harry Mellor
6edbfa924d
Mention extra_body as a way top pass vLLM only parameters using the OpenAI client (#15240)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-20 19:18:36 -07:00
Jee Jee Li
10f55fe6c5
[Misc] Clean up the BitsAndBytes arguments (#15140)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
2025-03-20 19:17:12 -07:00
Rui Qiao
4cb1c05c9e
[Doc] Clarify run vllm only on one node in distributed inference (#15148)
Signed-off-by: Rui Qiao <ruisearch42@gmail.com>
2025-03-20 09:55:59 +08:00
Marc-Alexandre Côté
073d1ed354
[Doc] Update tip info on using latest transformers when creating a custom Dockerfile (#15070) 2025-03-19 13:33:40 +00:00
Cyrus Leung
61f412187d
[Bugfix] Re-enable Gemma3 for V1 (#14980)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-18 23:58:22 -07:00
Jennifer Zhao
228b768db6
[Doc] Minor v1_user_guide update (#15064)
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
2025-03-18 16:10:45 -07:00
yury-tokpanov
452e8fd968
[MODEL] Add support for Zamba2 models (#13185)
Signed-off-by: Yury Tokpanov <yury@zyphra.com>
Signed-off-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Tyler Michael Smith <tysmith@redhat.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-03-18 08:56:21 -07:00
Patrick von Platen
f863ffc965
[Mistral-Small 3.1] Update docs and tests (#14977)
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Roger Wang <ywang@roblox.com>
2025-03-18 03:29:42 -07:00