Harry Mellor
dd6a3a02cb
[Doc] Convert docs to use colon fences ( #12471 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-29 11:38:29 +08:00
Ce Gao
a7e3eba66f
[Frontend] Support reasoning content for deepseek r1 ( #12473 )
...
Signed-off-by: Ce Gao <cegao@tensorchord.ai>
Co-authored-by: Rafael Vasquez <rafvasq21@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Michael Goin <mgoin@redhat.com>
2025-01-29 11:38:08 +08:00
Jun Duan
925d2f1908
[Doc] Fix typo for x86 CPU installation ( #12514 )
...
Signed-off-by: Jun Duan <jun.duan.phd@outlook.com>
2025-01-28 16:37:10 +00:00
Cyrus Leung
8f58a51358
[VLM] Merged multi-modal processor and V1 support for Qwen-VL ( #12504 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-28 16:25:05 +00:00
Yuan Tang
582cf78798
[DOC] Add link to vLLM blog ( #12460 )
...
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-01-27 03:46:19 +00:00
Kyle Mistele
0034b09ceb
[Frontend] Rerank API (Jina- and Cohere-compatible API) ( #12376 )
...
Signed-off-by: Kyle Mistele <kyle@mistele.com>
2025-01-26 19:58:45 -07:00
Mohit Deopujari
9a0f3bdbe5
[Hardware][Gaudi][Doc] Add missing step in setup instructions ( #12382 )
2025-01-24 09:43:49 +00:00
Russell Bryant
c5cffcd0cd
[Docs] Update spec decode + structured output in compat matrix ( #12373 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-01-24 01:15:52 +00:00
Woosuk Kwon
682b55bc07
[Docs] Add meetup slides ( #12345 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-01-23 14:10:03 -08:00
Isotr0py
2cbeedad09
[Docs] Document Phi-4 support ( #12362 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-23 19:18:51 +00:00
Gregory Shtrasberg
e97f802b2d
[FP8][Kernel] Dynamic kv cache scaling factors computation ( #11906 )
...
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Co-authored-by: Micah Williamson <micah.williamson@amd.com>
2025-01-23 18:04:03 +00:00
Cyrus Leung
d07efb31c5
[Doc] Troubleshooting errors during model inspection ( #12351 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-23 22:46:58 +08:00
youkaichao
511627445e
[doc] explain common errors around torch.compile ( #12340 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-23 14:56:02 +08:00
Russell Bryant
7551a34032
[Docs] Document vulnerability disclosure process ( #12326 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-01-23 03:44:09 +00:00
Michael Goin
01a55941f5
[Docs] Update FP8 KV Cache documentation ( #12238 )
...
Signed-off-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-01-23 11:18:09 +08:00
Hongxia Yang
09ccc9c8f7
[Documentation][AMD] Add information about prebuilt ROCm vLLM docker for perf validation purpose ( #12281 )
...
Signed-off-by: Hongxia Yang <hongxyan@amd.com>
2025-01-22 07:49:22 +08:00
Cyrus Leung
96912550c8
[Misc] Rename MultiModalInputsV2 -> MultiModalInputs ( #12244 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-21 07:31:19 +00:00
Gregory Shtrasberg
d4b62d4641
[AMD][Build] Porting dockerfiles from the ROCm/vllm fork ( #11777 )
...
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-01-21 12:22:23 +08:00
Isotr0py
83609791d2
[Model] Add Qwen2 PRM model support ( #12202 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-20 14:59:46 +08:00
Harry Mellor
3ea7b94523
Move linting to pre-commit ( #11975 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-20 14:58:01 +08:00
Roger Wang
81763c58a0
[V1] Add V1 support of Qwen2-VL ( #12128 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: imkero <kerorek@outlook.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-19 19:52:13 +08:00
Isotr0py
02798ecabe
[Model] Port deepseek-vl2 processor, remove dependency ( #12169 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-18 13:59:39 +08:00
Hongxia Yang
c09503ddd6
[AMD][CI/Build][Bugfix] use pytorch stale wheel ( #12172 )
...
Signed-off-by: hongxyan <hongxyan@amd.com>
2025-01-18 11:15:53 +08:00
Yuan Tang
1475847a14
[Doc] Add instructions on using Podman when SELinux is active ( #12136 )
...
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2025-01-17 04:45:36 +00:00
Isotr0py
62b06ba23d
[Model] Add support for deepseek-vl2-tiny model ( #12068 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2025-01-16 17:14:48 +00:00
Cyrus Leung
f8ef146f03
[Doc] Add documentation for specifying model architecture ( #12105 )
2025-01-16 15:53:43 +08:00
RunningLeon
97eb97b5a4
[Model]: Support internlm3 ( #12037 )
2025-01-15 11:35:17 +00:00
Kyle Sayers
3f9b7ab9f5
[Doc] Update examples to remove SparseAutoModelForCausalLM ( #12062 )
...
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
2025-01-15 06:36:01 +00:00
Harry Mellor
c9d6ff530b
Explain where the engine args go when using Docker ( #12041 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-14 16:05:50 +00:00
TJian
8a1f938e6f
[Doc] Update Quantization Hardware Support Documentation ( #12025 )
...
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
2025-01-14 04:37:52 +00:00
Woosuk Kwon
1a401252b5
[Docs] Add Sky Computing Lab to project intro ( #12019 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2025-01-13 17:24:36 -08:00
Harry Mellor
e8c23ff989
[Doc] Organise installation documentation into categories and tabs ( #11935 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-13 12:27:36 +00:00
Roger Wang
cd8249903f
[Doc][V1] Update model implementation guide for V1 support ( #11998 )
...
Signed-off-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-01-13 11:58:54 +00:00
Akshat Tripathi
8bddb73512
[Hardware][CPU] Multi-LoRA implementation for the CPU backend ( #11100 )
...
Signed-off-by: Akshat Tripathi <akshat@krai.ai>
Signed-off-by: Oleg Mosalov <oleg@krai.ai>
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Oleg Mosalov <oleg@krai.ai>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
2025-01-12 13:01:52 +00:00
Isotr0py
f967e51f38
[Model] Initialize support for Deepseek-VL2 models ( #11578 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-01-12 00:17:24 -08:00
Rafael Vasquez
43f3d9e699
[CI/Build] Add markdown linter ( #11857 )
...
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
2025-01-12 00:17:13 -08:00
Cyrus Leung
a991f7d508
[Doc] Basic guide for writing unit tests for new models ( #11951 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-11 21:27:24 +08:00
Li, Jiang
aa1e77a19c
[Hardware][CPU] Support MOE models on x86 CPU ( #11831 )
...
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-01-10 11:07:58 -05:00
Harry Mellor
482cdc494e
[Doc] Rename offline inference examples ( #11927 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-10 23:50:29 +08:00
Cyrus Leung
12664ddda5
[Doc] [1/N] Initial guide for merged multi-modal processor ( #11925 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-10 14:30:25 +00:00
Harry Mellor
d85c47d6ad
Replace "online inference" with "online serving" ( #11923 )
...
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-10 12:05:56 +00:00
Cyrus Leung
3de2b1eafb
[Doc] Show default pooling method in a table ( #11904 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-10 11:25:20 +08:00
Cyrus Leung
c3cf54dda4
[Doc][5/N] Move Community and API Reference to the bottom ( #11896 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2025-01-10 03:10:12 +00:00
Charles Frye
36f5303578
[Docs] Add Modal to deployment frameworks ( #11907 )
2025-01-09 23:26:37 +00:00
Cyrus Leung
9a228348d2
[Misc] Provide correct Pixtral-HF chat template ( #11891 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-09 10:19:37 -07:00
Cyrus Leung
65097ca0af
[Doc] Add model development API Reference ( #11884 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-09 09:43:40 +00:00
Guspan Tanadi
a732900efc
[Doc] Intended links Python multiprocessing library ( #11878 )
2025-01-09 05:39:39 +00:00
Michael Goin
730e9592e9
[Doc] Recommend uv and python 3.12 for quickstart guide ( #11849 )
...
Signed-off-by: mgoin <michael@neuralmagic.com>
2025-01-09 11:37:48 +08:00
Cyrus Leung
5984499e47
[Doc] Expand Multimodal API Reference ( #11852 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-08 17:14:14 +00:00
Cyrus Leung
6cd40a5bfe
[Doc][4/N] Reorganize API Reference ( #11843 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-08 21:34:44 +08:00