| Brayden Zhong | 2aed2c9fa7 | [Doc] Fix ROCm documentation (#14041) Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca> | 2025-02-28 16:42:07 +00:00 |
| Harry Mellor | f58f8b5c96 | Update AutoAWQ docs (#14042) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> | 2025-02-28 15:20:29 +00:00 |
| Cyrus Leung | 1088f06242 | [Doc] Move multimodal Embedding API example to Online Serving page (#14017) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2025-02-28 07:12:04 +00:00 |
| Cyrus Leung | f1579b229d | [VLM] Generalized prompt updates for multi-modal processor (#13964) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2025-02-27 17:44:25 +00:00 |
| 王博伟 | 512d77d582 | Update quickstart.md (#13958) | 2025-02-27 16:05:11 +00:00 |
| Szymon Ożóg | 7f0be2aa24 | [Model] Deepseek GGUF support (#13167) | 2025-02-27 02:08:35 -08:00 |
| Isotr0py | edf309ebbe | [VLM] Support multimodal inputs for Florence-2 models (#13320) | 2025-02-27 02:06:41 -08:00 |
| Michael Goin | ca377cf1b9 | Use CUDA 12.4 as default for release and nightly wheels (#12098) | 2025-02-26 19:06:37 -08:00 |
| Jee Jee Li | 5157338ed9 | [Misc] Improve LoRA spelling (#13831) | 2025-02-25 23:43:01 -08:00 |
| Michael Goin | 07c4353057 | [Model] Support Grok1 (#13795) Signed-off-by: mgoin <mgoin64@gmail.com> | 2025-02-26 01:07:12 +00:00 |
| Harry Mellor | cdc1fa12eb | Remove unused kwargs from model definitions (#13555) | 2025-02-24 17:13:52 -08:00 |
| Nicolò Lucchesi | 444b0f0f62 | [Misc][Docs] Raise error when flashinfer is not installed and VLLM_ATTENTION_BACKEND is set (#12513) Signed-off-by: NickLucche <nlucches@redhat.com> | 2025-02-24 10:43:21 -05:00 |
| Cyrus Leung | 8354f6640c | [Doc] Dockerfile instructions for optional dependencies and dev transformers (#13699) | 2025-02-22 06:04:31 -08:00 |
| Mark McLoughlin | 2cb8c1540e | [Metrics] Add --show-hidden-metrics-for-version CLI arg (#13295) | 2025-02-22 00:20:45 -08:00 |
| Yuan Tang | 8c0dd3d4df | docs: Add a note on full CI run in contributing guide (#13646) | 2025-02-21 21:53:59 -08:00 |
| Gabriel Marinho | 1c3c975766 | [FEATURE] Enables /score endpoint for embedding models (#12846) | 2025-02-20 22:09:47 -08:00 |
| Kante Yin | 44c33f01f3 | Add llmaz as another integration (#13643) Signed-off-by: kerthcet <kerthcet@gmail.com> | 2025-02-21 03:52:40 +00:00 |
| Joe Runde | bfbc0b32c6 | [Frontend] Add backend-specific options for guided decoding (#13505) Signed-off-by: Joe Runde <Joseph.Runde@ibm.com> | 2025-02-20 15:07:58 -05:00 |
| Harry Mellor | 992e5c3d34 | Merge similar examples in offline_inference into single basic example (#12737) | 2025-02-20 04:53:51 -08:00 |
| Jee Jee Li | 512368e34a | [Misc] Qwen2.5 VL support LoRA (#13261) | 2025-02-19 18:37:55 -08:00 |
| Wilson Wu | 01c184b8f3 | Fix copyright year to auto get current year (#13561) | 2025-02-19 16:55:34 +00:00 |
| youkaichao | ad5a35c21b | [doc] clarify multi-node serving doc (#13558) Signed-off-by: youkaichao <youkaichao@gmail.com> | 2025-02-19 22:32:17 +08:00 |
| youkaichao | 52ce14d31f | [doc] clarify profiling is only for developers (#13554) Signed-off-by: youkaichao <youkaichao@gmail.com> | 2025-02-19 20:55:58 +08:00 |
| Roger Wang | fd84857f64 | [Doc] Add clarification note regarding paligemma (#13511) | 2025-02-18 22:24:03 -08:00 |
| Harry Mellor | 00b69c2d27 | [Misc] Remove dangling references to --use-v2-block-manager (#13492) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> | 2025-02-19 03:37:26 +00:00 |
| youkaichao | 7b203b7694 | [misc] fix debugging code (#13487) Signed-off-by: youkaichao <youkaichao@gmail.com> | 2025-02-18 09:37:11 -08:00 |
| Harry Mellor | 2358ca527b | [Doc]: Improve feature tables (#13224) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> | 2025-02-18 18:52:39 +08:00 |
| Isotr0py | 67ef8f666a | [Model] Enable quantization support for transformers backend (#12960) | 2025-02-17 19:52:47 -08:00 |
| Cyrus Leung | 7b623fca0b | [VLM] Check required fields before initializing field config in DictEmbeddingItems (#13380) | 2025-02-17 01:36:07 -08:00 |
| yankooo | f857311d13 | Fix spelling error in index.md (#13369) | 2025-02-17 06:53:20 +00:00 |
| shangmingc | 46cdd59577 | [Feature][Spec Decode] Simplify the use of Eagle Spec Decode (#12304) Signed-off-by: Shangming Cai <caishangming@linux.alibaba.com> | 2025-02-16 19:32:26 -08:00 |
| 凌 | da833b0aee | [Docs] Change myenv to vllm. Update python_env_setup.inc.md (#13325) | 2025-02-16 16:04:21 +00:00 |
| Roger Wang | b7d309860e | [V1] Update doc and examples for H2O-VL (#13349) Signed-off-by: Roger Wang <ywang@roblox.com> | 2025-02-16 10:35:54 +00:00 |
| Cyrus Leung | 367cb8ce8c | [Doc] [2/N] Add Fuyu E2E example for multimodal processor (#13331) | 2025-02-15 07:06:23 -08:00 |
| Nicolò Lucchesi | 579d7a63b2 | [Bugfix][Docs] Fix offline Whisper (#13274) | 2025-02-14 21:32:37 -08:00 |
| Nicolò Lucchesi | d84cef76eb | [Frontend] Add /v1/audio/transcriptions OpenAI API endpoint (#12909) | 2025-02-13 07:23:45 -08:00 |
| Cyrus Leung | 1bc3b5e71b | [VLM] Separate text-only and vision variants of the same model architecture (#13157) | 2025-02-13 06:19:15 -08:00 |
| Cyrus Leung | c9d3ecf016 | [VLM] Merged multi-modal processor for Molmo (#12966) | 2025-02-13 04:34:00 -08:00 |
| Russell Bryant | d46d490c27 | [Frontend] Move CLI code into vllm.cmd package (#12971) | 2025-02-12 23:12:21 -08:00 |
| Cody Yu | 60c68df6d1 | [Build] Automatically use the wheel of the base commit with Python-only build (#13178) | 2025-02-12 23:10:28 -08:00 |
| Harry Mellor | deb6c1c6b4 | [Doc] Improve OpenVINO installation doc (#13102) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> | 2025-02-11 18:02:46 +00:00 |
| Farzad Abdolhosseini | 08b2d845d6 | [Model] Ultravox Model: Support v0.5 Release (#12912) Signed-off-by: Farzad Abdolhosseini <farzad@fixie.ai> | 2025-02-10 22:02:48 +00:00 |
| Cyrus Leung | 51f0b5f7f6 | [Bugfix] Clean up and fix multi-modal processors (#13012) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> | 2025-02-10 10:45:21 +00:00 |
| Yuan Tang | 243137143c | [Doc] Add link to tool_choice tracking issue in tool_calling.md (#13003) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> | 2025-02-10 06:09:33 +00:00 |
| Jee Jee Li | 86222a3dab | [VLM] Merged multi-modal processor for GLM4V (#12449) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> | 2025-02-08 20:32:16 +00:00 |
| Cyrus Leung | 8a69e0e20e | [CI/Build] Auto-fix Markdown files (#12941) | 2025-02-08 04:25:15 -08:00 |
| Jun Duan | 256a2d29dc | [Doc] Correct HF repository for TeleChat2 models (#12949) | 2025-02-08 01:42:15 -08:00 |
| TJian | eaa92d4437 | [ROCm] [Feature] [Doc] [Dockerfile] [BugFix] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing (#12501) | 2025-02-07 08:13:43 -08:00 |
| Jitse Klomp | afe74f7a96 | [Doc] double quote cmake package in build.inc.md (#12840) | 2025-02-06 09:17:55 -08:00 |
| Sumit Vij | d88506dda4 | [Model] LoRA Support for Ultravox model (#11253) | 2025-02-05 19:54:13 -08:00 |