Mark McLoughlin
98d7367b61
[Metrics] Hide deprecated metrics ( #15458 )
...
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
2025-04-02 07:37:19 -07:00
Ilya Markov
b7b7676d67
[Distributed] Add custom allreduce support for ROCM ( #14125 )
...
Signed-off-by: ilmarkov <imarkov@redhat.com>
Co-authored-by: ilmarkov <imarkov@redhat.com>
2025-03-31 22:49:12 -07:00
Michael Goin
47e9038d23
Fix cpu offload testing for gptq/awq/ct ( #15648 )
...
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-03-29 00:29:32 +08:00
Russell Bryant
b877031d80
Remove openvino support in favor of external plugin ( #15339 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-03-22 14:06:39 -07:00
vllmellm
2bb0e1a799
[Bugfix][ROCm] running new process using spawn method for rocm in tests. ( #14810 )
...
Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: TJian <tunjian.tan@embeddedllm.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-03-17 11:33:35 +00:00
Sibi
a73e183e36
[Misc] Replace os environ to monkeypatch in test suite ( #14516 )
...
Signed-off-by: sibi <85477603+t-sibiraj@users.noreply.github.com>
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Aaron Pham <contact@aarnphm.xyz>
2025-03-16 20:35:57 -07:00
Cyrus Leung
33f227e16b
[CI/Build] Use a fixed seed to avoid flaky tests ( #14480 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-08 11:30:09 +00:00
Ilya Lavrenov
8ca7a71df7
OpenVINO: added CPU-like conditions ( #14338 )
...
Signed-off-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
2025-03-06 22:24:49 -08:00
Harry Mellor
cf069aa8aa
Update deprecated Python 3.8 typing ( #13971 )
2025-03-02 17:34:51 -08:00
Jongseok Park
781096e385
Expert Parallelism (EP) Support for DeepSeek V2 ( #12583 )
2025-02-24 07:33:20 -08:00
youkaichao
cf797aa856
[core] port pynvml into vllm codebase ( #12963 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-02-09 15:00:00 +08:00
Russell Bryant
e489ad7a21
[Misc] Add SPDX-License-Identifier headers to python source files ( #12628 )
...
- **Add SPDX license headers to python source files**
- **Check for SPDX headers using pre-commit**
commit 9d7ef44c3cfb72ca4c32e1c677d99259d10d4745
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:18:24 2025 -0500
Add SPDX license headers to python source files
This commit adds SPDX license headers to python source files as
recommended to
the project by the Linux Foundation. These headers provide a concise way
that is
both human and machine readable for communicating license information
for each
source file. It helps avoid any ambiguity about the license of the code
and can
also be easily used by tools to help manage license compliance.
The Linux Foundation runs license scans against the codebase to help
ensure
we are in compliance with the licenses of the code we use, including
dependencies. Having these headers in place helps that tool do its job.
More information can be found on the SPDX site:
- https://spdx.dev/learn/handling-license-info/
Signed-off-by: Russell Bryant <rbryant@redhat.com>
commit 5a1cf1cb3b80759131c73f6a9dddebccac039dea
Author: Russell Bryant <rbryant@redhat.com>
Date: Fri Jan 31 14:36:32 2025 -0500
Check for SPDX headers using pre-commit
Signed-off-by: Russell Bryant <rbryant@redhat.com>
---------
Signed-off-by: Russell Bryant <rbryant@redhat.com>
2025-02-02 11:58:18 -08:00
youkaichao
bd82872211
[ci]try to fix flaky multi-step tests ( #11894 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-09 14:47:29 +00:00
Joe Runde
2d1b9baa8f
[Bugfix] Fix request cancellation without polling ( #11190 )
2024-12-17 12:26:32 -08:00
Cyrus Leung
d1e21a979b
[CI/Build] Split up VLM tests ( #11083 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-12 06:18:16 +08:00
Flávia Béo
aa9078fa03
Adds method to read the pooling types from model's files ( #9506 )
...
Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
2024-11-07 08:42:40 +00:00
Isotr0py
a5fda50a10
[CI/Build] Fix large_gpu_mark reason ( #10070 )
...
Signed-off-by: Isotr0py <2037008807@qq.com>
2024-11-06 08:50:37 +00:00
youkaichao
566cd27797
[torch.compile] rework test plans ( #9866 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-10-31 22:20:17 -07:00
Yongzao
00d91c8a2c
[CI/Build] Simplify exception trace in api server tests ( #9787 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2024-10-30 14:52:05 -07:00
Alex Brooks
cc98f1e079
[CI/Build] VLM Test Consolidation ( #9372 )
...
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
2024-10-30 09:32:17 -07:00
wangshuai09
4e2d95e372
[Hardware][ROCM] using current_platform.is_rocm ( #9642 )
...
Signed-off-by: wangshuai09 <391746016@qq.com>
2024-10-28 04:07:00 +00:00
Cyrus Leung
f085995a7b
[CI/Build] Remove unnecessary fork_new_process
( #9484 )
2024-10-21 19:47:29 -07:00
Cyrus Leung
051eaf6db3
[Model] Add user-configurable task for models that support both generation and embedding ( #9424 )
2024-10-18 11:31:58 -07:00
Kuntai Du
81ede99ca4
[Core] Deprecating block manager v1 and make block manager v2 default ( #8704 )
...
Removing the block manager v1. This is the initial piece of prefix-caching-centric design. In order to achieve prefix-caching-centric design, we need to simplify the code path so that we only use v2 block manager (which has much higher performance on prefix caching).
2024-10-17 11:38:15 -05:00
sroy745
f3a507f1d3
[Core] Add an environment variable which needs to be set explicitly to allow BlockSpaceManagerV1 ( #9149 )
2024-10-10 14:17:17 +08:00
youkaichao
c8627cd41b
[ci][test] use load dummy for testing ( #9165 )
2024-10-09 00:38:40 -07:00
youkaichao
04c12f8157
[misc] update utils to support comparing multiple settings ( #9140 )
2024-10-08 02:51:49 +00:00
Cyrus Leung
b22b798471
[Model] PP support for embedding models and update docs ( #9090 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
2024-10-06 16:35:27 +08:00
Murali Andoorveedu
0f6d7a9a34
[Models] Add remaining model PP support ( #7168 )
...
Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>
Signed-off-by: Murali Andoorveedu <muralidhar.andoorveedu@centml.ai>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-10-04 10:56:58 +08:00
Cyrus Leung
26a68d5d7e
[CI/Build] Add test decorator for minimum GPU memory ( #8925 )
2024-09-29 02:50:51 +00:00
Varun Sundar Rabindranath
19d02ff938
[Bugfix] Fix PP for Multi-Step ( #8887 )
2024-09-28 08:52:46 -07:00
afeldman-nm
a8c1d161a7
[Core] *Prompt* logprobs support in Multi-step ( #8199 )
2024-09-18 08:38:43 -07:00
Alexander Matveev
7c7714d856
[Core][Bugfix][Perf] Introduce MQLLMEngine
to avoid asyncio
OH ( #8157 )
...
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-09-18 13:56:58 +00:00
Cyrus Leung
a84e598e21
[CI/Build] Reorganize models tests ( #7820 )
2024-09-13 10:20:06 -07:00
shangmingc
40c396533d
[Bugfix] Mapping physical device indices for e2e test utils ( #8290 )
2024-09-13 11:06:28 +08:00
Cyrus Leung
9f68e00d27
[Bugfix] Fix broken OpenAI tensorizer test ( #8258 )
2024-09-07 08:02:39 +00:00
manikandan.tm@zucisystems.com
8685ba1a1e
Inclusion of InternVLChatModel In PP_SUPPORTED_MODELS(Pipeline Parallelism) ( #7860 )
2024-09-05 11:33:37 +00:00
afeldman-nm
428dd1445e
[Core] Logprobs support in Multi-step ( #7652 )
2024-08-29 19:19:08 -07:00
Nick Hill
39178c7fbc
[Tests] Disable retries and use context manager for openai client ( #7565 )
2024-08-26 21:33:17 -07:00
Cyrus Leung
029c71de11
[CI/Build] Avoid downloading all HF files in RemoteOpenAIServer
( #7836 )
2024-08-26 05:31:10 +00:00
youkaichao
aab0fcdb63
[ci][test] fix RemoteOpenAIServer ( #7838 )
2024-08-24 17:31:28 +00:00
youkaichao
ea9fa160e3
[ci][test] exclude model download time in server start time ( #7834 )
2024-08-24 01:03:27 -07:00
youkaichao
9e51b6a626
[ci][test] adjust max wait time for cpu offloading test ( #7709 )
2024-08-20 17:12:44 -07:00
youkaichao
832163b875
[ci][test] allow longer wait time for api server ( #7629 )
2024-08-17 11:26:38 -07:00
SangBin Cho
4706eb628e
[aDAG] Unflake aDAG + PP tests ( #7600 )
2024-08-16 20:49:30 -07:00
youkaichao
ea49e6a3c8
[misc][ci] fix cpu test with plugins ( #7489 )
2024-08-13 19:27:46 -07:00
Cyrus Leung
9ba85bc152
[mypy] Misc. typing improvements ( #7417 )
2024-08-13 09:20:20 +08:00
Cyrus Leung
66d617e343
[Frontend] Gracefully handle missing chat template and fix CI failure ( #7238 )
...
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-08-07 09:12:05 +00:00
Michael Goin
f9a5600649
[Bugfix] Fix GPTQ and GPTQ Marlin CPU Offloading ( #7225 )
2024-08-06 18:34:26 -07:00
youkaichao
44dcb52e39
[ci][test] finalize fork_new_process_for_each_test ( #7114 )
2024-08-03 10:44:53 -07:00