Wallas Henrique
cfd3219f58
[Hardware][Apple] Native support for macOS Apple Silicon ( #11696 )
...
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2025-01-08 16:35:49 +08:00
youkaichao
ad9f1aa679
[doc] update wheels url ( #11830 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-08 14:36:49 +08:00
hj-wei
59d6bb4c86
[Hardware][AMD]: Replace HIPCC version with more precise ROCm version ( #11515 )
...
Signed-off-by: hjwei <hjwei_xd@163.com>
2024-12-28 11:17:35 +00:00
Yuan Tang
409475a827
[Bugfix] Fix issues in CPU build Dockerfile. Fixes #9182 ( #11435 )
...
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2024-12-24 16:53:28 +00:00
omer-dayan
995f56236b
[Core] Loading model from S3 using RunAI Model Streamer as optional loader ( #10192 )
...
Signed-off-by: OmerD <omer@run.ai>
2024-12-20 16:46:24 +00:00
Daniele
7c7aa37c69
[CI/Build] fix pre-compiled wheel install for exact tag ( #11373 )
...
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
2024-12-21 00:14:40 +08:00
youkaichao
9743d64e4e
[ci][build] add tests for python only compilation ( #10915 )
...
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-05 08:54:47 -08:00
Daniele
e4c34c23de
[CI/Build] improve python-only dev setup ( #9621 )
...
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
2024-12-04 21:48:13 +00:00
litianjian
28b2877d30
Online video support for VLMs ( #10020 )
...
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: litianjian <litianjian@bytedance.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-07 20:25:59 +00:00
Russell Bryant
098f94de42
[CI/Build] Drop Python 3.8 support ( #10038 )
...
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-06 14:31:01 +00:00
Woosuk Kwon
6a585a23d2
[Hotfix] Fix ruff errors ( #10073 )
...
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-11-06 01:24:28 -08:00
Konrad Zawora
a02a50e6e5
[Hardware][Intel-Gaudi] Add Intel Gaudi (HPU) inference backend ( #6143 )
...
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Bob Zhu <bob.zhu@intel.com>
Signed-off-by: zehao-intel <zehao.huang@intel.com>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>
Co-authored-by: Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by: Marceli Fylcek <mfylcek@habana.ai>
Co-authored-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Co-authored-by: Vivek Goel <vgoel@habana.ai>
Co-authored-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Dominika Olszewska <dolszewska@habana.ai>
Co-authored-by: barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com>
Co-authored-by: Michal Szutenberg <37601244+szutenberg@users.noreply.github.com>
Co-authored-by: Jan Kaniecki <jkaniecki@habana.ai>
Co-authored-by: Agata Dobrzyniewicz <160237065+adobrzyniewicz-habana@users.noreply.github.com>
Co-authored-by: Krzysztof Wisniewski <kwisniewski@habana.ai>
Co-authored-by: Dudi Lester <160421192+dudilester@users.noreply.github.com>
Co-authored-by: Ilia Taraban <tarabanil@gmail.com>
Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Michał Kuligowski <mkuligowski@habana.ai>
Co-authored-by: Jakub Maksymczuk <jmaksymczuk@habana.ai>
Co-authored-by: Tomasz Zielinski <85164140+tzielinski-habana@users.noreply.github.com>
Co-authored-by: Sun Choi <schoi@habana.ai>
Co-authored-by: Iryna Boiko <iboiko@habana.ai>
Co-authored-by: Bob Zhu <41610754+czhu15@users.noreply.github.com>
Co-authored-by: hlin99 <73271530+hlin99@users.noreply.github.com>
Co-authored-by: Zehao Huang <zehao.huang@intel.com>
Co-authored-by: Andrzej Kotłowski <Andrzej.Kotlowski@intel.com>
Co-authored-by: Yan Tomsinsky <73292515+Yantom1@users.noreply.github.com>
Co-authored-by: Nir David <ndavid@habana.ai>
Co-authored-by: Yu-Zhou <yu.zhou@intel.com>
Co-authored-by: Ruheena Suhani Shaik <rsshaik@habana.ai>
Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>
Co-authored-by: Marcin Swiniarski <mswiniarski@habana.ai>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
Co-authored-by: Jacek Czaja <jczaja@habana.ai>
Co-authored-by: Yuan <yuan.zhou@outlook.com>
2024-11-06 01:09:10 -08:00
Aaron Pham
21063c11c7
[CI/Build] drop support for Python 3.8 EOL ( #8464 )
...
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
2024-11-06 07:11:55 +00:00
Luka Govedič
51c24c9736
[Build] Fix FetchContent
multiple build issue ( #9596 )
...
Signed-off-by: luka <luka@neuralmagic.com>
2024-10-23 12:43:07 +08:00
bnellnm
eca2c5f7c0
[Bugfix] Fix support for dimension like integers and ScalarType ( #9299 )
2024-10-17 19:08:34 +00:00
Joe Rowell
7627172bf4
[Bugfix][Doc] Report neuron error in output ( #9159 )
2024-10-08 22:43:34 -07:00
Yuan Tang
ffc4b27ea8
Add classifiers in setup.py ( #9171 )
2024-10-08 19:30:48 -07:00
Roger Wang
b6d7392579
[Misc][CI/Build] Include cv2
via mistral_common[opencv]
( #8951 )
2024-09-30 04:28:26 +00:00
Tyler Titsworth
260024a374
[Bugfix][Intel] Fix XPU Dockerfile Build ( #7824 )
...
Signed-off-by: tylertitsworth <tyler.titsworth@intel.com>
Co-authored-by: youkaichao <youkaichao@126.com>
2024-09-27 23:45:50 -07:00
Daniele
2467b642dd
[CI/Build] fix setuptools-scm usage ( #8771 )
2024-09-24 12:38:12 -07:00
Daniele
ee5f34b1c2
[CI/Build] use setuptools-scm to set __version__ ( #4738 )
...
Co-authored-by: youkaichao <youkaichao@126.com>
2024-09-23 09:44:26 -07:00
youkaichao
0e40ac9b7b
[ci][build] fix vllm-flash-attn ( #8699 )
2024-09-21 23:24:58 -07:00
Luka Govedič
71c60491f2
[Kernel] Build flash-attn from source ( #8245 )
2024-09-20 23:27:10 -07:00
Simon Mo
5ce45eb54d
[misc] small qol fixes for release process ( #8517 )
2024-09-16 15:11:27 -07:00
Charlie Fu
1ef0d2efd0
[Kernel][Hardware][Amd]Custom paged attention kernel for rocm ( #8310 )
2024-09-13 17:01:11 -07:00
Yangshen⚡Deng
6a512a00df
[model] Support for Llava-Next-Video model ( #7559 )
...
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-09-10 22:21:36 -07:00
Daniele
6234385f4a
[CI/Build] enable ccache/scccache for HIP builds ( #8327 )
2024-09-10 08:55:08 -07:00
tomeras91
c02638efb3
[CI/Build] make pip install vllm work in macos (for import only) ( #8118 )
2024-09-03 12:37:08 -07:00
Roger Wang
5b86b19954
[Misc] Optional installation of audio related packages ( #8063 )
2024-09-01 14:46:57 -07:00
sasha0552
1b32e02648
[Bugfix] Pass PYTHONPATH from setup.py to CMake ( #7730 )
2024-08-21 11:17:48 -07:00
Kunshang Ji
1a36287b89
[Bugfix] Fix xpu build ( #7644 )
2024-08-18 22:00:09 -07:00
tomeras91
386087970a
[CI/Build] build on empty device for better dev experience ( #4773 )
2024-08-11 13:09:44 -07:00
Ilya Lavrenov
80cbe10c59
[OpenVINO] migrate to latest dependencies versions ( #7251 )
2024-08-07 09:49:10 -07:00
Lucas Wilkinson
a8d604ca2a
[Misc] Disambiguate quantized types via a new ScalarType ( #6396 )
2024-08-02 13:51:58 -07:00
Michael Goin
b482b9a5b1
[CI/Build] Add support for Python 3.12 ( #7035 )
2024-08-02 13:51:22 -07:00
Jee Jee Li
7ecee34321
[Kernel][RFC] Refactor the punica kernel based on Triton ( #5036 )
2024-07-31 17:12:24 -07:00
Ethan Xu
dbfe254eda
[Feature] vLLM CLI ( #5090 )
...
Co-authored-by: simon-mo <simon.mo@hey.com>
2024-07-14 15:36:43 -07:00
youkaichao
ccd3c04571
[ci][build] fix commit id ( #6420 )
...
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2024-07-14 22:16:21 +08:00
Michael Goin
111fc6e7ec
[Misc] Add generated git commit hash as vllm.__commit__
( #6386 )
2024-07-12 22:52:15 +00:00
Ilya Lavrenov
57f09a419c
[Hardware][Intel] OpenVINO vLLM backend ( #5379 )
2024-06-28 13:50:16 +00:00
Kunshang Ji
728c4c8a06
[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend ( #3814 )
...
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com>
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
2024-06-17 11:01:25 -07:00
Cyrus Leung
03dccc886e
[Misc] Add vLLM version getter to utils ( #5098 )
2024-06-13 11:21:39 -07:00
Kevin H. Luu
916d219d62
[ci] Use sccache to build images ( #5419 )
...
Signed-off-by: kevin <kevin@anyscale.com>
2024-06-12 17:58:12 -07:00
Woosuk Kwon
1a8bfd92d5
[Hardware] Initial TPU integration ( #5292 )
2024-06-12 11:53:03 -07:00
Woosuk Kwon
8bab4959be
[Misc] Remove VLLM_BUILD_WITH_NEURON env variable ( #5389 )
2024-06-11 00:37:56 -07:00
bnellnm
5467ac3196
[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops ( #5047 )
2024-06-09 16:23:30 -04:00
Divakar Verma
a66cf40b20
[Kernel][ROCm][AMD] enable fused topk_softmax kernel for moe layer ( #4927 )
...
This PR enables the fused topk_softmax kernel used in moe layer for HIP
2024-06-02 14:13:26 -07:00
Daniele
a360ff80bb
[CI/Build] CMakeLists: build all extensions' cmake targets at the same time ( #5034 )
2024-05-31 22:06:45 -06:00
youkaichao
5bd3c65072
[Core][Optimization] remove vllm-nccl ( #5091 )
2024-05-29 05:13:52 +00:00
Sanger Steel
8bc68e198c
[Frontend] [Core] perf: Automatically detect vLLM-tensorized model, update tensorizer
to version 2.9.0 ( #4208 )
2024-05-13 14:57:07 -07:00