Yangshen⚡Deng
6a512a00df
[model] Support for Llava-Next-Video model ( #7559 )
...
Co-authored-by: Roger Wang <ywang@roblox.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-09-10 22:21:36 -07:00
Daniele
6234385f4a
[CI/Build] enable ccache/scccache for HIP builds ( #8327 )
2024-09-10 08:55:08 -07:00
tomeras91
c02638efb3
[CI/Build] make pip install vllm work in macos (for import only) ( #8118 )
2024-09-03 12:37:08 -07:00
Roger Wang
5b86b19954
[Misc] Optional installation of audio related packages ( #8063 )
2024-09-01 14:46:57 -07:00
sasha0552
1b32e02648
[Bugfix] Pass PYTHONPATH from setup.py to CMake ( #7730 )
2024-08-21 11:17:48 -07:00
Kunshang Ji
1a36287b89
[Bugfix] Fix xpu build ( #7644 )
2024-08-18 22:00:09 -07:00
tomeras91
386087970a
[CI/Build] build on empty device for better dev experience ( #4773 )
2024-08-11 13:09:44 -07:00
Ilya Lavrenov
80cbe10c59
[OpenVINO] migrate to latest dependencies versions ( #7251 )
2024-08-07 09:49:10 -07:00
Lucas Wilkinson
a8d604ca2a
[Misc] Disambiguate quantized types via a new ScalarType ( #6396 )
2024-08-02 13:51:58 -07:00
Michael Goin
b482b9a5b1
[CI/Build] Add support for Python 3.12 ( #7035 )
2024-08-02 13:51:22 -07:00
Jee Jee Li
7ecee34321
[Kernel][RFC] Refactor the punica kernel based on Triton ( #5036 )
2024-07-31 17:12:24 -07:00
Ethan Xu
dbfe254eda
[Feature] vLLM CLI ( #5090 )
...
Co-authored-by: simon-mo <simon.mo@hey.com>
2024-07-14 15:36:43 -07:00
youkaichao
ccd3c04571
[ci][build] fix commit id ( #6420 )
...
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2024-07-14 22:16:21 +08:00
Michael Goin
111fc6e7ec
[Misc] Add generated git commit hash as vllm.__commit__ ( #6386 )
2024-07-12 22:52:15 +00:00
Ilya Lavrenov
57f09a419c
[Hardware][Intel] OpenVINO vLLM backend ( #5379 )
2024-06-28 13:50:16 +00:00
Kunshang Ji
728c4c8a06
[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend ( #3814 )
...
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com>
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
2024-06-17 11:01:25 -07:00
Cyrus Leung
03dccc886e
[Misc] Add vLLM version getter to utils ( #5098 )
2024-06-13 11:21:39 -07:00
Kevin H. Luu
916d219d62
[ci] Use sccache to build images ( #5419 )
...
Signed-off-by: kevin <kevin@anyscale.com>
2024-06-12 17:58:12 -07:00
Woosuk Kwon
1a8bfd92d5
[Hardware] Initial TPU integration ( #5292 )
2024-06-12 11:53:03 -07:00
Woosuk Kwon
8bab4959be
[Misc] Remove VLLM_BUILD_WITH_NEURON env variable ( #5389 )
2024-06-11 00:37:56 -07:00
bnellnm
5467ac3196
[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops ( #5047 )
2024-06-09 16:23:30 -04:00
Divakar Verma
a66cf40b20
[Kernel][ROCm][AMD] enable fused topk_softmax kernel for moe layer ( #4927 )
...
This PR enables the fused topk_softmax kernel used in moe layer for HIP
2024-06-02 14:13:26 -07:00
Daniele
a360ff80bb
[CI/Build] CMakeLists: build all extensions' cmake targets at the same time ( #5034 )
2024-05-31 22:06:45 -06:00
youkaichao
5bd3c65072
[Core][Optimization] remove vllm-nccl ( #5091 )
2024-05-29 05:13:52 +00:00
Sanger Steel
8bc68e198c
[Frontend] [Core] perf: Automatically detect vLLM-tensorized model, update tensorizer to version 2.9.0 ( #4208 )
2024-05-13 14:57:07 -07:00
kliuae
ff5abcd746
[ROCm] Add support for Punica kernels on AMD GPUs ( #3140 )
...
Co-authored-by: miloice <jeffaw99@hotmail.com>
2024-05-09 09:19:50 -07:00
Woosuk Kwon
89579a201f
[Misc] Use vllm-flash-attn instead of flash-attn ( #4686 )
2024-05-08 13:15:34 -07:00
youkaichao
344bf7cd2d
[Misc] add installation time env vars ( #4574 )
2024-05-03 15:55:56 -07:00
Hu Dong
5ad60b0cbd
[Misc] Exclude the tests directory from being packaged ( #4552 )
2024-05-02 10:50:25 -07:00
Travis Johnson
8b798eec75
[CI/Build][Bugfix] VLLM_USE_PRECOMPILED should skip compilation ( #4534 )
...
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
2024-05-01 18:01:50 +00:00
Alpay Ariyak
715c2d854d
[Frontend] [Core] Tensorizer: support dynamic num_readers, update version ( #4467 )
2024-04-30 16:32:13 -07:00
SangBin Cho
a88081bf76
[CI] Disable non-lazy string operation on logging ( #4326 )
...
Co-authored-by: Danny Guinther <dguinther@neuralmagic.com>
2024-04-26 00:16:58 -07:00
Liangfu Chen
cd2f63fb36
[CI/CD] add neuron docker and ci test scripts ( #3571 )
2024-04-18 15:26:01 -07:00
Nick Hill
563c54f760
[BugFix] Fix tensorizer extra in setup.py ( #4072 )
2024-04-14 14:12:42 -07:00
Sanger Steel
711a000255
[Frontend] [Core] feat: Add model loading using tensorizer ( #3476 )
2024-04-13 17:13:01 -07:00
Michael Feil
c2b4a1bce9
[Doc] Add typing hints / mypy types cleanup ( #3816 )
...
Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>
2024-04-11 17:17:21 -07:00
Woosuk Kwon
cfaf49a167
[Misc] Define common requirements ( #3841 )
2024-04-05 00:39:17 -07:00
youkaichao
ca81ff5196
[Core] manage nccl via a pypi package & upgrade to pt 2.2.1 ( #3805 )
2024-04-04 10:26:19 -07:00
bigPYJ1151
0e3f06fe9c
[Hardware][Intel] Add CPU inference backend ( #3634 )
...
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>
2024-04-01 22:07:30 -07:00
youkaichao
3492859b68
[CI/Build] update default number of jobs and nvcc threads to avoid overloading the system ( #3675 )
2024-03-28 00:18:54 -04:00
youkaichao
8f44facddd
[Core] remove cupy dependency ( #3625 )
2024-03-27 00:33:26 -07:00
SangBin Cho
01bfb22b41
[CI] Try introducing isort. ( #3495 )
2024-03-25 07:59:47 -07:00
youkaichao
42bc386129
[CI/Build] respect the common environment variable MAX_JOBS ( #3600 )
2024-03-24 17:04:00 -07:00
Zhuohan Li
523e30ea0c
[BugFix] Hot fix in setup.py for neuron build ( #3537 )
2024-03-20 17:59:52 -07:00
bnellnm
ba8ae1d84f
Check for _is_cuda() in compute_num_jobs ( #3481 )
2024-03-20 10:06:56 -07:00
bnellnm
9fdf3de346
Cmake based build system ( #2830 )
2024-03-18 15:38:33 -07:00
Woosuk Kwon
abfc4f3387
[Misc] Use dataclass for InputMetadata ( #3452 )
...
Co-authored-by: youkaichao <youkaichao@126.com>
2024-03-17 10:02:46 +00:00
Simon Mo
6b78837b29
Fix setup.py neuron-ls issue ( #2671 )
2024-03-16 16:00:25 -07:00
Simon Mo
8e67598aa6
[Misc] fix line length for entire codebase ( #3444 )
2024-03-16 00:36:29 -07:00
youkaichao
604f235937
[Misc] add error message in non linux platform ( #3438 )
2024-03-15 21:21:37 +00:00