171 Commits

Author SHA1 Message Date
Michael Goin
b482b9a5b1
[CI/Build] Add support for Python 3.12 (#7035) 2024-08-02 13:51:22 -07:00
Jee Jee Li
7ecee34321
[Kernel][RFC] Refactor the punica kernel based on Triton (#5036) 2024-07-31 17:12:24 -07:00
Ilya Lavrenov
5895b24677
[OpenVINO] Updated OpenVINO requirements and build docs (#6948) 2024-07-30 11:33:01 -07:00
Woosuk Kwon
fad5576c58
[TPU] Reduce compilation time & Upgrade PyTorch XLA version (#6856) 2024-07-27 10:28:33 -07:00
omrishiv
3c3012398e
[Doc] add VLLM_TARGET_DEVICE=neuron to documentation for neuron (#6844)
Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
2024-07-26 20:20:16 -07:00
Woosuk Kwon
ced36cd89b
[ROCm] Upgrade PyTorch nightly version (#6845) 2024-07-26 20:16:13 -07:00
Li, Jiang
3bbb4936dc
[Hardware] [Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125) 2024-07-26 13:50:10 -07:00
youkaichao
85ad7e2d01
[doc][debugging] add known issues for hangs (#6816) 2024-07-25 21:48:05 -07:00
Hongxia Yang
d88c458f44
[Doc][AMD][ROCm]Added tips to refer to mi300x tuning guide for mi300x users (#6754) 2024-07-24 14:32:57 -07:00
Woosuk Kwon
ccc4a73257
[Docs][ROCm] Detailed instructions to build from source (#6680) 2024-07-24 01:07:23 -07:00
Matt Wong
06d6c5fe9f
[Bugfix][CI/Build][Hardware][AMD] Fix AMD tests, add HF cache, update CK FA, add partially supported model notes (#6543) 2024-07-20 09:39:07 -07:00
Simon Mo
30efe41532
[Docs] Update docs for wheel location (#6580) 2024-07-19 12:14:11 -07:00
Cyrus Leung
5bf35a91e4
[Doc][CI/Build] Update docs and tests to use vllm serve (#6431) 2024-07-17 07:43:21 +00:00
Hongxia Yang
10383887e0
[ROCm] Cleanup Dockerfile and remove outdated patch (#6482) 2024-07-16 22:47:02 -07:00
Woosuk Kwon
c467dff24f
[Hardware][TPU] Support MoE with Pallas GMM kernel (#6457) 2024-07-16 09:56:28 -07:00
youkaichao
9f4ccec761
[doc][misc] remind to cancel debugging environment variables (#6481)
[doc][misc] remind users to cancel debugging environment variables after debugging (#6481)
2024-07-16 09:45:30 -07:00
youkaichao
22e79ee8f3
[doc][misc] doc update (#6439) 2024-07-14 23:33:25 -07:00
Robert Cohn
61e85dbad8
[Doc] xpu backend requires running setvars.sh (#6393) 2024-07-14 17:10:11 -07:00
Simon Mo
d719ba24c5
Build some nightly wheels by default (#6380) 2024-07-12 13:56:59 -07:00
Jie Fu (傅杰)
439c84581a
[Doc] Update description of vLLM support for CPUs (#6003) 2024-07-10 21:15:29 -07:00
youkaichao
966fe72141
[doc][misc] bump up py version in installation doc (#6119) 2024-07-03 15:52:04 -07:00
Ilya Lavrenov
57f09a419c
[Hardware][Intel] OpenVINO vLLM backend (#5379) 2024-06-28 13:50:16 +00:00
Matt Wong
dd793d1de5
[Hardware][AMD][CI/Build][Doc] Upgrade to ROCm 6.1, Dockerfile improvements, test fixes (#5422) 2024-06-25 15:56:15 -07:00
youkaichao
c18ebfdd71
[doc][distributed] add both gloo and nccl tests (#5834) 2024-06-25 15:10:28 -04:00
Woosuk Kwon
8c00f9c15d
[Docs][TPU] Add installation tip for TPU (#5761) 2024-06-21 23:09:40 -07:00
Kunshang Ji
728c4c8a06
[Hardware][Intel GPU] Add Intel GPU(XPU) inference backend (#3814)
Co-authored-by: Jiang Li <jiang1.li@intel.com>
Co-authored-by: Abhilash Majumder <abhilash.majumder@intel.com>
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
2024-06-17 11:01:25 -07:00
youkaichao
845a3f26f9
[Doc] add debugging tips for crash and multi-node debugging (#5581) 2024-06-17 10:08:01 +08:00
Li, Jiang
80aa7e91fc
[Hardware][Intel] Optimize CPU backend and add more performance tips (#4971)
Co-authored-by: Jianan Gu <jianan.gu@intel.com>
2024-06-13 09:33:14 -07:00
Cyrus Leung
b8d4dfff9c
[Doc] Update debug docs (#5438) 2024-06-12 14:49:31 -07:00
Woosuk Kwon
1a8bfd92d5
[Hardware] Initial TPU integration (#5292) 2024-06-12 11:53:03 -07:00
youkaichao
8f89d72090
[Doc] add common case for long waiting time (#5430) 2024-06-11 11:12:13 -07:00
youkaichao
d8f31f2f8b
[Doc] add debugging tips (#5409) 2024-06-10 23:21:43 -07:00
Jie Fu (傅杰)
87d5abef75
[Bugfix] Fix a bug caused by pip install setuptools>=49.4.0 for CPU backend (#5249) 2024-06-04 09:57:51 -07:00
youkaichao
6a50f4cafa
[Doc] add ccache guide in doc (#5012)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-05-23 23:21:54 +00:00
fuchen.ljl
ee37328da0
Unable to find Punica extension issue during source code installation (#4494)
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-05-01 00:42:09 +00:00
Hongxia Yang
cf29b7eda4
[ROCm][Hardware][AMD][Doc] Documentation update for ROCm (#4376)
Co-authored-by: WoosukKwon <woosuk.kwon@berkeley.edu>
2024-04-25 18:12:25 -07:00
Harry Mellor
3d925165f2
Add example scripts to documentation (#4225)
Co-authored-by: Harry Mellor <hmellor@oxts.com>
2024-04-22 16:36:54 +00:00
youkaichao
f3d0bf7589
[Doc][Installation] delete python setup.py develop (#3989) 2024-04-11 03:33:02 +00:00
bigPYJ1151
0e3f06fe9c
[Hardware][Intel] Add CPU inference backend (#3634)
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>
2024-04-01 22:07:30 -07:00
youkaichao
9c82a1bec3
[Doc] Update installation doc (#3746)
[Doc] Update installation doc for build from source and explain the dependency on torch/cuda version (#3746)
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
2024-03-30 16:34:38 -07:00
youkaichao
42bc386129
[CI/Build] respect the common environment variable MAX_JOBS (#3600) 2024-03-24 17:04:00 -07:00
Jim Burtoft
63e8b28a99
[Doc] minor fix of spelling in amd-installation.rst (#3506) 2024-03-19 20:32:30 +00:00
Jim Burtoft
2a60c9bd17
[Doc] minor fix to neuron-installation.rst (#3505) 2024-03-19 13:21:35 -07:00
Liangfu Chen
d0fae88114
[DOC] add setup document to support neuron backend (#2777) 2024-03-04 01:03:51 +00:00
Hongxia Yang
0580aab02f
[ROCm] support Radeon™ 7900 series (gfx1100) without using flash-attention (#2768) 2024-02-10 23:14:37 -08:00
Philipp Moritz
931746bc6d
Add documentation on how to do incremental builds (#2796) 2024-02-07 14:42:02 -08:00
Hongxia Yang
6b7de1a030
[ROCm] add support to ROCm 6.0 and MI300 (#2274) 2024-01-26 12:41:10 -08:00
Erfan Al-Hossami
9c1352eb57
[Feature] Simple API token authentication and pluggable middlewares (#1106) 2024-01-23 15:13:00 -08:00
Simon
827cbcd37c
Update quickstart.rst (#2369) 2024-01-12 12:56:18 -08:00
Zhuohan Li
f745847ef7
[Minor] Fix the format in quick start guide related to Model Scope (#2425) 2024-01-11 19:44:01 -08:00