20231088/vllm - vllm - Luminance Code Repo

Author	SHA1	Message	Date
youkaichao	845a3f26f9	[Doc] add debugging tips for crash and multi-node debugging (#5581)	2024-06-17 10:08:01 +08:00
Li, Jiang	80aa7e91fc	[Hardware][Intel] Optimize CPU backend and add more performance tips (#4971) Co-authored-by: Jianan Gu <jianan.gu@intel.com>	2024-06-13 09:33:14 -07:00
Cyrus Leung	b8d4dfff9c	[Doc] Update debug docs (#5438)	2024-06-12 14:49:31 -07:00
Woosuk Kwon	1a8bfd92d5	[Hardware] Initial TPU integration (#5292)	2024-06-12 11:53:03 -07:00
youkaichao	8f89d72090	[Doc] add common case for long waiting time (#5430)	2024-06-11 11:12:13 -07:00
youkaichao	d8f31f2f8b	[Doc] add debugging tips (#5409)	2024-06-10 23:21:43 -07:00
Jie Fu (傅杰)	87d5abef75	[Bugfix] Fix a bug caused by pip install setuptools>=49.4.0 for CPU backend (#5249)	2024-06-04 09:57:51 -07:00
youkaichao	6a50f4cafa	[Doc] add ccache guide in doc (#5012) Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-05-23 23:21:54 +00:00
fuchen.ljl	ee37328da0	Unable to find Punica extension issue during source code installation (#4494) Co-authored-by: Simon Mo <simon.mo@hey.com>	2024-05-01 00:42:09 +00:00
Hongxia Yang	cf29b7eda4	[ROCm][Hardware][AMD][Doc] Documentation update for ROCm (#4376) Co-authored-by: WoosukKwon <woosuk.kwon@berkeley.edu>	2024-04-25 18:12:25 -07:00
Harry Mellor	3d925165f2	Add example scripts to documentation (#4225) Co-authored-by: Harry Mellor <hmellor@oxts.com>	2024-04-22 16:36:54 +00:00
youkaichao	f3d0bf7589	[Doc][Installation] delete python setup.py develop (#3989)	2024-04-11 03:33:02 +00:00
bigPYJ1151	0e3f06fe9c	[Hardware][Intel] Add CPU inference backend (#3634) Co-authored-by: Kunshang Ji <kunshang.ji@intel.com> Co-authored-by: Yuan Zhou <yuan.zhou@intel.com>	2024-04-01 22:07:30 -07:00
youkaichao	9c82a1bec3	[Doc] Update installation doc (#3746) [Doc] Update installation doc for build from source and explain the dependency on torch/cuda version (#3746) Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2024-03-30 16:34:38 -07:00
youkaichao	42bc386129	[CI/Build] respect the common environment variable MAX_JOBS (#3600)	2024-03-24 17:04:00 -07:00
Jim Burtoft	63e8b28a99	[Doc] minor fix of spelling in amd-installation.rst (#3506)	2024-03-19 20:32:30 +00:00
Jim Burtoft	2a60c9bd17	[Doc] minor fix to neuron-installation.rst (#3505)	2024-03-19 13:21:35 -07:00
Liangfu Chen	d0fae88114	[DOC] add setup document to support neuron backend (#2777)	2024-03-04 01:03:51 +00:00
Hongxia Yang	0580aab02f	[ROCm] support Radeon™ 7900 series (gfx1100) without using flash-attention (#2768)	2024-02-10 23:14:37 -08:00
Philipp Moritz	931746bc6d	Add documentation on how to do incremental builds (#2796)	2024-02-07 14:42:02 -08:00
Hongxia Yang	6b7de1a030	[ROCm] add support to ROCm 6.0 and MI300 (#2274)	2024-01-26 12:41:10 -08:00
Erfan Al-Hossami	9c1352eb57	[Feature] Simple API token authentication and pluggable middlewares (#1106)	2024-01-23 15:13:00 -08:00
Simon	827cbcd37c	Update quickstart.rst (#2369)	2024-01-12 12:56:18 -08:00
Zhuohan Li	f745847ef7	[Minor] Fix the format in quick start guide related to Model Scope (#2425)	2024-01-11 19:44:01 -08:00
Shivam Thakkar	1db83e31a2	[Docs] Update installation instructions to include CUDA 11.8 xFormers (#2246)	2023-12-22 23:20:02 -08:00
kliuae	1b7c791d60	[ROCm] Fixes for GPTQ on ROCm (#2180)	2023-12-18 10:41:04 -08:00
Woosuk Kwon	6565d9e33e	Update installation instruction for vLLM + CUDA 11.8 (#2086)	2023-12-13 09:25:59 -08:00
TJian	f375ec8440	[ROCm] Upgrade xformers version for ROCm & update doc (#2079) Co-authored-by: miloice <jeffaw99@hotmail.com>	2023-12-13 00:56:05 -08:00
TJian	6ccc0bfffb	Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836) Co-authored-by: Philipp Moritz <pcmoritz@gmail.com> Co-authored-by: Amir Balwel <amoooori04@gmail.com> Co-authored-by: root <kuanfu.liu@akirakan.com> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com> Co-authored-by: kuanfu <kuanfu.liu@embeddedllm.com> Co-authored-by: miloice <17350011+kliuae@users.noreply.github.com>	2023-12-07 23:16:52 -08:00
gottlike	42c02f5892	Fix quickstart.rst typo jinja (#1964)	2023-12-07 08:34:44 -08:00
Massimiliano Pronesti	c07a442854	chore(examples-docs): upgrade to OpenAI V1 (#1785)	2023-12-03 01:11:22 -08:00
Adam Brusselback	66785cc05c	Support chat template and `echo` for chat API (#1756)	2023-11-30 16:43:13 -08:00
Woosuk Kwon	06e9ebebd5	Add instructions to install vLLM+cu118 (#1717)	2023-11-18 23:48:58 -08:00
liuyhwangyh	edb305584b	Support download models from www.modelscope.cn (#1588)	2023-11-17 20:38:31 -08:00
Nick Perez	4ee52bb169	Docs: Fix broken link to openai example (#1145) Link to `openai_client.py` is no longer valid - updated to `openai_completion_client.py`	2023-09-22 11:36:09 -07:00
Woosuk Kwon	7d7e3b78a3	Use `--ipc=host` in docker run for distributed inference (#1125)	2023-09-21 18:26:47 -07:00
Woosuk Kwon	b9cecc2635	[Docs] Update installation page (#1005)	2023-09-10 14:23:31 -07:00
Woosuk Kwon	b7e62d3454	Fix repo & documentation URLs (#163)	2023-06-19 20:03:40 -07:00
Woosuk Kwon	dcda03b4cb	Write README and front page of doc (#147)	2023-06-18 03:19:38 -07:00
Zhuohan Li	bec7b2dc26	Add quickstart guide (#148)	2023-06-18 01:26:12 +08:00
Woosuk Kwon	0b98ba15c7	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00
Woosuk Kwon	e38074b1e6	Support FP32 (#141)	2023-06-07 00:40:21 -07:00
Woosuk Kwon	376725ce74	[PyPI] Packaging for PyPI distribution (#140)	2023-06-05 20:03:14 -07:00
Woosuk Kwon	56b7f0efa4	Add a doc for installation (#128)	2023-05-27 01:13:06 -07:00
Woosuk Kwon	19d2899439	Add initial sphinx docs (#120)	2023-05-22 17:02:44 -07:00

195 Commits