20231088/vllm - vllm - Luminance Code Repo

Author	SHA1	Message	Date
TJian	6ccc0bfffb	Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836) Co-authored-by: Philipp Moritz <pcmoritz@gmail.com> Co-authored-by: Amir Balwel <amoooori04@gmail.com> Co-authored-by: root <kuanfu.liu@akirakan.com> Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com> Co-authored-by: kuanfu <kuanfu.liu@embeddedllm.com> Co-authored-by: miloice <17350011+kliuae@users.noreply.github.com>	2023-12-07 23:16:52 -08:00
AguirreNicolas	24f60a54f4	[Docker] Adding number of nvcc_threads during build as envar (#1893)	2023-12-07 11:00:32 -08:00
gottlike	42c02f5892	Fix quickstart.rst typo jinja (#1964)	2023-12-07 08:34:44 -08:00
Peter Götz	d940ce497e	Fix typo in adding_model.rst (#1947) adpated -> adapted	2023-12-06 10:04:26 -08:00
Massimiliano Pronesti	c07a442854	chore(examples-docs): upgrade to OpenAI V1 (#1785)	2023-12-03 01:11:22 -08:00
Simon Mo	5313c2cb8b	Add Production Metrics in Prometheus format (#1890)	2023-12-02 16:37:44 -08:00
Simon Mo	4cefa9b49b	[Docs] Update the AWQ documentation to highlight performance issue (#1883)	2023-12-02 15:52:47 -08:00
Woosuk Kwon	e5452ddfd6	Normalize head weights for Baichuan 2 (#1876)	2023-11-30 20:03:58 -08:00
Adam Brusselback	66785cc05c	Support chat template and `echo` for chat API (#1756)	2023-11-30 16:43:13 -08:00
Massimiliano Pronesti	05a38612b0	docs: add instruction for langchain (#1162)	2023-11-30 10:57:44 -08:00
Simon Mo	0f621c2c7d	[Docs] Add information about using shared memory in docker (#1845)	2023-11-29 18:33:56 -08:00
Casper	a921d8be9d	[DOCS] Add engine args documentation (#1741)	2023-11-22 12:31:27 -08:00
Wen Sun	112627e8b2	[Docs] Fix the code block's format in deploying_with_docker page (#1722)	2023-11-20 01:22:39 -08:00
Simon Mo	37c1e3c218	Documentation about official docker image (#1709)	2023-11-19 20:56:26 -08:00
Woosuk Kwon	06e9ebebd5	Add instructions to install vLLM+cu118 (#1717)	2023-11-18 23:48:58 -08:00
liuyhwangyh	edb305584b	Support download models from www.modelscope.cn (#1588)	2023-11-17 20:38:31 -08:00
Zhuohan Li	0fc280b06c	Update the adding-model doc according to the new refactor (#1692)	2023-11-16 18:46:26 -08:00
Zhuohan Li	415d109527	[Fix] Update Supported Models List (#1690)	2023-11-16 14:47:26 -08:00
Casper	8516999495	Add Quantization and AutoAWQ to docs (#1235)	2023-11-04 22:43:39 -07:00
Stephen Krider	9cabcb7645	Add Dockerfile (#1350)	2023-10-31 12:36:47 -07:00
Zhuohan Li	9eed4d1f3e	Update README.md (#1292)	2023-10-08 23:15:50 -07:00
Usama Ahmed	0967102c6d	fixing typo in `tiiuae/falcon-rw-7b` model name (#1226)	2023-09-29 13:40:25 -07:00
Woosuk Kwon	202351d5bf	Add Mistral to supported model list (#1221)	2023-09-28 14:33:04 -07:00
Nick Perez	4ee52bb169	Docs: Fix broken link to openai example (#1145) Link to `openai_client.py` is no longer valid - updated to `openai_completion_client.py`	2023-09-22 11:36:09 -07:00
Woosuk Kwon	7d7e3b78a3	Use `--ipc=host` in docker run for distributed inference (#1125)	2023-09-21 18:26:47 -07:00
Tanmay Verma	6f2dd6c37e	Add documentation to Triton server tutorial (#983)	2023-09-20 10:32:40 -07:00
Woosuk Kwon	eda1a7cad3	Announce paper release (#1036)	2023-09-13 17:38:13 -07:00
Woosuk Kwon	b9cecc2635	[Docs] Update installation page (#1005)	2023-09-10 14:23:31 -07:00
Zhuohan Li	002800f081	Align vLLM's beam search implementation with HF generate (#857)	2023-09-04 17:29:42 -07:00
Woosuk Kwon	55b28b1eee	[Docs] Minor fixes in supported models (#920) * Minor fix in supported models * Add another small fix for Aquila model --------- Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2023-08-31 16:28:39 -07:00
Zhuohan Li	14f9c72bfd	Update Supported Model List (#825)	2023-08-22 11:51:44 -07:00
Uranus	1b151ed181	Fix baichuan doc style (#748)	2023-08-13 20:57:31 -07:00
Zhuohan Li	f7389f4763	[Doc] Add Baichuan 13B to supported models (#656)	2023-08-02 16:45:12 -07:00
Zhuohan Li	1b0bd0fe8a	Add Falcon support (new) (#592)	2023-08-02 14:04:39 -07:00
Zhuohan Li	df5dd3c68e	Add Baichuan-7B to README (#494)	2023-07-25 15:25:12 -07:00
Zhuohan Li	6fc2a38b11	Add support for LLaMA-2 (#505)	2023-07-20 11:38:27 -07:00
Zhanghao Wu	58df2883cb	[Doc] Add doc for running vLLM on the cloud (#426) Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>	2023-07-16 13:37:14 -07:00
Andre Slavescu	c894836108	[Model] Add support for GPT-J (#226) Co-authored-by: woWoosuk Kwon <woosuk.kwon@berkeley.edu>	2023-07-08 17:55:16 -07:00
Woosuk Kwon	ffa6d2f9f9	[Docs] Fix typo (#346)	2023-07-03 16:51:47 -07:00
Woosuk Kwon	404422f42e	[Model] Add support for MPT (#334)	2023-07-03 16:47:53 -07:00
Woosuk Kwon	e41f06702c	Add support for BLOOM (#331)	2023-07-03 13:12:35 -07:00
Zhuohan Li	2cf1a333b6	[Doc] Documentation for distributed inference (#261)	2023-06-26 11:34:23 -07:00
Woosuk Kwon	665c48963b	[Docs] Add GPTBigCode to supported models (#213)	2023-06-22 15:05:11 -07:00
Woosuk Kwon	794e578de0	[Minor] Fix URLs (#166)	2023-06-19 22:57:14 -07:00
Woosuk Kwon	caddfc14c1	[Minor] Fix icons in doc (#165)	2023-06-19 20:35:38 -07:00
Woosuk Kwon	b7e62d3454	Fix repo & documentation URLs (#163)	2023-06-19 20:03:40 -07:00
Woosuk Kwon	364536acd1	[Docs] Minor fix (#162)	2023-06-19 19:58:23 -07:00
Zhuohan Li	0b32a987dd	Add and list supported models in README (#161)	2023-06-20 10:57:46 +08:00
Zhuohan Li	a255885f83	Add logo and polish readme (#156)	2023-06-19 16:31:13 +08:00
Woosuk Kwon	dcda03b4cb	Write README and front page of doc (#147)	2023-06-18 03:19:38 -07:00

108 Commits