TJian
|
6ccc0bfffb
|
Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Amir Balwel <amoooori04@gmail.com>
Co-authored-by: root <kuanfu.liu@akirakan.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: kuanfu <kuanfu.liu@embeddedllm.com>
Co-authored-by: miloice <17350011+kliuae@users.noreply.github.com>
|
2023-12-07 23:16:52 -08:00 |
|
AguirreNicolas
|
24f60a54f4
|
[Docker] Adding number of nvcc_threads during build as envar (#1893)
|
2023-12-07 11:00:32 -08:00 |
|
gottlike
|
42c02f5892
|
Fix quickstart.rst typo jinja (#1964)
|
2023-12-07 08:34:44 -08:00 |
|
Peter Götz
|
d940ce497e
|
Fix typo in adding_model.rst (#1947)
adpated -> adapted
|
2023-12-06 10:04:26 -08:00 |
|
Massimiliano Pronesti
|
c07a442854
|
chore(examples-docs): upgrade to OpenAI V1 (#1785)
|
2023-12-03 01:11:22 -08:00 |
|
Simon Mo
|
5313c2cb8b
|
Add Production Metrics in Prometheus format (#1890)
|
2023-12-02 16:37:44 -08:00 |
|
Simon Mo
|
4cefa9b49b
|
[Docs] Update the AWQ documentation to highlight performance issue (#1883)
|
2023-12-02 15:52:47 -08:00 |
|
Woosuk Kwon
|
e5452ddfd6
|
Normalize head weights for Baichuan 2 (#1876)
|
2023-11-30 20:03:58 -08:00 |
|
Adam Brusselback
|
66785cc05c
|
Support chat template and echo for chat API (#1756)
|
2023-11-30 16:43:13 -08:00 |
|
Massimiliano Pronesti
|
05a38612b0
|
docs: add instruction for langchain (#1162)
|
2023-11-30 10:57:44 -08:00 |
|
Simon Mo
|
0f621c2c7d
|
[Docs] Add information about using shared memory in docker (#1845)
|
2023-11-29 18:33:56 -08:00 |
|
Casper
|
a921d8be9d
|
[DOCS] Add engine args documentation (#1741)
|
2023-11-22 12:31:27 -08:00 |
|
Wen Sun
|
112627e8b2
|
[Docs] Fix the code block's format in deploying_with_docker page (#1722)
|
2023-11-20 01:22:39 -08:00 |
|
Simon Mo
|
37c1e3c218
|
Documentation about official docker image (#1709)
|
2023-11-19 20:56:26 -08:00 |
|
Woosuk Kwon
|
06e9ebebd5
|
Add instructions to install vLLM+cu118 (#1717)
|
2023-11-18 23:48:58 -08:00 |
|
liuyhwangyh
|
edb305584b
|
Support download models from www.modelscope.cn (#1588)
|
2023-11-17 20:38:31 -08:00 |
|
Zhuohan Li
|
0fc280b06c
|
Update the adding-model doc according to the new refactor (#1692)
|
2023-11-16 18:46:26 -08:00 |
|
Zhuohan Li
|
415d109527
|
[Fix] Update Supported Models List (#1690)
|
2023-11-16 14:47:26 -08:00 |
|
Casper
|
8516999495
|
Add Quantization and AutoAWQ to docs (#1235)
|
2023-11-04 22:43:39 -07:00 |
|
Stephen Krider
|
9cabcb7645
|
Add Dockerfile (#1350)
|
2023-10-31 12:36:47 -07:00 |
|
Zhuohan Li
|
9eed4d1f3e
|
Update README.md (#1292)
|
2023-10-08 23:15:50 -07:00 |
|
Usama Ahmed
|
0967102c6d
|
fixing typo in tiiuae/falcon-rw-7b model name (#1226)
|
2023-09-29 13:40:25 -07:00 |
|
Woosuk Kwon
|
202351d5bf
|
Add Mistral to supported model list (#1221)
|
2023-09-28 14:33:04 -07:00 |
|
Nick Perez
|
4ee52bb169
|
Docs: Fix broken link to openai example (#1145)
Link to `openai_client.py` is no longer valid - updated to `openai_completion_client.py`
|
2023-09-22 11:36:09 -07:00 |
|
Woosuk Kwon
|
7d7e3b78a3
|
Use --ipc=host in docker run for distributed inference (#1125)
|
2023-09-21 18:26:47 -07:00 |
|
Tanmay Verma
|
6f2dd6c37e
|
Add documentation to Triton server tutorial (#983)
|
2023-09-20 10:32:40 -07:00 |
|
Woosuk Kwon
|
eda1a7cad3
|
Announce paper release (#1036)
|
2023-09-13 17:38:13 -07:00 |
|
Woosuk Kwon
|
b9cecc2635
|
[Docs] Update installation page (#1005)
|
2023-09-10 14:23:31 -07:00 |
|
Zhuohan Li
|
002800f081
|
Align vLLM's beam search implementation with HF generate (#857)
|
2023-09-04 17:29:42 -07:00 |
|
Woosuk Kwon
|
55b28b1eee
|
[Docs] Minor fixes in supported models (#920)
* Minor fix in supported models
* Add another small fix for Aquila model
---------
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
|
2023-08-31 16:28:39 -07:00 |
|
Zhuohan Li
|
14f9c72bfd
|
Update Supported Model List (#825)
|
2023-08-22 11:51:44 -07:00 |
|
Uranus
|
1b151ed181
|
Fix baichuan doc style (#748)
|
2023-08-13 20:57:31 -07:00 |
|
Zhuohan Li
|
f7389f4763
|
[Doc] Add Baichuan 13B to supported models (#656)
|
2023-08-02 16:45:12 -07:00 |
|
Zhuohan Li
|
1b0bd0fe8a
|
Add Falcon support (new) (#592)
|
2023-08-02 14:04:39 -07:00 |
|
Zhuohan Li
|
df5dd3c68e
|
Add Baichuan-7B to README (#494)
|
2023-07-25 15:25:12 -07:00 |
|
Zhuohan Li
|
6fc2a38b11
|
Add support for LLaMA-2 (#505)
|
2023-07-20 11:38:27 -07:00 |
|
Zhanghao Wu
|
58df2883cb
|
[Doc] Add doc for running vLLM on the cloud (#426)
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
|
2023-07-16 13:37:14 -07:00 |
|
Andre Slavescu
|
c894836108
|
[Model] Add support for GPT-J (#226)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2023-07-08 17:55:16 -07:00 |
|
Woosuk Kwon
|
ffa6d2f9f9
|
[Docs] Fix typo (#346)
|
2023-07-03 16:51:47 -07:00 |
|
Woosuk Kwon
|
404422f42e
|
[Model] Add support for MPT (#334)
|
2023-07-03 16:47:53 -07:00 |
|
Woosuk Kwon
|
e41f06702c
|
Add support for BLOOM (#331)
|
2023-07-03 13:12:35 -07:00 |
|
Zhuohan Li
|
2cf1a333b6
|
[Doc] Documentation for distributed inference (#261)
|
2023-06-26 11:34:23 -07:00 |
|
Woosuk Kwon
|
665c48963b
|
[Docs] Add GPTBigCode to supported models (#213)
|
2023-06-22 15:05:11 -07:00 |
|
Woosuk Kwon
|
794e578de0
|
[Minor] Fix URLs (#166)
|
2023-06-19 22:57:14 -07:00 |
|
Woosuk Kwon
|
caddfc14c1
|
[Minor] Fix icons in doc (#165)
|
2023-06-19 20:35:38 -07:00 |
|
Woosuk Kwon
|
b7e62d3454
|
Fix repo & documentation URLs (#163)
|
2023-06-19 20:03:40 -07:00 |
|
Woosuk Kwon
|
364536acd1
|
[Docs] Minor fix (#162)
|
2023-06-19 19:58:23 -07:00 |
|
Zhuohan Li
|
0b32a987dd
|
Add and list supported models in README (#161)
|
2023-06-20 10:57:46 +08:00 |
|
Zhuohan Li
|
a255885f83
|
Add logo and polish readme (#156)
|
2023-06-19 16:31:13 +08:00 |
|
Woosuk Kwon
|
dcda03b4cb
|
Write README and front page of doc (#147)
|
2023-06-18 03:19:38 -07:00 |
|