99 Commits

Author SHA1 Message Date
Woosuk Kwon
d80aef3776
[Docs] Clean up latest news (#6401) 2024-07-12 19:36:53 -07:00
Saliya Ekanayake
a27f87da34
[Doc] Fix Typo in Doc (#6392)
Co-authored-by: Saliya Ekanayake <esaliya@d-matrix.ai>
2024-07-13 00:48:23 +00:00
Kuntai Du
a4feba929b
[CI/Build] Add nightly benchmarking for tgi, tensorrt-llm and lmdeploy (#5362) 2024-07-11 13:28:38 -07:00
youkaichao
2d23b42d92
[doc] update pipeline parallel in readme (#6347) 2024-07-11 11:38:40 -07:00
Jie Fu (傅杰)
439c84581a
[Doc] Update description of vLLM support for CPUs (#6003) 2024-07-10 21:15:29 -07:00
Kunshang Ji
cf90ae0123
[CI][Hardware][Intel GPU] add Intel GPU(XPU) ci pipeline (#5616) 2024-06-21 17:09:34 -07:00
Simon Mo
cdab68dcdb
[Docs] Add ZhenFund as a Sponsor (#5548) 2024-06-14 11:17:21 -07:00
Woosuk Kwon
a65634d3ae
[Docs] Add 4th meetup slides (#5509) 2024-06-13 10:18:26 -07:00
Li, Jiang
80aa7e91fc
[Hardware][Intel] Optimize CPU backend and add more performance tips (#4971)
Co-authored-by: Jianan Gu <jianan.gu@intel.com>
2024-06-13 09:33:14 -07:00
Woosuk Kwon
cb77ad836f
[Docs] Alphabetically sort sponsors (#5386) 2024-06-10 15:17:19 -05:00
Simon Mo
8f1729b829
[Docs] Add Ray Summit CFP (#5295) 2024-06-05 15:25:18 -07:00
Simon Mo
f270a39537
[Docs] Add Sequoia as sponsors (#5287) 2024-06-05 18:02:56 +00:00
Simon Mo
290f4ada2b
[Docs] Add Dropbox as sponsors (#5089) 2024-05-28 10:29:09 -07:00
Simon Mo
e941f88584
[Docs] Add acknowledgment for sponsors (#4925) 2024-05-21 00:17:25 -07:00
Zhuohan Li
361c461a12
[Doc] Highlight the fourth meetup in the README (#4842) 2024-05-15 11:38:49 -07:00
Simon Mo
29bc01bf3b
Add 4th meetup announcement to readme (#4817) 2024-05-14 18:33:06 -04:00
Zhuohan Li
ac1fbf7fd2
[Doc] Shorten README by removing supported model list (#4796) 2024-05-13 16:23:54 -07:00
Caio Mendes
bd7a8eef25
[Doc] README Phi-3 name fix. (#4372)
Co-authored-by: Caio Mendes <caiocesart@microsoft.com>
2024-04-25 10:32:00 -07:00
Isotr0py
fbf152d976
[Bugfix][Model] Refactor OLMo model to support new HF format in transformers 4.40.0 (#4324)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-04-25 09:35:56 -07:00
Caio Mendes
96e90fdeb3
[Model] Adds Phi-3 support (#4298) 2024-04-25 03:06:57 +00:00
Simon Mo
705578ae14
[Docs] document that Meta Llama 3 is supported (#4175) 2024-04-18 10:55:48 -07:00
Simon Mo
aceb17cf2d
[Docs] document that mixtral 8x22b is supported (#4073) 2024-04-14 14:35:55 -07:00
ywfang
b4543c8f6b
[Model] add minicpm (#3893) 2024-04-08 18:28:36 +08:00
Woosuk Kwon
b95047f2da
[Misc] Publish 3rd meetup slides (#3835) 2024-04-03 15:46:10 -07:00
Robert Shaw
76b889bf1d
[Doc] Update README.md (#3806) 2024-04-02 23:11:10 -07:00
wenyujin333
d6ea427f04
[Model] Add support for Qwen2MoeModel (#3346) 2024-03-28 15:19:59 +00:00
hxer7963
098e1776ba
[Model] Add support for xverse (#3610)
Co-authored-by: willhe <hexin@xverse.cn>
Co-authored-by: root <root@localhost.localdomain>
2024-03-27 18:12:54 -07:00
Woosuk Kwon
6d9aa00fc4
[Docs] Add Command-R to supported models (#3669) 2024-03-27 15:20:00 -07:00
Megha Agarwal
e24336b5a7
[Model] Add support for DBRX (#3660) 2024-03-27 13:01:46 -07:00
Lalit Pradhan
4c07dd28c0
[🚀 Ready to be merged] Added support for Jais models (#3183) 2024-03-21 09:45:24 +00:00
Zhuohan Li
b30880a762
[Misc] Update README for the Third vLLM Meetup (#3479) 2024-03-18 15:58:38 -07:00
Seonghyeon
bfdcfa6a05
Support starcoder2 architecture (#3089) 2024-02-29 00:51:48 -08:00
张大成
48a8f4a7fd
Support Orion model (#2539)
Co-authored-by: zhangdacheng <zhangdacheng@ainirobot.com>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
2024-02-26 19:17:06 -08:00
Zhuohan Li
a9c8212895
[FIX] Add Gemma model to the doc (#2966) 2024-02-21 09:46:15 -08:00
Isotr0py
ab3a5a8259
Support OLMo models. (#2832) 2024-02-18 21:05:15 -08:00
Simon Mo
bb8c697ee0
Update README for meetup slides (#2718) 2024-02-01 14:56:53 -08:00
Fengzhe Zhou
cd9e60c76c
Add Internlm2 (#2666) 2024-02-01 09:27:40 -08:00
Zhuohan Li
1af090b57d
Bump up version to v0.3.0 (#2656) 2024-01-31 00:07:07 -08:00
Hongxia Yang
6b7de1a030
[ROCm] add support to ROCm 6.0 and MI300 (#2274) 2024-01-26 12:41:10 -08:00
Junyang Lin
94b5edeb53
Add qwen2 (#2495) 2024-01-22 14:34:21 -08:00
Hyunsung Lee
e1957c6ebd
Add StableLM3B model (#2372) 2024-01-16 20:32:40 -08:00
Woosuk Kwon
2a18da257c
Announce the second vLLM meetup (#2444) 2024-01-15 14:11:59 -08:00
blueceiling
face83c7ec
[Docs] Add "About" Heading to README.md (#2260) 2023-12-25 16:37:07 -08:00
avideci
de60a3fb93
Added DeciLM-7b and DeciLM-7b-instruct (#2062) 2023-12-19 02:29:33 -08:00
Woosuk Kwon
f8c688d746
[Minor] Add Phi 2 to supported models (#2159) 2023-12-17 02:54:57 -08:00
Woosuk Kwon
26c52a5ea6
[Docs] Add CUDA graph support to docs (#2148) 2023-12-17 01:49:20 -08:00
Woosuk Kwon
b81a6a6bb3
[Docs] Add supported quantization methods to docs (#2135) 2023-12-15 13:29:22 -08:00
Antoni Baum
21d93c140d
Optimize Mixtral with expert parallelism (#2090) 2023-12-13 23:55:07 -08:00
Woosuk Kwon
31d2ab4aff
Remove python 3.10 requirement (#2040) 2023-12-11 12:26:42 -08:00
Ram
2eaa81b236
Update README.md to add megablocks requirement for mixtral (#2033) 2023-12-11 11:37:34 -08:00