20231088/vllm - vllm - Luminance Code Repo

Author	SHA1	Message	Date
Antoni Baum	acbed3ef40	Use monotonic time where appropriate (#1249)	2023-10-02 19:22:05 -07:00
kg6-sleipnir	b5a10eb0ef	Added `dtype` arg to benchmarks (#1228)	2023-09-30 21:04:03 -07:00
Woosuk Kwon	e3e79e9e8a	Implement AWQ quantization support for LLaMA (#1032) Co-authored-by: Robert Irvine <robert@seamlessml.com> Co-authored-by: root <rirv938@gmail.com> Co-authored-by: Casper <casperbh.96@gmail.com> Co-authored-by: julian-q <julianhquevedo@gmail.com>	2023-09-16 00:03:37 -07:00
Ricardo Lu	8c4b2592fb	fix: enable trust-remote-code in api server & benchmark. (#509)	2023-07-19 17:06:15 -07:00
WRH	cf21a9bd5c	support trust_remote_code in benchmark (#518)	2023-07-19 17:02:40 -07:00
Woosuk Kwon	4338cc4750	[Tokenizer] Add an option to specify tokenizer (#284)	2023-06-28 09:46:58 -07:00
Woosuk Kwon	0b98ba15c7	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00
Zhuohan Li	e5464ee484	Rename servers to engines (#152)	2023-06-17 17:25:21 +08:00
Woosuk Kwon	bab8f3dd0d	[Minor] Fix benchmark_throughput.py (#151)	2023-06-16 21:00:52 -07:00
Woosuk Kwon	311490a720	Add script for benchmarking serving throughput (#145)	2023-06-14 19:55:38 -07:00
Woosuk Kwon	8274ca23ac	Add docstrings for LLM (#137)	2023-06-04 12:52:41 -07:00
Woosuk Kwon	211318d44a	Add throughput benchmarking script (#133)	2023-05-28 03:20:05 -07:00