22 Commits

Author SHA1 Message Date
Roger Wang
f17a1a8f96
[Misc] Make Serving Benchmark More User-friendly (#5044) 2024-05-25 17:28:16 +00:00
Kuntai Du
c3af44722c
[Doc]Add documentation to benchmarking script when running TGI (#4920) 2024-05-20 20:16:57 +00:00
Roger Wang
7923dcad12
[Misc] Update ShareGPT Dataset Sampling in Serving Benchmark (#4279) 2024-04-24 09:49:13 -07:00
Chang Su
819a309c0f
[Bugfix] Fix args in benchmark_serving (#3836) 2024-04-04 07:41:05 +00:00
Co-authored-by: Roger Wang <ywang@roblox.com>
Roger Wang
45b6ef6513
feat(benchmarks): Add Prefix Caching Benchmark to Serving Benchmark (#3277) 2024-03-27 13:39:26 -07:00
SangBin Cho
01bfb22b41
[CI] Try introducing isort. (#3495) 2024-03-25 07:59:47 -07:00
Simon Mo
8e67598aa6
[Misc] fix line length for entire codebase (#3444) 2024-03-16 00:36:29 -07:00
TianYu GUO
1ece1ae829
[Minor Fix] Fix comments in benchmark_serving (#3252) 2024-03-07 22:22:59 -08:00
Massimiliano Pronesti
93dc5a2870
chore(vllm): codespell for spell checking (#2820) 2024-02-21 18:56:01 -08:00
Ronen Schaffer
d7f396486e
Update comment (#2934) 2024-02-21 18:18:37 -08:00
Roger Wang
a4211a4dc3
Serving Benchmark Refactoring (#2433) 2024-02-12 22:53:00 -08:00
Simon Mo
1e4277d2d1
lint: format all python file instead of just source code (#2567) 2024-01-23 15:53:06 -08:00
Harry Mellor
63e835cbcc
Fix progress bar and allow HTTPS in benchmark_serving.py (#2552) 2024-01-22 14:40:31 -08:00
Harry Mellor
2709c0009a
Support OpenAI API server in benchmark_serving.py (#2172) 2024-01-18 20:34:08 -08:00
Antoni Baum
acbed3ef40
Use monotonic time where appropriate (#1249) 2023-10-02 19:22:05 -07:00
Ricardo Lu
8c4b2592fb
fix: enable trust-remote-code in api server & benchmark. (#509) 2023-07-19 17:06:15 -07:00
Woosuk Kwon
4338cc4750
[Tokenizer] Add an option to specify tokenizer (#284) 2023-06-28 09:46:58 -07:00
Zhuohan Li
43710e8d09
[Fix] Fix default port number in benchmark scripts (#265) 2023-06-26 13:15:35 -07:00
Woosuk Kwon
3f92038b99
Add comments on swap space (#154) 2023-06-18 11:39:35 -07:00
Woosuk Kwon
0b98ba15c7
Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
Zhuohan Li
e5464ee484
Rename servers to engines (#152) 2023-06-17 17:25:21 +08:00
Woosuk Kwon
311490a720
Add script for benchmarking serving throughput (#145) 2023-06-14 19:55:38 -07:00