20231088/vllm

History

[Misc]: Add support for goodput on guided benchmarking + TPOT calculation refactor (#13736 )

Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>

2025-02-26 19:06:47 +08:00

cutlass_benchmarks

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

disagg_benchmarks

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

fused_kernels

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

kernels

[Misc] Improve LoRA spelling (#13831 )

2025-02-25 23:43:01 -08:00

overheads

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

structured_schemas

[Benchmark] Benchmark structured output with datasets (#10557 )

2024-12-03 17:21:06 -07:00

backend_request_func.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

benchmark_guided.py

[Bugfix] Fix benchmark script bug: inaccurate stats for vllm backend when max_model_len < input_len + output_len (#13691 )

2025-02-22 14:10:38 +08:00

benchmark_latency.py

Fix some issues with benchmark data output (#13641 )

2025-02-24 10:23:18 +08:00

benchmark_long_document_qa_throughput.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

benchmark_prefix_caching.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

benchmark_prioritization.py

[Bugfix] Fix benchmark script bug: inaccurate stats for vllm backend when max_model_len < input_len + output_len (#13691 )

2025-02-22 14:10:38 +08:00

benchmark_serving_guided.py

[Misc]: Add support for goodput on guided benchmarking + TPOT calculation refactor (#13736 )

2025-02-26 19:06:47 +08:00

benchmark_serving.py

Fix some issues with benchmark data output (#13641 )

2025-02-24 10:23:18 +08:00

benchmark_throughput.py

Fix some issues with benchmark data output (#13641 )

2025-02-24 10:23:18 +08:00

benchmark_utils.py

Fix some issues with benchmark data output (#13641 )

2025-02-24 10:23:18 +08:00

launch_tgi_server.sh

[CI/Build] Add shell script linting using shellcheck (#7925 )

2024-11-07 18:17:29 +00:00

README.md

[Benchmark] Add BurstGPT to benchmark_serving (#13063 )

2025-02-10 21:25:30 -08:00

sonnet.txt

feat(benchmarks): Add Prefix Caching Benchmark to Serving Benchmark (#3277 )

2024-03-27 13:39:26 -07:00

README.md

Benchmarking vLLM

Downloading the ShareGPT dataset

You can download the dataset by running:

wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json

Downloading the ShareGPT4V dataset

The json file refers to several image datasets (coco, llava, etc.). The benchmark scripts will ignore a datapoint if the referred image is missing.

wget https://huggingface.co/datasets/Lin-Chen/ShareGPT4V/resolve/main/sharegpt4v_instruct_gpt4-vision_cap100k.json
mkdir coco -p
wget http://images.cocodataset.org/zips/train2017.zip -O coco/train2017.zip
unzip coco/train2017.zip -d coco/

Downloading the BurstGPT dataset

You can download the BurstGPT v1.1 dataset by running:

wget https://github.com/HPMLL/BurstGPT/releases/download/v1.1/BurstGPT_without_fails_2.csv