20231088/vllm

History

Deprecate best_of Sampling Parameter in anticipation for vLLM V1 (#13997 )

Signed-off-by: vincent-4 <vincentzhongy+githubvincent4@gmail.com>
Signed-off-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Brayden Zhong <b8zhong@uwaterloo.ca>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

2025-03-05 20:22:43 +00:00

cutlass_benchmarks

Update deprecated Python 3.8 typing (#13971 )

2025-03-02 17:34:51 -08:00

disagg_benchmarks

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

fused_kernels

Update deprecated Python 3.8 typing (#13971 )

2025-03-02 17:34:51 -08:00

kernels

[Misc] Add Qwen2MoeForCausalLM moe tuning support (#14276 )

2025-03-05 23:11:29 +08:00

overheads

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

structured_schemas

[Benchmark] Benchmark structured output with datasets (#10557 )

2024-12-03 17:21:06 -07:00

backend_request_func.py

Deprecate best_of Sampling Parameter in anticipation for vLLM V1 (#13997 )

2025-03-05 20:22:43 +00:00

benchmark_guided.py

Update deprecated Python 3.8 typing (#13971 )

2025-03-02 17:34:51 -08:00

benchmark_latency.py

Update deprecated Python 3.8 typing (#13971 )

2025-03-02 17:34:51 -08:00

benchmark_long_document_qa_throughput.py

[Misc] Add SPDX-License-Identifier headers to python source files (#12628 )

2025-02-02 11:58:18 -08:00

benchmark_prefix_caching.py

Update deprecated Python 3.8 typing (#13971 )

2025-03-02 17:34:51 -08:00

benchmark_prioritization.py

Update deprecated Python 3.8 typing (#13971 )

2025-03-02 17:34:51 -08:00

benchmark_serving_guided.py

Update deprecated Python 3.8 typing (#13971 )

2025-03-02 17:34:51 -08:00

benchmark_serving.py

Deprecate best_of Sampling Parameter in anticipation for vLLM V1 (#13997 )

2025-03-05 20:22:43 +00:00

benchmark_throughput.py

Update deprecated Python 3.8 typing (#13971 )

2025-03-02 17:34:51 -08:00

benchmark_utils.py

Update deprecated Python 3.8 typing (#13971 )

2025-03-02 17:34:51 -08:00

launch_tgi_server.sh

[CI/Build] Add shell script linting using shellcheck (#7925 )

2024-11-07 18:17:29 +00:00

README.md

[Benchmark] Add BurstGPT to benchmark_serving (#13063 )

2025-02-10 21:25:30 -08:00

sonnet.txt

feat(benchmarks): Add Prefix Caching Benchmark to Serving Benchmark (#3277 )

2024-03-27 13:39:26 -07:00

README.md

Benchmarking vLLM

Downloading the ShareGPT dataset

You can download the dataset by running:

wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json

Downloading the ShareGPT4V dataset

The json file refers to several image datasets (coco, llava, etc.). The benchmark scripts will ignore a datapoint if the referred image is missing.

wget https://huggingface.co/datasets/Lin-Chen/ShareGPT4V/resolve/main/sharegpt4v_instruct_gpt4-vision_cap100k.json
mkdir coco -p
wget http://images.cocodataset.org/zips/train2017.zip -O coco/train2017.zip
unzip coco/train2017.zip -d coco/

Downloading the BurstGPT dataset

You can download the BurstGPT v1.1 dataset by running:

wget https://github.com/HPMLL/BurstGPT/releases/download/v1.1/BurstGPT_without_fails_2.csv