diff --git a/benchmarks/README.md b/benchmarks/README.md
index 367ef934..edc10d8b 100644
--- a/benchmarks/README.md
+++ b/benchmarks/README.md
@@ -1,29 +1,181 @@
# Benchmarking vLLM
-## Downloading the ShareGPT dataset
+This README guides you through running benchmark tests with the range of
+datasets supported by vLLM. It is a living document, updated as new features and
+datasets become available.
-You can download the dataset by running:
+## Dataset Overview
+
+| Dataset | Online | Offline | Data Path |
+|---------|--------|---------|-----------|
+| ShareGPT | ✅ | ✅ | `wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json` |
+| BurstGPT | ✅ | ✅ | `wget https://github.com/HPMLL/BurstGPT/releases/download/v1.1/BurstGPT_without_fails_2.csv` |
+| Sonnet | ✅ | ✅ | Local file: `benchmarks/sonnet.txt` |
+| Random | ✅ | ✅ | synthetic |
+| HuggingFace | ✅ | 🚧 | Specify your dataset path on HuggingFace |
+| VisionArena | ✅ | 🚧 | `lmarena-ai/vision-arena-bench-v0.1` (a HuggingFace dataset) |
+
+✅: supported
+🚧: to be supported
+
+**Note**: for VisionArena, `--dataset-name` should be set to `hf`.
+
+---
+## Example - Online Benchmark
+
+First, start serving your model:
```bash
-wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
+MODEL_NAME="NousResearch/Hermes-3-Llama-3.1-8B"
+vllm serve ${MODEL_NAME} --disable-log-requests
```
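+
+Once the server is up, you can optionally sanity-check that it is reachable
+before benchmarking (assuming the default host and port, `localhost:8000`):
+
+```bash
+curl http://localhost:8000/v1/models
+```
+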
-## Downloading the ShareGPT4V dataset
-
-The json file refers to several image datasets (coco, llava, etc.). The benchmark scripts
-will ignore a datapoint if the referred image is missing.
+Then run the benchmarking script:
```bash
-wget https://huggingface.co/datasets/Lin-Chen/ShareGPT4V/resolve/main/sharegpt4v_instruct_gpt4-vision_cap100k.json
-mkdir coco -p
-wget http://images.cocodataset.org/zips/train2017.zip -O coco/train2017.zip
-unzip coco/train2017.zip -d coco/
+# download the dataset first if you have not already:
+# wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
+MODEL_NAME="NousResearch/Hermes-3-Llama-3.1-8B"
+NUM_PROMPTS=10
+BACKEND="openai-chat"
+DATASET_NAME="sharegpt"
+DATASET_PATH="ShareGPT_V3_unfiltered_cleaned_split.json"
+
+python3 benchmarks/benchmark_serving.py \
+ --backend "${BACKEND}" \
+ --model "${MODEL_NAME}" \
+ --endpoint "/v1/chat/completions" \
+ --dataset-name "${DATASET_NAME}" \
+ --dataset-path "${DATASET_PATH}" \
+ --num-prompts "${NUM_PROMPTS}"
```
-# Downloading the BurstGPT dataset
+If successful, you will see output similar to the following:
-You can download the BurstGPT v1.1 dataset by running:
+```
+============ Serving Benchmark Result ============
+Successful requests: 10
+Benchmark duration (s): 5.78
+Total input tokens: 1369
+Total generated tokens: 2212
+Request throughput (req/s): 1.73
+Output token throughput (tok/s): 382.89
+Total Token throughput (tok/s): 619.85
+---------------Time to First Token----------------
+Mean TTFT (ms): 71.54
+Median TTFT (ms): 73.88
+P99 TTFT (ms): 79.49
+-----Time per Output Token (excl. 1st token)------
+Mean TPOT (ms): 7.91
+Median TPOT (ms): 7.96
+P99 TPOT (ms): 8.03
+---------------Inter-token Latency----------------
+Mean ITL (ms): 7.74
+Median ITL (ms): 7.70
+P99 ITL (ms): 8.39
+==================================================
+```
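+
+To keep these metrics for later comparison, the script can also write them to a
+JSON file. This is a minimal sketch; it assumes the `--save-result` and
+`--result-filename` options of `benchmarks/benchmark_serving.py` and reuses the
+variables defined above:
+
+```bash
+python3 benchmarks/benchmark_serving.py \
+ --backend "${BACKEND}" \
+ --model "${MODEL_NAME}" \
+ --endpoint "/v1/chat/completions" \
+ --dataset-name "${DATASET_NAME}" \
+ --dataset-path "${DATASET_PATH}" \
+ --num-prompts "${NUM_PROMPTS}" \
+ --save-result \
+ --result-filename "sharegpt_benchmark.json"
+```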
+
+### VisionArena Benchmark for Vision Language Models
```bash
-wget https://github.com/HPMLL/BurstGPT/releases/download/v1.1/BurstGPT_without_fails_2.csv
+# VisionArena requires a model with vision capability
+vllm serve Qwen/Qwen2-VL-7B-Instruct --disable-log-requests
```
+
+```bash
+MODEL_NAME="Qwen/Qwen2-VL-7B-Instruct"
+NUM_PROMPTS=10
+BACKEND="openai-chat"
+DATASET_NAME="hf"
+DATASET_PATH="lmarena-ai/vision-arena-bench-v0.1"
+DATASET_SPLIT="train"
+
+python3 benchmarks/benchmark_serving.py \
+ --backend "${BACKEND}" \
+ --model "${MODEL_NAME}" \
+ --endpoint "/v1/chat/completions" \
+ --dataset-name "${DATASET_NAME}" \
+ --dataset-path "${DATASET_PATH}" \
+ --hf-split "${DATASET_SPLIT}" \
+ --num-prompts "${NUM_PROMPTS}"
+```
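+
+By default the script dispatches all requests immediately. To emulate a steadier
+arrival pattern, you can cap the request rate; this sketch assumes the
+`--request-rate` option (requests per second) of `benchmark_serving.py`:
+
+```bash
+python3 benchmarks/benchmark_serving.py \
+ --backend "${BACKEND}" \
+ --model "${MODEL_NAME}" \
+ --endpoint "/v1/chat/completions" \
+ --dataset-name "${DATASET_NAME}" \
+ --dataset-path "${DATASET_PATH}" \
+ --hf-split "${DATASET_SPLIT}" \
+ --num-prompts "${NUM_PROMPTS}" \
+ --request-rate 2
+```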
+
+---
+## Example - Offline Throughput Benchmark
+
+```bash
+MODEL_NAME="NousResearch/Hermes-3-Llama-3.1-8B"
+NUM_PROMPTS=10
+DATASET_NAME="sonnet"
+DATASET_PATH="benchmarks/sonnet.txt"
+
+python3 benchmarks/benchmark_throughput.py \
+ --model "${MODEL_NAME}" \
+ --dataset-name "${DATASET_NAME}" \
+ --dataset-path "${DATASET_PATH}" \
+ --num-prompts "${NUM_PROMPTS}"
+```
+
+If successful, you will see output similar to the following:
+
+```
+Throughput: 7.35 requests/s, 4789.20 total tokens/s, 1102.83 output tokens/s
+```
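+
+Per the table above, the offline benchmark can also run on synthetic data. This
+is a sketch that assumes `benchmark_throughput.py` accepts `--dataset-name random`
+together with `--input-len`/`--output-len` to control the synthetic prompt and
+completion lengths:
+
+```bash
+MODEL_NAME="NousResearch/Hermes-3-Llama-3.1-8B"
+NUM_PROMPTS=10
+
+python3 benchmarks/benchmark_throughput.py \
+ --model "${MODEL_NAME}" \
+ --dataset-name random \
+ --input-len 128 \
+ --output-len 128 \
+ --num-prompts "${NUM_PROMPTS}"
+```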
+
+### Benchmark with LoRA Adapters
+
+```bash
+MODEL_NAME="meta-llama/Llama-2-7b-hf"
+BACKEND="vllm"
+DATASET_NAME="sharegpt"
+DATASET_PATH="/home/jovyan/data/vllm_benchmark_datasets/ShareGPT_V3_unfiltered_cleaned_split.json"
+NUM_PROMPTS=10
+MAX_LORAS=2
+MAX_LORA_RANK=8
+ENABLE_LORA="--enable-lora"
+LORA_PATH="yard1/llama-2-7b-sql-lora-test"
+
+python3 benchmarks/benchmark_throughput.py \
+ --model "${MODEL_NAME}" \
+ --backend "${BACKEND}" \
+ --dataset-name "${DATASET_NAME}" \
+ --dataset-path "${DATASET_PATH}" \
+ --num-prompts "${NUM_PROMPTS}" \
+ --max-loras "${MAX_LORAS}" \
+ --max-lora-rank "${MAX_LORA_RANK}" \
+ ${ENABLE_LORA} \
+ --lora-path "${LORA_PATH}"
+```
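+
+Note that `${ENABLE_LORA}` is intentionally left unquoted: when the variable is
+empty, it expands to no argument at all, so setting `ENABLE_LORA=""` lets you
+rerun the same command without LoRA for a direct comparison.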