vllm/.buildkite/test-pipeline.yaml

# In this file, you can add more tests to run, either by adding a new step or
# by adding a new command to an existing step. See the steps below for examples
# of the available options. This file is fed into the Jinja template in
# `test-template.j2` to generate the final pipeline YAML file.
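#
# A minimal sketch of a new step, left commented out so it does not run.
# The label and the `my_feature` test directory are hypothetical; every key
# shown is one that already appears in the steps below.
#
# - label: My Feature Test
#   working_dir: "/vllm-workspace/tests" # optional
#   num_gpus: 2 # optional
#   commands:
#   - pytest -v -s my_feature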

steps:
- label: Regression Test
  command: pytest -v -s test_regression.py
  working_dir: "/vllm-workspace/tests" # optional

- label: AsyncEngine Test
  command: pytest -v -s async_engine

- label: Basic Correctness Test
  command: pytest -v -s --forked basic_correctness

- label: Core Test
  command: pytest -v -s core

- label: Distributed Comm Ops Test
  command: pytest -v -s --forked test_comm_ops.py
  working_dir: "/vllm-workspace/tests/distributed"
  num_gpus: 2 # only 1 or 2 GPUs are supported for now.

- label: Distributed Tests
  working_dir: "/vllm-workspace/tests/distributed"
  num_gpus: 2 # only 1 or 2 GPUs are supported for now.
  commands:
  - pytest -v -s --forked test_pynccl.py
  - TEST_DIST_MODEL=facebook/opt-125m pytest -v -s --forked test_basic_distributed_correctness.py
  - TEST_DIST_MODEL=meta-llama/Llama-2-7b-hf pytest -v -s --forked test_basic_distributed_correctness.py

- label: Engine Test
  command: pytest -v -s engine tokenization test_sequence.py test_config.py

- label: Entrypoints Test
  command: pytest -v -s entrypoints

- label: Kernels Test %N
  command: pytest -v -s kernels --shard-id=$$BUILDKITE_PARALLEL_JOB --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT
  parallelism: 4

- label: Models Test
  commands:
    - bash ../.buildkite/download-images.sh
    - pytest -v -s models --ignore=models/test_llava.py --forked
  soft_fail: true

- label: Llava Test
  commands:
    - bash ../.buildkite/download-images.sh
    - pytest -v -s models/test_llava.py

- label: Prefix Caching Test
  commands:
    - pytest -v -s prefix_caching

- label: Samplers Test
  command: pytest -v -s samplers

- label: LogitsProcessor Test
  command: pytest -v -s test_logits_processor.py

- label: Worker Test
  command: pytest -v -s worker

- label: Speculative decoding tests
  command: pytest -v -s spec_decode

- label: LoRA Test %N
  command: pytest -v -s lora --shard-id=$$BUILDKITE_PARALLEL_JOB --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT
  parallelism: 4

- label: Metrics Test
  command: pytest -v -s metrics

- label: Benchmarks
  working_dir: "/vllm-workspace/.buildkite"
  commands:
  - pip install aiohttp
  - bash run-benchmarks.sh

- label: Documentation Build
  working_dir: "/vllm-workspace/docs"
  no_gpu: True
  commands:
  - pip install -r requirements-docs.txt
  - SPHINXOPTS=\"-W\" make html