vllm/test_cpu_offload.py at e165528778d1bfeb8e9bd8a33d6cd64fb6c78e4e - vllm - Luminance Code Repo

20231088/vllm

Michael Goin e165528778

[CI] Move quantization cpu offload tests out of fastcheck (#7574)

2024-08-15 21:16:20 -07:00

7 lines

176 B

Python


from ..utils import compare_two_settings


def test_cpu_offload():
    # Compare a default launch against one with 4 GiB of weights offloaded to CPU.
    compare_two_settings("meta-llama/Llama-2-7b-hf", [],
                         ["--cpu-offload-gb", "4"])