vllm/tests/tpu/test_custom_dispatcher.py

import os

from vllm.config import CompilationLevel

from ..utils import compare_two_settings

# On TPU, --enforce-eager causes graph compilation, which can exceed
# the default health-check timeout in the MQLLMEngine, so raise the
# RPC timeout to 30 s (the value is given in milliseconds).
os.environ["VLLM_RPC_TIMEOUT"] = "30000"


def test_custom_dispatcher():
    """Compare google/gemma-2b run with DYNAMO_ONCE against DYNAMO_AS_IS."""
    compare_two_settings(
        "google/gemma-2b",
        arg1=["--enforce-eager", "-O",
              str(CompilationLevel.DYNAMO_ONCE)],
        arg2=["--enforce-eager", "-O",
              str(CompilationLevel.DYNAMO_AS_IS)],
        env1={},
        env2={})
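
The module-level assignment to `os.environ` above stays in effect for the rest of the test process. If that is a concern, the timeout could be set for a limited scope instead. The following is a minimal sketch only: `scoped_env` is a hypothetical helper introduced here for illustration, not part of vLLM or of this test file.

```python
import os
from contextlib import contextmanager


@contextmanager
def scoped_env(name: str, value: str):
    """Temporarily set an environment variable (hypothetical helper)."""
    previous = os.environ.get(name)  # remember the old value, if any
    os.environ[name] = value
    try:
        yield
    finally:
        # Restore the previous state, removing the variable if it was unset.
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous
```

Under this assumption, the body of the test would run inside `with scoped_env("VLLM_RPC_TIMEOUT", "30000"):`, and the variable would return to its earlier state once the block exits.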
[Core][Bugfix][Perf] Introduce `MQLLMEngine` to avoid `asyncio` OH (#8157) Co-authored-by: Nick Hill <nickhill@us.ibm.com> Co-authored-by: rshaw@neuralmagic.com <rshaw@neuralmagic.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com> Co-authored-by: Simon Mo <simon.mo@hey.com> 2024-09-18 09:56:58 -04:00: `import os`, the timeout comment, and the `VLLM_RPC_TIMEOUT` assignment
[2/N][torch.compile] make compilation cfg part of vllm cfg (#10383) Signed-off-by: youkaichao <youkaichao@gmail.com> 2024-11-16 18:02:14 -08:00: `from vllm.config import CompilationLevel`
[torch.compile] integration with compilation control (#9058) 2024-10-10 12:39:36 -07:00: `compare_two_settings(` and the `"google/gemma-2b"` argument
[torch.compile] avoid Dynamo guard evaluation overhead (#7898) Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> 2024-08-28 16:10:12 -07:00: `from ..utils import compare_two_settings` and `def test_custom_dispatcher():`
[6/N] torch.compile rollout to users (#10437) Signed-off-by: youkaichao <youkaichao@gmail.com> 2024-11-19 10:09:03 -08:00: the `arg1`, `arg2`, `env1`, and `env2` arguments