vllm/tests/compile/test_full_graph.py

import pytest

from vllm.compilation.backends import vllm_backend

from .utils import TEST_MODELS, check_full_graph_support


@pytest.mark.parametrize("model_info", TEST_MODELS)
@pytest.mark.parametrize("backend", ["eager", vllm_backend])
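def test_full_graph(model_info, backend):
    # Each TEST_MODELS entry holds the model at index 0 and its keyword
    # arguments at index 1.
    model = model_info[0]
    model_kwargs = model_info[1]
    # Check full-graph support for this model/backend pair with tp_size=1.
    check_full_graph_support(model, model_kwargs, backend, tp_size=1)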

# Commit history for this file (from the blame view):
# - register custom op for flash attn and use from torch.ops (#7536), 2024-08-15 22:38:56 -07:00
# - [torch.compile] fix functionalization (#8480), 2024-09-14 09:46:04 -07:00
# - [Kernel] Fullgraph and opcheck tests (#8479), 2024-09-25 10:35:52 -04:00