vllm/basic_correctness at d7740ea4dcee4ab75d7d6eef723f33cae957b288

20231088/vllm

Lily Liu 43c413ec57

[Kernel] Use flashinfer for decoding (#4353)

Co-authored-by: LiuXiaoxuanPKU <llilyliupku@gmail.com>

2024-05-03 15:51:27 -07:00

test_basic_correctness.py

[Kernel] Use flashinfer for decoding (#4353)

2024-05-03 15:51:27 -07:00

test_chunked_prefill.py

[Bug fix][Core] assert num_new_tokens == 1 fails when SamplingParams.n is not 1 and max_tokens is large & Add tests for preemption (#4451)

2024-05-01 19:24:13 -07:00

test_preemption.py

[Core] Ignore infeasible swap requests. (#4557)

2024-05-02 14:31:20 -07:00