vllm/core at 813f249f022a44aded2a843f0c7108ea0b7d1f6b - vllm - Luminance Code Repo

20231088/vllm

Sungjae Lee c31d4a57a6

[Core] support LoRA and prompt adapter in content-based hashing for Block Manager v2 prefix caching (#8240)

2024-12-13 07:51:25 -08:00

__init__.py

[Tests] Add block manager and scheduler tests (#3108)

2024-03-05 18:23:34 -08:00

test_chunked_prefill_scheduler.py

[Bugfix] Fix for Spec model TP + Chunked Prefill (#10232)

2024-11-26 09:11:16 -08:00

test_num_computed_tokens_update.py

[Core] Deprecating block manager v1 and make block manager v2 default (#8704)

2024-10-17 11:38:15 -05:00

test_scheduler_encoder_decoder.py

[Misc] Split up pooling tasks (#10820)

2024-12-11 01:28:00 -08:00

test_scheduler.py

Prefix Cache Aware Scheduling [1/n] (#10128)

2024-11-22 21:15:55 -08:00

test_serialization.py

[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)

2024-08-18 17:57:20 -07:00

utils.py

[Core] support LoRA and prompt adapter in content-based hashing for Block Manager v2 prefix caching (#8240)

2024-12-13 07:51:25 -08:00