sroy745 | f3a507f1d3 | [Core] Add an environment variable which needs to be set explicitly to allow BlockSpaceManagerV1 (#9149) | 2024-10-10 14:17:17 +08:00
sroy745 | fc3afc20df | Fix tests in test_chunked_prefill_scheduler which fail with BlockManager V2 (#8752) | 2024-09-24 21:26:36 -07:00
Cody Yu | e3580537a4 | [Performance] Enable chunked prefill and prefix caching together (#7753) | 2024-08-28 00:36:31 -07:00
Megha Agarwal | 2eedede875 | [Core] Asynchronous Output Processor (#7049) Co-authored-by: Alexander Matveev <alexm@neuralmagic.com> | 2024-08-26 20:53:20 -07:00
Cyrus Leung | 0e9164b40a | [mypy] Enable type checking for test directory (#5017) | 2024-06-15 04:45:31 +00:00
SangBin Cho | 847cdcca1c | [CI] Upgrade codespell version. (#5381) | 2024-06-12 10:06:14 -07:00
youkaichao | 20cfcdec99 | [Core][Optimization] change python dict to pytorch tensor for blocks to swap (#4659) | 2024-05-08 12:07:05 -07:00
SangBin Cho | 0f8a91401c | [Core] Ignore infeasible swap requests. (#4557) | 2024-05-02 14:31:20 -07:00
SangBin Cho | 67b4221a61 | [Core][5/N] Fully working chunked prefill e2e (#3884) | 2024-04-10 17:56:48 -07:00
SangBin Cho | 18de883489 | [Chunked Prefill][4/n] Chunked prefill scheduler. (#3853) | 2024-04-05 10:17:58 -07:00