Lu Fang | 8c0d15d5c5 | [Misc][Easy] Annotate unused vars in the csrc files (#14798) Signed-off-by: Lu Fang <lufang@fb.com> | 2025-03-15 12:40:09 +08:00 |
Elfie Guo | 0794e7446e | [Misc] Add multipstep chunked-prefill support for FlashInfer (#10467) | 2025-01-15 12:47:49 +08:00 |
Pavani Majety | b6dde33019 | [Core] Flashinfer - Remove advance step size restriction (#10282) | 2024-11-13 16:29:32 +08:00 |
Varun Sundar Rabindranath | afb050b29d | [Core] CUDA Graphs for Multi-Step + Chunked-Prefill (#8645) Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com> | 2024-10-02 19:44:39 +00:00 |
Varun Sundar Rabindranath | c2ec430ab5 | [Core] Multi-Step + Single Step Prefills via Chunked Prefill code path (#8378) Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com> | 2024-09-27 13:32:07 -07:00 |
Tyler Michael Smith | 71d21c73ab | [Bugfix] Fixup advance_step.cu warning (#8815) | 2024-09-26 16:23:45 -07:00 |
William Lin | a6c0f3658d | [multi-step] add flashinfer backend (#7928) | 2024-09-12 11:16:22 -07:00 |
Alexander Matveev | e76466dde2 | [Core] draft_model_runner: Implement prepare_inputs on GPU for advance_step (#6338) | 2024-07-17 14:30:28 -07:00 |