5 Commits

Author SHA1 Message Date
Varun Sundar Rabindranath
c2ec430ab5
[Core] Multi-Step + Single Step Prefills via Chunked Prefill code path (#8378)
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
2024-09-27 13:32:07 -07:00
afeldman-nm
a8c1d161a7
[Core] *Prompt* logprobs support in Multi-step (#8199) 2024-09-18 08:38:43 -07:00
afeldman-nm
e5cab71531
[Frontend] Add --logprobs argument to benchmark_serving.py (#8191) 2024-09-06 09:01:14 -07:00
afeldman-nm
428dd1445e
[Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
Alexander Matveev
9db93de20c
[Core] Add multi-step support to LLMEngine (#7789) 2024-08-23 12:45:53 -07:00