sroy745 | ae151d73be | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00
Nick Hill | faf71bcd4b | [Speculative Decoding] Add ProposerWorkerBase abstract class (#5252) | 2024-06-05 14:53:05 -07:00
Cody Yu | ce532ff45c | [Speculative decoding] Improve n-gram efficiency (#4724) | 2024-05-13 15:00:13 -07:00
Cody Yu | bc8ad68455 | [Misc][Refactor] Introduce ExecuteModelData (#4540) | 2024-05-03 17:47:07 -07:00
SangBin Cho | 3521ba4f25 | [Core][Model runner refactoring 1/N] Refactor attn metadata term (#4518) | 2024-05-03 10:20:12 -07:00
leiwen83 | b38e42fbca | [Speculative decoding] Add ngram prompt lookup decoding (#4237) Co-authored-by: Lei Wen <wenlei03@qiyi.com> | 2024-05-01 11:13:03 -07:00