3 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| sasha0552 | 9b9a10d6cb | [Frontend] Dynamic RoPE scaling (#4638) | 2024-05-22 01:32:35 -04:00 |
| Antoni Baum | 69e1d2fb69 | [Core] Refactor model loading code (#4097) | 2024-04-16 11:34:39 -07:00 |
| 陈序 | 54be8a0be2 | Fix assertion failure in Qwen 1.5 with prefix caching enabled (#3373). Co-authored-by: Cade Daniel <edacih@gmail.com> | 2024-03-14 13:56:57 -07:00 |