7 Commits

Author SHA1 Message Date
Cyrus Leung
eeec9e3390
[Frontend] Separate pooling APIs in offline inference (#11129)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-13 10:40:07 +00:00
Cyrus Leung
8c6de96ea1
[Model] Explicit interface for vLLM models and support OOT embedding models (#9108) 2024-10-07 06:10:35 +00:00
Roger Wang
26aa325f4f
[Core][VLM] Test registration for OOT multimodal models (#8717)
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-10-04 10:38:25 -07:00
youkaichao
ea49e6a3c8
[misc][ci] fix cpu test with plugins (#7489) 2024-08-13 19:27:46 -07:00
youkaichao
16422ea76f
[misc][plugin] add plugin system implementation (#7426) 2024-08-13 16:24:17 -07:00
Cyrus Leung
7025b11d94
[Bugfix] Fix weight loading for Chameleon when TP>1 (#7410) 2024-08-13 05:33:41 +00:00
youkaichao
95baec828f
[Core] enable out-of-tree model register (#3871) 2024-04-06 17:11:41 -07:00