6 Commits

Author SHA1 Message Date
Woosuk Kwon
e9d3f2ff77
Add memory analyzer & automatically configure KV cache size (#6) 2023-03-11 23:23:14 -08:00
Woosuk Kwon
1a7eb7da61
Support beam search & parallel generation (#7) 2023-03-10 09:58:21 -08:00
Woosuk Kwon
cbf8779afa
Fix a bug in tying OPT embeddings (#1) 2023-02-24 16:29:36 -08:00
Woosuk Kwon
1ce1333573
Set default dtype to half 2023-02-23 21:31:39 +00:00
Woosuk Kwon
608f74ffe5
Minor 2023-02-22 18:08:25 +00:00
Woosuk Kwon
709a69176e
Move worker/models -> models 2023-02-22 18:03:48 +00:00