7 Commits

Author SHA1 Message Date
Woosuk Kwon
cfae35b861
Add miscellaneous updates (#8) 2023-03-13 13:48:38 -07:00
Woosuk Kwon
e9d3f2ff77
Add memory analyzer & automatically configure KV cache size (#6) 2023-03-11 23:23:14 -08:00
Woosuk Kwon
1a7eb7da61
Support beam search & parallel generation (#7) 2023-03-10 09:58:21 -08:00
Woosuk Kwon
04e5acc08e
Fix a bug in 1D input shape (#5) 2023-03-06 10:05:27 -08:00
Woosuk Kwon
3e9f991d6a
Use FlashAttention for multi_query_kv_attention (#4) 2023-03-01 21:13:08 -08:00
Woosuk Kwon
fa16389a2e
Clean up the server script 2023-02-24 11:56:21 +00:00
Woosuk Kwon
afdbe5d373
[WIP] Add server script 2023-02-24 01:33:37 +00:00