| Woosuk Kwon | 0f40557af6 | Implement block copy kernel to optimize beam search (#32) | 2023-04-07 17:45:07 -07:00 |
| Zhuohan Li | 2f49f15585 | Support tensor parallel (#2) | 2023-03-21 13:45:42 -07:00 |
| Woosuk Kwon | 1a7eb7da61 | Support beam search & parallel generation (#7) | 2023-03-10 09:58:21 -08:00 |
| Woosuk Kwon | 0deacbce6e | Implement single_query_cached_kv_attention kernel (#3) | 2023-03-01 15:02:19 -08:00 |
| Woosuk Kwon | 6f058c7ba8 | Implement cache ops | 2023-02-16 07:47:03 +00:00 |
| Woosuk Kwon | a1c67e6db8 | Minor | 2023-02-16 01:42:53 +00:00 |
| Woosuk Kwon | 9e68a6827e | Fix return type error | 2023-02-16 01:33:03 +00:00 |
| Woosuk Kwon | 8edcabc737 | Add warning | 2023-02-16 01:28:17 +00:00 |
| Woosuk Kwon | 2f4887de77 | Fix KVCache shape | 2023-02-16 01:24:45 +00:00 |
| Woosuk Kwon | bb59a3e730 | Fix cache engine | 2023-02-13 09:35:48 +00:00 |
| Woosuk Kwon | e7bee2aa81 | Add cache engine | 2023-02-09 11:28:02 +00:00 |