| Author | Commit | Message | Date |
|---|---|---|---|
| Siyuan (Ryans) Zhuang | e3cec88aa5 | Memcpy kernel for flash attention (#29): optimize; add benchmark; add assert; add test | 2023-04-10 18:22:49 -07:00 |
| Woosuk Kwon | 0f40557af6 | Implement block copy kernel to optimize beam search (#32) | 2023-04-07 17:45:07 -07:00 |
| Woosuk Kwon | 1a7eb7da61 | Support beam search & parallel generation (#7) | 2023-03-10 09:58:21 -08:00 |
| Woosuk Kwon | c413c41cda | Add reshape_and_cache op | 2023-02-18 19:22:57 +00:00 |
| Woosuk Kwon | 6d2f74efb3 | Remove redundant fn | 2023-02-16 09:24:42 +00:00 |
| Woosuk Kwon | 6f058c7ba8 | Implement cache ops | 2023-02-16 07:47:03 +00:00 |