6 Commits

Author SHA1 Message Date
Siyuan (Ryans) Zhuang
e3cec88aa5
Memcpy kernel for flash attention (#29) 2023-04-10 18:22:49 -07:00
* optimize
* add benchmark
* add assert
* add test
Woosuk Kwon
0f40557af6
Implement block copy kernel to optimize beam search (#32) 2023-04-07 17:45:07 -07:00
Woosuk Kwon
1a7eb7da61
Support beam search & parallel generation (#7) 2023-03-10 09:58:21 -08:00
Woosuk Kwon
c413c41cda
Add reshape_and_cache op 2023-02-18 19:22:57 +00:00
Woosuk Kwon
6d2f74efb3
Remove redundant fn 2023-02-16 09:24:42 +00:00
Woosuk Kwon
6f058c7ba8
Implement cache ops 2023-02-16 07:47:03 +00:00