4 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Woosuk Kwon | c9d5b6d4a8 | Replace FlashAttention with xformers (#70) | 2023-05-05 02:01:08 -07:00 |
| Woosuk Kwon | a96d63c21d | Add support for GPT-NeoX (Pythia) (#50) | 2023-04-28 00:32:10 -07:00 |
| Woosuk Kwon | 897cb2ae28 | Optimize data movement (#20) | 2023-04-02 00:30:17 -07:00 |
| Woosuk Kwon | 88c0268a18 | Implement custom kernel for LLaMA rotary embedding (#14) | 2023-03-30 11:04:21 -07:00 |