Lucas Wilkinson | ab5bbf5ae3 | [Bugfix][Kernel] Fix CUDA 11.8 being broken by FA3 build (#12375) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> | 2025-01-24 15:27:59 +00:00 |
Lucas Wilkinson | 978b45f399 | [Kernel] Flash Attention 3 Support (#12093) Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com> | 2025-01-23 06:45:48 -08:00 |
Woosuk Kwon | 73001445fb | [V1] Implement Cascade Attention (#11635) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu> | 2025-01-01 21:56:46 +09:00 |