3 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Woosuk Kwon | 12659a0bd7 | Add CUDA graph-based all reduce launcher (#26) | 2023-04-05 11:16:57 -07:00 |
| Zhuohan Li | c45f3c3ab6 | Optimize tensor parallel execution speed (#17) | 2023-04-01 00:51:08 +08:00 |
| Zhuohan Li | 2f49f15585 | Support tensor parallel (#2) | 2023-03-21 13:45:42 -07:00 |