| Zhuohan Li | 1f01a18d39 | Merge QKV into one linear layer (#15) | 2023-04-02 00:23:29 -07:00 |
| Woosuk Kwon | 88c0268a18 | Implement custom kernel for LLaMA rotary embedding (#14) | 2023-03-30 11:04:21 -07:00 |
| Woosuk Kwon | 80a2f812f1 | Implement LLaMA (#9); Co-authored-by: Zhuohan Li <zhuohan123@gmail.com> | 2023-03-30 12:25:32 +08:00 |
| Zhuohan Li | 2f49f15585 | Support tensor parallel (#2) | 2023-03-21 13:45:42 -07:00 |
| Woosuk Kwon | 1a7eb7da61 | Support beam search & parallel generation (#7) | 2023-03-10 09:58:21 -08:00 |
| Woosuk Kwon | cbf8779afa | Fix a bug in tying OPT embeddings (#1) | 2023-02-24 16:29:36 -08:00 |
| Woosuk Kwon | de0fabbc5c | Fix sampler | 2023-02-23 20:30:12 +00:00 |
| Woosuk Kwon | 86f9eb6d39 | Fix typo | 2023-02-23 20:19:24 +00:00 |
| Woosuk Kwon | d4bc1a4d24 | Add unoptimized OPT Attention | 2023-02-23 09:31:55 +00:00 |
| Woosuk Kwon | 709a69176e | Move worker/models -> models | 2023-02-22 18:03:48 +00:00 |