5 Commits

Author | SHA1 | Message | Date
Zhuohan Li | ba0bfd40e2 | TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) | 2023-10-02 15:36:09 -07:00
Woosuk Kwon | fbd80ad409 | Clean up kernel unit tests (#938) | 2023-09-05 16:57:38 -07:00
Woosuk Kwon | d64bf1646c | Implement approximate GELU kernels (#828) | 2023-08-23 07:43:21 +09:00
Woosuk Kwon | 0b98ba15c7 | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00
Woosuk Kwon | 825d8892b5 | Use pytest format for unit tests (#107) | 2023-05-17 17:11:23 -07:00