Michael Goin
|
5f6d10c14c
|
[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)
|
2024-05-22 07:18:41 +00:00 |
|
Woosuk Kwon
|
602358f8a8
|
Add kernel for GeGLU with approximate GELU (#3337)
|
2024-03-12 22:06:17 -07:00 |
|
Woosuk Kwon
|
fd5dcc5c81
|
Optimize GeGLU layer in Gemma (#2975)
|
2024-02-21 20:17:52 -08:00 |
|
Jee Li
|
77af974b40
|
[FIX] Support non-zero CUDA devices in custom kernels (#1959)
|
2024-01-02 19:09:59 -08:00 |
|
TJian
|
6ccc0bfffb
|
Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Amir Balwel <amoooori04@gmail.com>
Co-authored-by: root <kuanfu.liu@akirakan.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: kuanfu <kuanfu.liu@embeddedllm.com>
Co-authored-by: miloice <17350011+kliuae@users.noreply.github.com>
|
2023-12-07 23:16:52 -08:00 |
|
Antoni Baum
|
9f669a9a7c
|
Support YaRN models (#1264)
Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Viktor Ferenczi <viktor@ferenczi.eu>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2023-11-03 14:12:48 -07:00 |
|
Woosuk Kwon
|
c1376e0f82
|
Change scheduler & input tensor shape (#1381)
|
2023-10-16 17:48:42 -07:00 |
|
Woosuk Kwon
|
8ce9c50d40
|
Avoid compiling kernels for double data type (#933)
|
2023-09-02 14:59:47 +09:00 |
|
Woosuk Kwon
|
d64bf1646c
|
Implement approximate GELU kernels (#828)
|
2023-08-23 07:43:21 +09:00 |
|
Woosuk Kwon
|
0b98ba15c7
|
Change the name to vLLM (#150)
|
2023-06-17 03:07:40 -07:00 |
|
Woosuk Kwon
|
e070829ae8
|
Support bfloat16 data type (#54)
|
2023-05-03 14:09:44 -07:00 |
|
Woosuk Kwon
|
897cb2ae28
|
Optimize data movement (#20)
|
2023-04-02 00:30:17 -07:00 |
|