Michael Goin
|
5f6d10c14c
|
[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)
|
2024-05-22 07:18:41 +00:00 |
|
mawong-amd
|
b6d103542c
|
[Kernel] Layernorm performance optimization (#3662)
|
2024-03-30 14:26:38 -07:00 |
|
kliuae
|
c9415c19d3
|
[ROCm] Fix warp and lane calculation in blockReduceSum (#3321)
|
2024-03-11 13:14:07 -07:00 |
|
Douglas Lehr
|
e4a28e5316
|
[ROCM] Fix blockReduceSum to use correct warp counts for ROCm and CUDA (#3262)
|
2024-03-10 15:27:45 -07:00 |
|
TJian
|
6ccc0bfffb
|
Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)
Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
Co-authored-by: Amir Balwel <amoooori04@gmail.com>
Co-authored-by: root <kuanfu.liu@akirakan.com>
Co-authored-by: tjtanaa <tunjian.tan@embeddedllm.com>
Co-authored-by: kuanfu <kuanfu.liu@embeddedllm.com>
Co-authored-by: miloice <17350011+kliuae@users.noreply.github.com>
|
2023-12-07 23:16:52 -08:00 |
|
Woosuk Kwon
|
0b98ba15c7
|
Change the name to vLLM (#150)
|
2023-06-17 03:07:40 -07:00 |
|
Woosuk Kwon
|
667ba3995c
|
Add copyright headers to source files adapted from FT (#104)
|
2023-05-14 22:19:19 -07:00 |
|
Woosuk Kwon
|
436e523bf1
|
Refactor attention kernels (#53)
|
2023-05-03 13:40:13 -07:00 |
|