Michael Goin
|
5f6d10c14c
|
[CI/Build] Enforce style for C++ and CUDA code with clang-format (#4722)
|
2024-05-22 07:18:41 +00:00 |
|
Sanger Steel
|
8bc68e198c
|
[Frontend] [Core] perf: Automatically detect vLLM-tensorized model, update tensorizer to version 2.9.0 (#4208)
|
2024-05-13 14:57:07 -07:00 |
|
Chang Su
|
e254497b66
|
[Model][Misc] Add e5-mistral-7b-instruct and Embedding API (#3734)
|
2024-05-11 11:30:37 -07:00 |
|
Alpay Ariyak
|
715c2d854d
|
[Frontend] [Core] Tensorizer: support dynamic num_readers , update version (#4467)
|
2024-04-30 16:32:13 -07:00 |
|
Cyrus Leung
|
8947bc3c15
|
[Frontend][Bugfix] Disallow extra fields in OpenAI API (#4355)
|
2024-04-27 05:08:24 +00:00 |
|
Isotr0py
|
fbf152d976
|
[Bugfix][Model] Refactor OLMo model to support new HF format in transformers 4.40.0 (#4324)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2024-04-25 09:35:56 -07:00 |
|
Sanger Steel
|
711a000255
|
[Frontend] [Core] feat: Add model loading using tensorizer (#3476)
|
2024-04-13 17:13:01 -07:00 |
|
SangBin Cho
|
09473ee41c
|
[mypy] Add mypy type annotation part 1 (#4006)
|
2024-04-12 14:35:50 -07:00 |
|
SangBin Cho
|
26422e477b
|
[Test] Make model tests run again and remove --forked from pytest (#3631)
Co-authored-by: Simon Mo <simon.mo@hey.com>
|
2024-03-28 21:06:40 -07:00 |
|
xwjiang2010
|
64172a976c
|
[Feature] Add vision language model support. (#3042)
|
2024-03-25 14:16:30 -07:00 |
|
SangBin Cho
|
01bfb22b41
|
[CI] Try introducing isort. (#3495)
|
2024-03-25 07:59:47 -07:00 |
|
Simon Mo
|
93348d9458
|
[CI] Shard tests for LoRA and Kernels to speed up (#3445)
|
2024-03-17 14:56:30 -07:00 |
|
Terry
|
0bba88df03
|
Enhance lora tests with more layer and rank variations (#3243)
|
2024-03-09 17:14:16 -08:00 |
|
Chen Wang
|
cbf4c05b15
|
Update requirements-dev.txt to include package for benchmarking scripts. (#3181)
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
|
2024-03-07 08:39:28 +00:00 |
|
Robert Shaw
|
c0c2335ce0
|
Integrate Marlin Kernels for Int4 GPTQ inference (#2497)
Co-authored-by: Robert Shaw <114415538+rib-2@users.noreply.github.com>
Co-authored-by: alexm <alexm@neuralmagic.com>
|
2024-03-01 12:47:51 -08:00 |
|
Woosuk Kwon
|
6f32cddf1c
|
Remove Flash Attention in test env (#2982)
|
2024-02-22 09:58:29 -08:00 |
|
Massimiliano Pronesti
|
93dc5a2870
|
chore(vllm): codespell for spell checking (#2820)
|
2024-02-21 18:56:01 -08:00 |
|
FlorianJoncour
|
14cc317ba4
|
OpenAI Server refactoring (#2360)
|
2024-01-16 21:33:14 -08:00 |
|
Simon Mo
|
6e01e8c1c8
|
[CI] Add Buildkite (#2355)
|
2024-01-14 12:37:58 -08:00 |
|
mezuzza
|
6774bd50b0
|
Fix typing in AsyncLLMEngine & add toml to requirements-dev (#2100)
|
2023-12-14 00:19:41 -08:00 |
|
Simon Mo
|
5ffc0d13a2
|
Migrate linter from pylint to ruff (#1665)
|
2023-11-20 11:58:01 -08:00 |
|
Stephen Krider
|
9cabcb7645
|
Add Dockerfile (#1350)
|
2023-10-31 12:36:47 -07:00 |
|
Antoni Baum
|
ff36139ffc
|
Remove AsyncLLMEngine busy loop, shield background task (#1059)
|
2023-09-17 00:29:08 -07:00 |
|
Woosuk Kwon
|
32b6816e55
|
Add tests for models (#922)
|
2023-09-01 11:19:43 +09:00 |
|
Zhuohan Li
|
d6fa1be3a8
|
[Quality] Add code formatter and linter (#326)
|
2023-07-03 11:31:55 -07:00 |
|
Woosuk Kwon
|
a283ec2eec
|
Add contributing guideline and mypy config (#122)
|
2023-05-23 17:58:51 -07:00 |
|