Author | Commit | Message | Date
Roger Wang | e6e42e4b17 | [Core][VLM] Support image embeddings as input (#6613) | 2024-08-12 16:16:06 +08:00
Simon Mo | f020a6297e | [Docs] Update readme (#7316) | 2024-08-11 17:13:37 -07:00
tomeras91 | 02b1988b9f | [Doc] building vLLM with VLLM_TARGET_DEVICE=empty (#7403) | 2024-08-11 14:38:17 -07:00
Woosuk Kwon | 90bab18f24 | [TPU] Use mark_dynamic to reduce compilation time (#7340) | 2024-08-10 18:12:22 -07:00
Simon Mo | 5923532e15 | Add Skywork AI as Sponsor (#7314) | 2024-08-08 13:59:57 -07:00
Jee Jee Li | 757ac70a64 | [Model] Rename MiniCPMVQwen2 to MiniCPMV2.6 (#7273) | 2024-08-08 14:02:41 +00:00
Michael Goin | 6d94420246 | [Doc] Update supported_hardware.rst (#7276) | 2024-08-07 14:21:50 -07:00
Stas Bekman | 0e12cd67a8 | [Doc] add online speculative decoding example (#7243) | 2024-08-07 09:58:02 -07:00
Ilya Lavrenov | 80cbe10c59 | [OpenVINO] migrate to latest dependencies versions (#7251) | 2024-08-07 09:49:10 -07:00
Roger Wang | 2385c8f374 | [Doc] Mock new dependencies for documentation (#7245) | 2024-08-07 06:43:03 +00:00
Thomas Parnell | 789937af2e | [Doc] [SpecDecode] Update MLPSpeculator documentation (#7100); Signed-off-by: Thomas Parnell <tpa@zurich.ibm.com> | 2024-08-05 23:29:43 +00:00
Simon Mo | 4db5176d97 | bump version to v0.5.4 (#7139) | 2024-08-05 14:39:48 -07:00
Jee Jee Li | 179a6a36f2 | [Model]Refactor MiniCPMV (#7020); Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> | 2024-08-04 08:12:41 +00:00
Yihuan Bu | 654bc5ca49 | Support for guided decoding for offline LLM (#6878); Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> | 2024-08-04 03:12:09 +00:00
Michael Goin | b482b9a5b1 | [CI/Build] Add support for Python 3.12 (#7035) | 2024-08-02 13:51:22 -07:00
Murali Andoorveedu | fc912e0886 | [Models] Support Qwen model with PP (#6974); Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> | 2024-08-01 12:40:43 -07:00
Jee Jee Li | 7ecee34321 | [Kernel][RFC] Refactor the punica kernel based on Triton (#5036) | 2024-07-31 17:12:24 -07:00
Alphi | 2f4e108f75 | [Bugfix] Clean up MiniCPM-V (#6939); Co-authored-by: hezhihui <hzh7269@modelbest.cn>; Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> | 2024-07-31 14:39:19 +00:00
Cyrus Leung | f230cc2ca6 | [Bugfix] Fix broadcasting logic for multi_modal_kwargs (#6836) | 2024-07-31 10:38:45 +08:00
Ilya Lavrenov | 5895b24677 | [OpenVINO] Updated OpenVINO requirements and build docs (#6948) | 2024-07-30 11:33:01 -07:00
Isotr0py | 7cbd9ec7a9 | [Model] Initialize support for InternVL2 series models (#6514); Co-authored-by: Roger Wang <ywang@roblox.com> | 2024-07-29 10:16:30 +00:00
Woosuk Kwon | fad5576c58 | [TPU] Reduce compilation time & Upgrade PyTorch XLA version (#6856) | 2024-07-27 10:28:33 -07:00
Chenggang Wu | f954d0715c | [Docs] Add RunLLM chat widget (#6857) | 2024-07-27 09:24:46 -07:00
Cyrus Leung | 1ad86acf17 | [Model] Initial support for BLIP-2 (#5920); Co-authored-by: ywang96 <ywang@roblox.com> | 2024-07-27 11:53:07 +00:00
Roger Wang | ecb33a28cb | [CI/Build][Doc] Update CI and Doc for VLM example changes (#6860) | 2024-07-27 09:54:14 +00:00
Harry Mellor | c53041ae3b | [Doc] Add missing mock import to docs conf.py (#6834) | 2024-07-27 04:47:33 +00:00
omrishiv | 3c3012398e | [Doc] add VLLM_TARGET_DEVICE=neuron to documentation for neuron (#6844); Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> | 2024-07-26 20:20:16 -07:00
Woosuk Kwon | ced36cd89b | [ROCm] Upgrade PyTorch nightly version (#6845) | 2024-07-26 20:16:13 -07:00
Zhanghao Wu | 150a1ffbfd | [Doc] Update SkyPilot doc for wrong indents and instructions for update service (#4283) | 2024-07-26 14:39:10 -07:00
Michael Goin | 281977bd6e | [Doc] Add Nemotron to supported model docs (#6843) | 2024-07-26 17:32:44 -04:00
Li, Jiang | 3bbb4936dc | [Hardware] [Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125) | 2024-07-26 13:50:10 -07:00
youkaichao | 85ad7e2d01 | [doc][debugging] add known issues for hangs (#6816) | 2024-07-25 21:48:05 -07:00
Woosuk Kwon | b7215de2c5 | [Docs] Publish 5th meetup slides (#6799) | 2024-07-25 16:47:55 -07:00
youkaichao | f3ff63c3f4 | [doc][distributed] improve multinode serving doc (#6804) | 2024-07-25 15:38:32 -07:00
Kuntai Du | 6a1e25b151 | [Doc] Add documentations for nightly benchmarks (#6412) | 2024-07-25 11:57:16 -07:00
Alphi | 9e169a4c61 | [Model] Adding support for MiniCPM-V (#4087) | 2024-07-24 20:59:30 -07:00
Hongxia Yang | d88c458f44 | [Doc][AMD][ROCm]Added tips to refer to mi300x tuning guide for mi300x users (#6754) | 2024-07-24 14:32:57 -07:00
Woosuk Kwon | ccc4a73257 | [Docs][ROCm] Detailed instructions to build from source (#6680) | 2024-07-24 01:07:23 -07:00
dongmao zhang | 87525fab92 | [bitsandbytes]: support read bnb pre-quantized model (#5753); Co-authored-by: Michael Goin <michael@neuralmagic.com> | 2024-07-23 23:45:09 +00:00
youkaichao | 71950af726 | [doc][distributed] fix doc argument order (#6691) | 2024-07-23 08:55:33 -07:00
Woosuk Kwon | cb1362a889 | [Docs] Announce llama3.1 support (#6688) | 2024-07-23 08:18:15 -07:00
Roger Wang | 22fa2e35cb | [VLM][Model] Support image input for Chameleon (#6633) | 2024-07-22 23:50:48 -07:00
youkaichao | c051bfe4eb | [doc][distributed] doc for setting up multi-node environment (#6529); [doc][distributed] add more doc for setting up multi-node environment (#6529) | 2024-07-22 21:22:09 -07:00
Cyrus Leung | 739b61a348 | [Frontend] Refactor prompt processing (#4028); Co-authored-by: Roger Wang <ywang@roblox.com> | 2024-07-22 10:13:53 -07:00
Matt Wong | 06d6c5fe9f | [Bugfix][CI/Build][Hardware][AMD] Fix AMD tests, add HF cache, update CK FA, add partially supported model notes (#6543) | 2024-07-20 09:39:07 -07:00
Murali Andoorveedu | 45ceb85a0c | [Docs] Update PP docs (#6598) | 2024-07-19 16:38:21 -07:00
Simon Mo | 30efe41532 | [Docs] Update docs for wheel location (#6580) | 2024-07-19 12:14:11 -07:00
milo157 | a38524f338 | [DOC] - Add docker image to Cerebrium Integration (#6510) | 2024-07-17 10:22:53 -07:00
Cyrus Leung | 5bf35a91e4 | [Doc][CI/Build] Update docs and tests to use vllm serve (#6431) | 2024-07-17 07:43:21 +00:00
Hongxia Yang | 10383887e0 | [ROCm] Cleanup Dockerfile and remove outdated patch (#6482) | 2024-07-16 22:47:02 -07:00