20231088/vllm - vllm - Luminance Code Repo

Author	SHA1	Message	Date
youkaichao	241ad7b301	[ci] Fix sampler tests (#11922) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-10 20:45:33 +08:00
Harry Mellor	d85c47d6ad	Replace "online inference" with "online serving" (#11923) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-01-10 12:05:56 +00:00
wangxiyuan	ef725feafc	[platform] support pytorch custom op pluggable (#11328) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>	2025-01-10 10:02:38 +00:00
cennn	d907be7dc7	[misc] remove python function call for custom activation op (#11885) Co-authored-by: youkaichao <youkaichao@gmail.com>	2025-01-10 17:18:25 +08:00
youkaichao	d53575a5f0	[ci] fix gh200 tests (#11919) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-10 16:25:17 +08:00
Kunshang Ji	61af633256	[BUGFIX] Fix `UnspecifiedPlatform` package name (#11916) Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>	2025-01-10 16:20:46 +08:00
Joe Runde	ac2f3f7fee	[Bugfix] Validate lora adapters to avoid crashing server (#11727) Signed-off-by: Joe Runde <Joseph.Runde@ibm.com> Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>	2025-01-10 15:56:36 +08:00
Chen Zhang	cf5f000d21	[torch.compile] Hide KV cache behind torch.compile boundary (#11677) Signed-off-by: Chen Zhang <zhangch99@outlook.com>	2025-01-10 13:14:42 +08:00
Cyrus Leung	3de2b1eafb	[Doc] Show default pooling method in a table (#11904) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-10 11:25:20 +08:00
Cyrus Leung	b844b99ad3	[VLM] Enable tokenized inputs for merged multi-modal processor (#11900) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-10 03:24:00 +00:00
Cyrus Leung	c3cf54dda4	[Doc][5/N] Move Community and API Reference to the bottom (#11896) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: Simon Mo <simon.mo@hey.com>	2025-01-10 03:10:12 +00:00
Charles Frye	36f5303578	[Docs] Add Modal to deployment frameworks (#11907)	2025-01-09 23:26:37 +00:00
Cyrus Leung	9a228348d2	[Misc] Provide correct Pixtral-HF chat template (#11891) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-09 10:19:37 -07:00
youkaichao	bd82872211	[ci]try to fix flaky multi-step tests (#11894) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-09 14:47:29 +00:00
wangxiyuan	405eb8e396	[platform] Allow platform specify attention backend (#11609) Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com> Signed-off-by: Mengqing Cao <cmq0113@163.com> Co-authored-by: Mengqing Cao <cmq0113@163.com>	2025-01-09 21:46:50 +08:00
Cyrus Leung	65097ca0af	[Doc] Add model development API Reference (#11884) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-09 09:43:40 +00:00
Ye (Charlotte) Qi	1d967acb45	[Bugfix] fix beam search input errors and latency benchmark script (#11875) Signed-off-by: Ye Qi <yeq@meta.com> Co-authored-by: yeq <yeq@devgpu004.lla3.facebook.com>	2025-01-09 17:36:39 +08:00
Cyrus Leung	0bd1ff4346	[Bugfix] Override dunder methods of placeholder modules (#11882) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-09 09:02:53 +00:00
youkaichao	310aca88c9	[perf]fix current stream (#11870) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-09 07:18:21 +00:00
Guspan Tanadi	a732900efc	[Doc] Intended links Python multiprocessing library (#11878)	2025-01-09 05:39:39 +00:00
Cyrus Leung	d848800e88	[Misc] Move `print_*_once` from utils to logger (#11298) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com> Co-authored-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>	2025-01-09 12:48:12 +08:00
Michael Goin	730e9592e9	[Doc] Recommend uv and python 3.12 for quickstart guide (#11849) Signed-off-by: mgoin <michael@neuralmagic.com>	2025-01-09 11:37:48 +08:00
Maximilien de Bayser	1fe554bac3	treat do_lower_case in the same way as the sentence-transformers library (#11815) Signed-off-by: Max de Bayser <mbayser@br.ibm.com>	2025-01-09 11:05:43 +08:00
Tyler Michael Smith	615e4a5401	[CI] Turn on basic correctness tests for V1 (#10864)	2025-01-08 21:20:44 -05:00
Simon Mo	3db0cafdf1	[Docs] Add Google Cloud Meetup (#11864)	2025-01-08 12:38:28 -08:00
rasmith	526de822d5	[Kernel][Triton][AMD] Use block size heuristic for avg 2.8x speedup for int8 models (#11698) Signed-off-by: Randall Smith <Randall.Smith@amd.com>	2025-01-08 20:23:15 +00:00
Robert Shaw	56fe4c297c	[TPU][Quantization] TPU `W8A8` (#11785) Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>	2025-01-08 19:33:29 +00:00
WangErXiao	47de8821d3	[Misc]add some explanations for BlockHashType (#11847)	2025-01-08 18:21:30 +00:00
Cyrus Leung	5984499e47	[Doc] Expand Multimodal API Reference (#11852) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-08 17:14:14 +00:00
Cyrus Leung	ca47e176af	[Misc] Move some model utils into vision file (#11848) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-08 17:04:46 +00:00
Yan Ma	78f4590b60	[Bugfix][XPU] fix silu_and_mul (#11823) Signed-off-by: yan ma <yan.ma@intel.com>	2025-01-09 00:11:50 +08:00
Li, Jiang	2f7024987e	[CI/Build][Bugfix] Fix CPU CI image clean up (#11836) Signed-off-by: jiang1.li <jiang1.li@intel.com>	2025-01-08 15:18:28 +00:00
Cyrus Leung	6cd40a5bfe	[Doc][4/N] Reorganize API Reference (#11843) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-08 21:34:44 +08:00
Harry Mellor	aba8d6ee00	[Doc] Move examples into categories (#11840) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-01-08 13:09:53 +00:00
Cyrus Leung	2a0596bc48	[VLM] Reorganize profiling/processing-related code (#11812) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-08 18:59:58 +08:00
youkaichao	f12141170a	[torch.compile] consider relevant code in compilation cache (#11614) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-08 10:46:43 +00:00
Wallas Henrique	cfd3219f58	[Hardware][Apple] Native support for macOS Apple Silicon (#11696) Signed-off-by: Wallas Santos <wallashss@ibm.com> Co-authored-by: Michael Goin <michael@neuralmagic.com>	2025-01-08 16:35:49 +08:00
Simon Mo	a1b2b8606e	[Docs] Update sponsor name: 'Novita' to 'Novita AI' (#11833)	2025-01-07 23:05:46 -08:00
youkaichao	ad9f1aa679	[doc] update wheels url (#11830) Signed-off-by: youkaichao <youkaichao@gmail.com>	2025-01-08 14:36:49 +08:00
youkaichao	889e662eae	[misc] improve memory profiling (#11809) Signed-off-by: youkaichao <youkaichao@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>	2025-01-08 06:36:03 +00:00
Cyrus Leung	ef68eb28d8	[Bug] Fix pickling of `ModelConfig` when RunAI Model Streamer is used (#11825) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-08 13:40:09 +08:00
Simon Mo	259abd8953	[Docs] reorganize sponsorship page (#11639) Signed-off-by: simon-mo <simon.mo@hey.com>	2025-01-07 21:16:08 -08:00
Jee Jee Li	f645eb6954	[Bugfix] Add checks for LoRA and CPU offload (#11810) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>	2025-01-08 13:08:48 +08:00
Ilya Lavrenov	f4923cb8bc	[OpenVINO] Fixed Docker.openvino build (#11732) Signed-off-by: Ilya Lavrenov <ilya.lavrenov@intel.com>	2025-01-08 13:08:30 +08:00
Nishidha	b640b19cc0	Fixed docker build for ppc64le (#11518) Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>	2025-01-08 13:05:37 +08:00
WangErXiao	dc71af0a71	Remove the duplicate imports of MultiModalKwargs and PlaceholderRange… (#11824)	2025-01-08 04:09:25 +00:00
Divakar Verma	4d29e91be8	[Misc] sort torch profiler table by kernel timing (#11813)	2025-01-08 10:57:04 +08:00
Cyrus Leung	91445c7bc8	[Bugfix] Fix image input for Pixtral-HF (#11741) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>	2025-01-08 10:17:16 +08:00
Harry Mellor	5950f555a1	[Doc] Group examples into categories (#11782) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>	2025-01-08 09:20:12 +08:00
Jie Fu (傅杰)	a4e2b26856	[Bugfix] Significant performance drop on CPUs with --num-scheduler-steps > 1 (#11794)	2025-01-07 16:15:50 -08:00

4193 Commits