20231088/vllm - vllm - Luminance Code Repo

Author	SHA1	Message	Date
tomeras91	d2b1bf55ec	[Frontend][Feature] Add jamba tool parser (#9154)	2024-10-18 10:27:48 +00:00
Wallas Henrique	8baf85e4e9	[Doc] Compatibility matrix for mutual exclusive features (#8512) Signed-off-by: Wallas Santos <wallashss@ibm.com>	2024-10-11 11:18:50 -07:00
Yuan Tang	acce7630c1	Update link to KServe deployment guide (#9173)	2024-10-09 03:58:49 +00:00
TimWang	93cf74a8a7	[Doc]: Add deploying_with_k8s guide (#8451)	2024-10-07 13:31:45 -07:00
Andy Dai	5df1834895	[Bugfix] Fix order of arguments matters in config.yaml (#8960)	2024-10-05 17:35:11 +00:00
代君	3dbb215b38	[Frontend][Feature] support tool calling for internlm/internlm2_5-7b-chat model (#8405)	2024-10-04 10:36:39 +08:00
Maximilien de Bayser	344cd2b6f4	[Feature] Add support for Llama 3.1 and 3.2 tool use (#8343) Signed-off-by: Max de Bayser <mbayser@br.ibm.com>	2024-09-26 17:01:42 -07:00
sroy745	2febcf2777	[Documentation][Spec Decode] Add documentation about lossless guarantees in Speculative Decoding in vLLM (#7962)	2024-09-05 16:25:29 -04:00
Kyle Mistele	e02ce498be	[Feature] OpenAI-Compatible Tools API + Streaming for Hermes & Mistral models (#5649) Co-authored-by: constellate <constellate@1-ai-appserver-staging.codereach.com> Co-authored-by: Kyle Mistele <kyle@constellate.ai>	2024-09-04 13:18:13 -07:00
Kaunil Dhruv	058344f89a	[Frontend]-config-cli-args (#7737) Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Co-authored-by: Kaunil Dhruv <kaunil_dhruv@intuit.com>	2024-08-30 08:21:02 -07:00
Kameshwara Pavan Kumar Mantha	22b39e11f2	llama_index serving integration documentation (#6973) Co-authored-by: pavanmantha <pavan.mantha@thevaslabs.io>	2024-08-14 15:38:37 -07:00
Murali Andoorveedu	fc912e0886	[Models] Support Qwen model with PP (#6974) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai>	2024-08-01 12:40:43 -07:00
Zhanghao Wu	150a1ffbfd	[Doc] Update SkyPilot doc for wrong indents and instructions for update service (#4283)	2024-07-26 14:39:10 -07:00
youkaichao	f3ff63c3f4	[doc][distributed] improve multinode serving doc (#6804)	2024-07-25 15:38:32 -07:00
youkaichao	71950af726	[doc][distributed] fix doc argument order (#6691)	2024-07-23 08:55:33 -07:00
youkaichao	c051bfe4eb	[doc][distributed] doc for setting up multi-node environment (#6529) [doc][distributed] add more doc for setting up multi-node environment (#6529)	2024-07-22 21:22:09 -07:00
Murali Andoorveedu	45ceb85a0c	[Docs] Update PP docs (#6598)	2024-07-19 16:38:21 -07:00
milo157	a38524f338	[DOC] - Add docker image to Cerebrium Integration (#6510)	2024-07-17 10:22:53 -07:00
Cyrus Leung	5bf35a91e4	[Doc][CI/Build] Update docs and tests to use `vllm serve` (#6431)	2024-07-17 07:43:21 +00:00
youkaichao	94b82e8c18	[doc][distributed] add suggestion for distributed inference (#6418)	2024-07-15 09:45:51 -07:00
Ethan Xu	dbfe254eda	[Feature] vLLM CLI (#5090) Co-authored-by: simon-mo <simon.mo@hey.com>	2024-07-14 15:36:43 -07:00
Murali Andoorveedu	673dd4cae9	[Docs] Docs update for Pipeline Parallel (#6222) Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> Co-authored-by: Simon Mo <simon.mo@hey.com>	2024-07-09 16:24:58 -07:00
Roger Wang	8e0817c262	[Bugfix][Doc] Fix Doc Formatting (#6048)	2024-07-01 15:09:11 -07:00
ning.zhang	83bdcb6ac3	add FAQ doc under 'serving' (#5946)	2024-07-01 14:11:36 -07:00
youkaichao	4050d646e5	[doc][misc] remove deprecated api server in doc (#6037)	2024-07-01 12:52:43 -04:00
youkaichao	3fd02bda51	[doc][misc] add note for Kubernetes users (#5916)	2024-06-27 10:07:07 -07:00
youkaichao	294104c3f9	[doc] update usage of env var to avoid conflict (#5873)	2024-06-26 17:57:12 -04:00
youkaichao	c246212952	[doc][faq] add warning to download models for every nodes (#5783)	2024-06-24 15:37:42 +08:00
Rafael Vasquez	e83db9e7e3	[Doc] Update docker references (#5614) Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>	2024-06-19 15:01:45 -07:00
milo157	2bd231a7b7	[Doc] Added cerebrium as Integration option (#5553)	2024-06-18 15:56:59 -07:00
Sanger Steel	6e2527a7cb	[Doc] Update documentation on Tensorizer (#5471)	2024-06-14 11:27:57 -07:00
Nick Hill	99dac099ab	[Core][Doc] Default to multiprocessing for single-node distributed case (#5230) Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>	2024-06-11 11:10:41 -07:00
Roger Wang	7a9cb294ae	[Frontend] Add OpenAI Vision API Support (#5237) Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>	2024-06-07 11:23:32 -07:00
Breno Faria	f775a07e30	[FRONTEND] OpenAI `tools` support named functions (#5032)	2024-06-03 18:25:29 -05:00
Chansung Park	429d89720e	add doc about serving option on dstack (#3074) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-05-30 10:11:07 -07:00
youkaichao	4fbcb0f27e	[Doc][Build] update after removing vllm-nccl (#5103) Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com>	2024-05-29 23:51:18 +00:00
Cyrus Leung	5ae5ed1e60	[Core] Consolidate prompt arguments to LLM engines (#4328) Co-authored-by: Roger Wang <ywang@roblox.com>	2024-05-28 13:29:31 -07:00
Kante Yin	8e7fb5d43a	Support to serve vLLM on Kubernetes with LWS (#4829) Signed-off-by: kerthcet <kerthcet@gmail.com>	2024-05-16 16:37:29 -07:00
Cyrus Leung	4bfa7e7f75	[Doc] Add API reference for offline inference (#4710)	2024-05-13 17:47:42 -07:00
Cyrus Leung	a3c124570a	[Bugfix] Fix CLI arguments in OpenAI server docs (#4709)	2024-05-09 09:53:14 -07:00
youkaichao	2d7bce9cd5	[Doc] add env vars to the doc (#4572)	2024-05-03 05:13:49 +00:00
Frαnçois	e491c7e053	[Doc] update(example model): for OpenAI compatible serving (#4503)	2024-05-01 10:14:16 -07:00
youkaichao	2768884ac4	[Doc] Add note for docker user (#4340) Co-authored-by: Simon Mo <simon.mo@hey.com>	2024-04-24 21:09:44 +00:00
Zhanghao Wu	ceaf4ed003	[Doc] Update the SkyPilot doc with serving and Llama-3 (#4276)	2024-04-22 15:34:31 -07:00
Frαnçois	92cd2e2f21	[Doc] Fix getting stared to use publicly available model (#3963)	2024-04-10 18:05:52 +00:00
yhu422	d8658c8cc1	Usage Stats Collection (#2852)	2024-03-28 22:16:12 -07:00
Simon Mo	ef65dcfa6f	[Doc] Add docs about OpenAI compatible server (#3288)	2024-03-18 22:05:34 -07:00
Sherlock Xu	b0925b3878	docs: Add BentoML deployment doc (#3336) Signed-off-by: Sherlock113 <sherlockxu07@gmail.com>	2024-03-12 10:34:30 -07:00
Yuan Tang	49d849b3ab	docs: Add tutorial on deploying vLLM model with KServe (#2586) Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2024-03-01 11:04:14 -08:00
Massimiliano Pronesti	5ed704ec8c	docs: fix langchain (#2736)	2024-02-03 18:17:55 -08:00

63 Commits