.. _deploying_with_lws:

Deploying with LWS
============================

LeaderWorkerSet (LWS) is a Kubernetes API that aims to address common deployment patterns of AI/ML inference workloads.
A major use case is multi-host/multi-node distributed inference, where a single model replica spans several nodes.

vLLM can be deployed with `LWS <https://github.com/kubernetes-sigs/lws>`_ on Kubernetes for distributed model serving.

Please see `this guide <https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm>`_ for more details on
deploying vLLM on Kubernetes using LWS.
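
As a rough illustration of the shape of such a deployment, the manifest below sketches a ``LeaderWorkerSet`` with one leader pod and one worker pod per group.
It is not copied from the linked guide: the resource name, replica and group sizes, GPU counts, port, and the placeholder commands are assumptions chosen for illustration, and the actual start commands for the leader and worker should be taken from the guide.

.. code-block:: yaml

    apiVersion: leaderworkerset.x-k8s.io/v1
    kind: LeaderWorkerSet
    metadata:
      name: vllm                      # hypothetical name
    spec:
      replicas: 1                     # number of leader/worker groups (assumed)
      leaderWorkerTemplate:
        size: 2                       # pods per group: 1 leader + 1 worker (assumed)
        leaderTemplate:
          spec:
            containers:
              - name: vllm-leader
                image: vllm/vllm-openai:latest
                # Placeholder only: replace with the leader start command
                # from the linked guide.
                command: ["sh", "-c", "echo 'replace with leader start command'"]
                resources:
                  limits:
                    nvidia.com/gpu: "8"   # assumed GPU count per pod
                ports:
                  - containerPort: 8080   # assumed serving port
        workerTemplate:
          spec:
            containers:
              - name: vllm-worker
                image: vllm/vllm-openai:latest
                # Placeholder only: replace with the worker start command
                # from the linked guide.
                command: ["sh", "-c", "echo 'replace with worker start command'"]
                resources:
                  limits:
                    nvidia.com/gpu: "8"   # assumed GPU count per pod

A manifest like this can only be applied with ``kubectl apply -f`` once the LWS controller and its custom resource definitions are installed in the cluster.
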

.. Commit: "Support to serve vLLM on Kubernetes with LWS" (#4829),
   Signed-off-by: kerthcet <kerthcet@gmail.com>, 2024-05-17 07:37:29 +08:00
