vllm/docs/source/deployment/integrations/kubeai.md

(deployment-kubeai)=

# KubeAI

[KubeAI](https://github.com/substratusai/kubeai) is a Kubernetes operator that enables you to deploy and manage AI models on Kubernetes. It provides a simple and scalable way to deploy vLLM in production. Functionality such as scale-from-zero, load based autoscaling, model caching, and much more is provided out of the box with zero external dependencies.

Please see the Installation Guides for environment specific instructions:

- [Any Kubernetes Cluster](https://www.kubeai.org/installation/any/)
- [EKS](https://www.kubeai.org/installation/eks/)
- [GKE](https://www.kubeai.org/installation/gke/)

Once you have KubeAI installed, you can
[configure text generation models](https://www.kubeai.org/how-to/configure-text-generation-models/)
using vLLM.
[Doc][3/N] Reorganize Serving section (#11766) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> 2025-01-07 11:20:01 +08:00			`(deployment-kubeai)=`
[Docs] Convert rST to MyST (Markdown) (#11145) Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com> 2024-12-23 17:35:38 -05:00
[Doc][3/N] Reorganize Serving section (#11766) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> 2025-01-07 11:20:01 +08:00			`# KubeAI`
[Docs] Convert rST to MyST (Markdown) (#11145) Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com> 2024-12-23 17:35:38 -05:00
			`[KubeAI](https://github.com/substratusai/kubeai) is a Kubernetes operator that enables you to deploy and manage AI models on Kubernetes. It provides a simple and scalable way to deploy vLLM in production. Functionality such as scale-from-zero, load based autoscaling, model caching, and much more is provided out of the box with zero external dependencies.`

			`Please see the Installation Guides for environment specific instructions:`

			`- [Any Kubernetes Cluster](https://www.kubeai.org/installation/any/)`
			`- [EKS](https://www.kubeai.org/installation/eks/)`
			`- [GKE](https://www.kubeai.org/installation/gke/)`

			`Once you have KubeAI installed, you can`
			`[configure text generation models](https://www.kubeai.org/how-to/configure-text-generation-models/)`
			`using vLLM.`