(deployment-llmaz)=

# llmaz

[llmaz](https://github.com/InftyAI/llmaz) is an easy-to-use, advanced inference platform for large language models on Kubernetes, designed for production use. It uses vLLM as the default model serving backend.

Please refer to the [Quick Start](https://github.com/InftyAI/llmaz?tab=readme-ov-file#quick-start) for more details.
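
Because vLLM is the default backend, a model served through llmaz can typically be queried over vLLM's OpenAI-compatible HTTP API. The following is a minimal sketch, not taken from the llmaz documentation: it assumes the inference service has already been made reachable at a local address (for example through a port-forward), and the base URL `http://localhost:8080` and model name `opt-125m` are hypothetical placeholders.

```python
import json
import urllib.request


def build_completion_request(base_url: str, model: str, prompt: str,
                             max_tokens: int = 32) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint.

    base_url and model are placeholders (assumptions); use the address and
    model name of your own deployment.
    """
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url: str, model: str, prompt: str) -> str:
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the service reachable, `complete("http://localhost:8080", "opt-125m", "San Francisco is a")` would return the generated continuation. How the service is exposed depends on your cluster setup, so treat the address as an assumption and consult the Quick Start for the actual steps.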

Add llmaz as another integration (#13643). Signed-off-by: kerthcet <kerthcet@gmail.com>, 2025-02-21 11:52:40 +08:00