
(deployment-open-webui)=
# Open WebUI

- Install [Docker](https://docs.docker.com/engine/install/).
- Start the vLLM server with a supported chat completion model, e.g.:

  ```console
  vllm serve qwen/Qwen1.5-0.5B-Chat
  ```
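
  Optionally, confirm the server is up before connecting Open WebUI. A minimal check, assuming vLLM listens on its default port 8000 on localhost:

  ```console
  # List the models exposed by the vLLM OpenAI-compatible server
  curl http://localhost:8000/v1/models
  ```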
- Start the [Open WebUI](https://github.com/open-webui/open-webui) Docker container (replace the vLLM serve host and port with your own):

  ```console
  docker run -d -p 3000:8080 \
  --name open-webui \
  -v open-webui:/app/backend/data \
  -e OPENAI_API_BASE_URL=http://<vllm serve host>:<vllm serve port>/v1 \
  --restart always \
  ghcr.io/open-webui/open-webui:main
  ```
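
  If vLLM runs directly on the Docker host, `localhost` inside the container will not reach it. A sketch of one workaround, assuming Docker's `host-gateway` support and vLLM's default port 8000:

  ```console
  # Map host.docker.internal to the host so the container can reach vLLM
  docker run -d -p 3000:8080 \
  --name open-webui \
  -v open-webui:/app/backend/data \
  --add-host=host.docker.internal:host-gateway \
  -e OPENAI_API_BASE_URL=http://host.docker.internal:8000/v1 \
  --restart always \
  ghcr.io/open-webui/open-webui:main
  ```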
- Open it in the browser: <http://open-webui-host:3000/>

  At the top of the web page, you can see the model qwen/Qwen1.5-0.5B-Chat.
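
  You can also verify end to end that the model responds by querying vLLM's OpenAI-compatible chat completions endpoint directly (a sketch, assuming the default port 8000):

  ```console
  curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "qwen/Qwen1.5-0.5B-Chat",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
  ```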
:::{image} /assets/deployment/open_webui.png
:::