
.. _deploying_with_docker:

Deploying with Docker
============================

You can build and run vLLM from source using the provided Dockerfile. To build the image, run the following from the root of the vLLM repository:

.. code-block:: console

    $ DOCKER_BUILDKIT=1 docker build . --target vllm --tag vllm --build-arg max_jobs=8

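To run vLLM, replace ``<secret>`` with your Hugging Face access token and ``<args...>`` with the arguments to pass to vLLM: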

.. code-block:: console

    $ docker run --runtime nvidia --gpus all \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        -p 8000:8000 \
        --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
        vllm <args...>
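
As an illustration of filling in ``<args...>``, the following is a minimal sketch of a shell helper that wraps the ``docker run`` command above. The function name ``run_vllm`` and the ``IMAGE``, ``PORT``, and ``DRY_RUN`` variables are hypothetical names introduced here, not part of vLLM. The sketch passes ``HUGGING_FACE_HUB_TOKEN`` through from the host environment (``--env`` with no value) rather than writing the token on the command line, and it assumes the image's entrypoint accepts vLLM's ``--model`` argument.

.. code-block:: shell

    # Hypothetical helper (not part of vLLM): run the image built above with
    # the given arguments appended after the image name.
    # IMAGE, PORT, and DRY_RUN are illustrative names chosen for this sketch.
    run_vllm() {
        image="${IMAGE:-vllm}"   # tag used in the build step above
        port="${PORT:-8000}"     # host port mapped to container port 8000

        # Prepend the docker options from the documented command to the
        # caller's arguments. "--env NAME" with no value forwards NAME from
        # the host environment, so the token never appears in the command.
        set -- docker run --runtime nvidia --gpus all \
            -v "${HOME}/.cache/huggingface:/root/.cache/huggingface" \
            -p "${port}:8000" \
            --env HUGGING_FACE_HUB_TOKEN \
            "$image" "$@"

        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the assembled command instead of executing it.
            echo "$*"
        else
            "$@"
        fi
    }

With the function defined in the current shell, a dry run prints the command that would be executed. The model name ``facebook/opt-125m`` is only an example:

.. code-block:: console

    $ export HUGGING_FACE_HUB_TOKEN=<secret>
    $ DRY_RUN=1 run_vllm --model facebook/opt-125m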