[CI/Build] improve python-only dev setup (#9621)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>

This commit is contained in:
parent 82eb5ea8f3
commit e4c34c23de
docs/source/getting_started/installation.rst
@@ -21,7 +21,7 @@ You can install vLLM using pip:
 .. code-block:: console

     $ # (Recommended) Create a new conda environment.
-    $ conda create -n myenv python=3.10 -y
+    $ conda create -n myenv python=3.12 -y
     $ conda activate myenv

     $ # Install vLLM with CUDA 12.1.
@@ -89,45 +89,24 @@ Build from source
 Python-only build (without compilation)
 ---------------------------------------

-If you only need to change Python code, you can simply build vLLM without compilation.
-
-The first step is to install the latest vLLM wheel:
-
-.. code-block:: console
-
-    pip install https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
-
-You can find more information about vLLM's wheels `above <#install-the-latest-code>`_.
-
-After verifying that the installation is successful, you can use `the following script <https://github.com/vllm-project/vllm/blob/main/python_only_dev.py>`_:
+If you only need to change Python code, you can build and install vLLM without compilation. Using `pip's ``--editable`` flag <https://pip.pypa.io/en/stable/topics/local-project-installs/#editable-installs>`_, changes you make to the code will be reflected when you run vLLM:

 .. code-block:: console

     $ git clone https://github.com/vllm-project/vllm.git
     $ cd vllm
-    $ python python_only_dev.py
+    $ VLLM_USE_PRECOMPILED=1 pip install --editable .

-The script will:
+This will download the latest nightly wheel and use the compiled libraries from there in the install.

-* Find the installed vLLM package in the current environment.
-* Copy built files to the current directory.
-* Rename the installed vLLM package.
-* Symbolically link the current directory to the installed vLLM package.
-
-Now, you can edit the Python code in the current directory, and the changes will be reflected when you run vLLM.
-
-Once you have finished editing or want to install another vLLM wheel, you should exit the development environment using `the same script <https://github.com/vllm-project/vllm/blob/main/python_only_dev.py>`_ with the ``--quit-dev`` (or ``-q`` for short) flag:
+The ``VLLM_PRECOMPILED_WHEEL_LOCATION`` environment variable can be used instead of ``VLLM_USE_PRECOMPILED`` to specify a custom path or URL to the wheel file. For example, to use the `0.6.3.post1 PyPI wheel <https://pypi.org/project/vllm/#files>`_:

 .. code-block:: console

-    $ python python_only_dev.py --quit-dev
+    $ export VLLM_PRECOMPILED_WHEEL_LOCATION=https://files.pythonhosted.org/packages/4a/4c/ee65ba33467a4c0de350ce29fbae39b9d0e7fcd887cc756fa993654d1228/vllm-0.6.3.post1-cp38-abi3-manylinux1_x86_64.whl
+    $ pip install --editable .

-The ``--quit-dev`` flag will:
-
-* Remove the symbolic link from the current directory to the vLLM package.
-* Restore the original vLLM package from the backup.
-
-If you update the vLLM wheel and rebuild from the source to make further edits, you will need to repeat the `Python-only build <#python-only-build>`_ steps again.
+You can find more information about vLLM's wheels `above <#install-the-latest-code>`_.

 .. note::
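Editor's note: the precompiled-wheel behaviour documented in this hunk reduces to a simple lookup, use ``VLLM_PRECOMPILED_WHEEL_LOCATION`` when it is set, otherwise fall back to the nightly wheel. A minimal sketch of that resolution for illustration (not the setup.py implementation itself):

.. code-block:: python

    import os

    # Sketch of the documented resolution: prefer the custom location,
    # otherwise fall back to the nightly wheel named in the docs above.
    NIGHTLY_WHEEL = ("https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/"
                     "vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl")

    wheel_location = os.getenv("VLLM_PRECOMPILED_WHEEL_LOCATION", NIGHTLY_WHEEL)
    if os.path.isfile(wheel_location):
        print(f"Using local wheel: {wheel_location}")
    else:
        print(f"Downloading wheel from: {wheel_location}")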
@@ -148,9 +127,13 @@ If you want to modify C++ or CUDA code, you'll need to build vLLM from source. T
 .. tip::

     Building from source requires a lot of compilation. If you are building from source repeatedly, it's more efficient to cache the compilation results.

     For example, you can install `ccache <https://github.com/ccache/ccache>`_ using ``conda install ccache`` or ``apt install ccache``.
     As long as the ``which ccache`` command can find the ``ccache`` binary, it will be used automatically by the build system. After the first build, subsequent builds will be much faster.

+    `sccache <https://github.com/mozilla/sccache>`_ works similarly to ``ccache``, but has the capability to utilize caching in remote storage environments.
+    The following environment variables can be set to configure the vLLM ``sccache`` remote: ``SCCACHE_BUCKET=vllm-build-sccache SCCACHE_REGION=us-west-2 SCCACHE_S3_NO_CREDENTIALS=1``. We also recommend setting ``SCCACHE_IDLE_TIMEOUT=0``.
+

 Use an existing PyTorch installation
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
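Editor's note: as a rough illustration of the caching tip above, here is a small sketch (not part of the build system) that checks for ``ccache`` on the PATH and otherwise exports the ``sccache`` settings quoted in the tip before starting a build:

.. code-block:: python

    import os
    import shutil

    # Sketch only: the build system picks up ccache automatically when it is
    # on PATH; the sccache variables below are the ones listed in the tip.
    if shutil.which("ccache"):
        print("ccache found; it will be used automatically by the build")
    elif shutil.which("sccache"):
        os.environ.setdefault("SCCACHE_BUCKET", "vllm-build-sccache")
        os.environ.setdefault("SCCACHE_REGION", "us-west-2")
        os.environ.setdefault("SCCACHE_S3_NO_CREDENTIALS", "1")
        os.environ.setdefault("SCCACHE_IDLE_TIMEOUT", "0")
        print("sccache configured with the vLLM remote cache settings")
    else:
        print("no compiler cache found; repeated builds will not be cached")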
python_only_dev.py
@@ -1,92 +1,14 @@
-# enable python only development
-# copy compiled files to the current directory directly
+msg = """Old style python only build (without compilation) is deprecated, please check https://docs.vllm.ai/en/latest/getting_started/installation.html#python-only-build-without-compilation for the new way to do python only build (without compilation).

-import argparse
-import os
-import shutil
-import subprocess
-import sys
-import warnings
+TL;DR:

-parser = argparse.ArgumentParser(
-    description="Development mode for python-only code")
-parser.add_argument('-q',
-                    '--quit-dev',
-                    action='store_true',
-                    help='Set the flag to quit development mode')
-args = parser.parse_args()
+VLLM_USE_PRECOMPILED=1 pip install -e .

-# cannot directly `import vllm` , because it will try to
-# import from the current directory
-output = subprocess.run([sys.executable, "-m", "pip", "show", "vllm"],
-                        capture_output=True)
+or

-assert output.returncode == 0, "vllm is not installed"
+export VLLM_COMMIT=33f460b17a54acb3b6cc0b03f4a17876cff5eafd # use full commit hash from the main branch
+export VLLM_PRECOMPILED_WHEEL_LOCATION=https://vllm-wheels.s3.us-west-2.amazonaws.com/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl
+pip install -e .
+"""  # noqa

-text = output.stdout.decode("utf-8")
-
-package_path = None
-for line in text.split("\n"):
-    if line.startswith("Location: "):
-        package_path = line.split(": ")[1]
-        break
-
-assert package_path is not None, "could not find package path"
-
-cwd = os.getcwd()
-
-assert cwd != package_path, "should not import from the current directory"
-
-files_to_copy = [
-    "vllm/_C.abi3.so",
-    "vllm/_moe_C.abi3.so",
-    "vllm/vllm_flash_attn/vllm_flash_attn_c.abi3.so",
-    "vllm/vllm_flash_attn/flash_attn_interface.py",
-    "vllm/vllm_flash_attn/__init__.py",
-    # "vllm/_version.py", # not available in nightly wheels yet
-]
-
-# Try to create _version.py to avoid version related warning
-# Refer to https://github.com/vllm-project/vllm/pull/8771
-try:
-    from setuptools_scm import get_version
-    get_version(write_to="vllm/_version.py")
-except ImportError:
-    warnings.warn(
-        "To avoid warnings related to vllm._version, "
-        "you should install setuptools-scm by `pip install setuptools-scm`",
-        stacklevel=2)
-
-if not args.quit_dev:
-    for file in files_to_copy:
-        src = os.path.join(package_path, file)
-        dst = file
-        print(f"Copying {src} to {dst}")
-        shutil.copyfile(src, dst)
-
-    pre_built_vllm_path = os.path.join(package_path, "vllm")
-    tmp_path = os.path.join(package_path, "vllm_pre_built")
-    current_vllm_path = os.path.join(cwd, "vllm")
-
-    print(f"Renaming {pre_built_vllm_path} to {tmp_path} for backup")
-    shutil.copytree(pre_built_vllm_path, tmp_path)
-    shutil.rmtree(pre_built_vllm_path)
-
-    print(f"Linking {current_vllm_path} to {pre_built_vllm_path}")
-    os.symlink(current_vllm_path, pre_built_vllm_path)
-else:
-    vllm_symlink_path = os.path.join(package_path, "vllm")
-    vllm_backup_path = os.path.join(package_path, "vllm_pre_built")
-    current_vllm_path = os.path.join(cwd, "vllm")
-
-    print(f"Unlinking {current_vllm_path} to {vllm_symlink_path}")
-    assert os.path.islink(
-        vllm_symlink_path
-    ), f"not in dev mode: {vllm_symlink_path} is not a symbolic link"
-    assert current_vllm_path == os.readlink(
-        vllm_symlink_path
-    ), "current directory is not the source code of package"
-    os.unlink(vllm_symlink_path)
-
-    print(f"Recovering backup from {vllm_backup_path} to {vllm_symlink_path}")
-    os.rename(vllm_backup_path, vllm_symlink_path)
+print(msg)
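Editor's note: after the new-style editable install suggested by the deprecation message, a quick hypothetical check (not part of this commit) that the in-tree sources are the ones being imported is to see where the package resolves from:

.. code-block:: python

    # With an editable install, vllm should resolve to the cloned source tree
    # rather than a copy in site-packages.
    import vllm

    print(vllm.__file__)  # expect a path inside your vllm checkout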
setup.py
@@ -249,6 +249,74 @@ class cmake_build_ext(build_ext):
             self.copy_file(file, dst_file)


+class repackage_wheel(build_ext):
+    """Extracts libraries and other files from an existing wheel."""
+    default_wheel = "https://vllm-wheels.s3.us-west-2.amazonaws.com/nightly/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"
+
+    def run(self) -> None:
+        wheel_location = os.getenv("VLLM_PRECOMPILED_WHEEL_LOCATION",
+                                   self.default_wheel)
+
+        assert _is_cuda(
+        ), "VLLM_USE_PRECOMPILED is only supported for CUDA builds"
+
+        import zipfile
+
+        if os.path.isfile(wheel_location):
+            wheel_path = wheel_location
+            print(f"Using existing wheel={wheel_path}")
+        else:
+            # Download the wheel from a given URL, assume
+            # the filename is the last part of the URL
+            wheel_filename = wheel_location.split("/")[-1]
+
+            import tempfile
+
+            # create a temporary directory to store the wheel
+            temp_dir = tempfile.mkdtemp(prefix="vllm-wheels")
+            wheel_path = os.path.join(temp_dir, wheel_filename)
+
+            print(f"Downloading wheel from {wheel_location} to {wheel_path}")
+
+            from urllib.request import urlretrieve
+
+            try:
+                urlretrieve(wheel_location, filename=wheel_path)
+            except Exception as e:
+                from setuptools.errors import SetupError
+
+                raise SetupError(
+                    f"Failed to get vLLM wheel from {wheel_location}") from e
+
+        with zipfile.ZipFile(wheel_path) as wheel:
+            files_to_copy = [
+                "vllm/_C.abi3.so",
+                "vllm/_moe_C.abi3.so",
+                "vllm/vllm_flash_attn/vllm_flash_attn_c.abi3.so",
+                "vllm/vllm_flash_attn/flash_attn_interface.py",
+                "vllm/vllm_flash_attn/__init__.py",
+                # "vllm/_version.py", # not available in nightly wheels yet
+            ]
+            file_members = filter(lambda x: x.filename in files_to_copy,
+                                  wheel.filelist)
+
+            for file in file_members:
+                print(f"Extracting and including {file.filename} "
+                      "from existing wheel")
+                package_name = os.path.dirname(file.filename).replace("/", ".")
+                file_name = os.path.basename(file.filename)
+
+                if package_name not in package_data:
+                    package_data[package_name] = []
+
+                wheel.extract(file)
+                if file_name.endswith(".py"):
+                    # python files shouldn't be added to package_data
+                    continue
+
+                package_data[package_name].append(file_name)
+
+
 def _is_hpu() -> bool:
     is_hpu_available = True
     try:
@@ -403,6 +471,8 @@ def get_vllm_version() -> str:
             # skip this for source tarball, required for pypi
             if "sdist" not in sys.argv:
                 version += f"{sep}cu{cuda_version_str}"
+        if envs.VLLM_USE_PRECOMPILED:
+            version += ".precompiled"
     elif _is_hip():
         # Get the HIP version
         hipcc_version = get_hipcc_rocm_version()
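Editor's note: for illustration, the suffix logic in this hunk composes roughly as follows; the base version, ``sep``, and CUDA string below are hypothetical inputs, not values taken from setup.py:

.. code-block:: python

    # Hypothetical inputs to show how the ".precompiled" suffix is appended.
    version = "1.0.0.dev"
    sep = "+"
    cuda_version_str = "121"
    use_precompiled = True

    version += f"{sep}cu{cuda_version_str}"
    if use_precompiled:
        version += ".precompiled"

    print(version)  # -> 1.0.0.dev+cu121.precompiled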
@@ -514,13 +584,18 @@ if _build_custom_ops():
 package_data = {
     "vllm": ["py.typed", "model_executor/layers/fused_moe/configs/*.json"]
 }
+if envs.VLLM_USE_PRECOMPILED:
+    ext_modules = []
+    package_data["vllm"].append("*.so")

 if _no_device():
     ext_modules = []

+if not ext_modules:
+    cmdclass = {}
+else:
+    cmdclass = {
+        "build_ext":
+        repackage_wheel if envs.VLLM_USE_PRECOMPILED else cmake_build_ext
+    }
+
 setup(
     name="vllm",
     version=get_vllm_version(),
@@ -557,7 +632,7 @@ setup(
         "audio": ["librosa", "soundfile"],  # Required for audio processing
         "video": ["decord"]  # Required for video processing
     },
-    cmdclass={"build_ext": cmake_build_ext} if len(ext_modules) > 0 else {},
+    cmdclass=cmdclass,
     package_data=package_data,
     entry_points={
         "console_scripts": [
vllm/envs.py
@@ -113,7 +113,8 @@ environment_variables: Dict[str, Callable[[], Any]] = {

     # If set, vllm will use precompiled binaries (*.so)
     "VLLM_USE_PRECOMPILED":
-    lambda: bool(os.environ.get("VLLM_USE_PRECOMPILED")),
+    lambda: bool(os.environ.get("VLLM_USE_PRECOMPILED")) or bool(
+        os.environ.get("VLLM_PRECOMPILED_WHEEL_LOCATION")),

     # CMake build type
     # If not set, defaults to "Debug" or "RelWithDebInfo"
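Editor's note: the change above means ``VLLM_USE_PRECOMPILED`` is treated as enabled whenever either variable is present. A standalone sketch of that check, mirroring the lambda shown in the hunk:

.. code-block:: python

    import os

    # Either variable being set enables the precompiled path
    # (illustration only, not the envs.py module itself).
    use_precompiled = bool(os.environ.get("VLLM_USE_PRECOMPILED")) or bool(
        os.environ.get("VLLM_PRECOMPILED_WHEEL_LOCATION"))

    print(f"VLLM_USE_PRECOMPILED -> {use_precompiled}")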