[Doc] fix the incorrect module path of tensorize_vllm_model (#13863)

commit e656f638de
parent 145944cb94
Author: Albert
Date: 2025-02-26 14:56:19 +08:00
Committed by: GitHub


@@ -27,7 +27,7 @@ https://github.com/coreweave/tensorizer
 To serialize a model, install vLLM from source, then run something
 like this from the root level of this repository:

-python -m examples.offline_inference.tensorize_vllm_model \
+python -m examples.other.tensorize_vllm_model \
    --model facebook/opt-125m \
    serialize \
    --serialized-directory s3://my-bucket \
@@ -47,7 +47,7 @@ providing a `--keyfile` argument.
 To deserialize a model, you can run something like this from the root
 level of this repository:

-python -m examples.offline_inference.tensorize_vllm_model \
+python -m examples.other.tensorize_vllm_model \
    --model EleutherAI/gpt-j-6B \
    --dtype float16 \
    deserialize \
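Likewise, a complete deserialize invocation under the new path might look like this sketch; the `--path-to-tensors` argument and its URI are assumptions for illustration:

    python -m examples.other.tensorize_vllm_model \
        --model EleutherAI/gpt-j-6B \
        --dtype float16 \
        deserialize \
        --path-to-tensors s3://my-bucket/vllm/EleutherAI/gpt-j-6B/v1/model.tensors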
@@ -65,11 +65,11 @@ shard's rank. Sharded models serialized with this script will be named as

 model-rank-%03d.tensors

 For more information on the available arguments for serializing, run
-`python -m examples.offline_inference.tensorize_vllm_model serialize --help`.
+`python -m examples.other.tensorize_vllm_model serialize --help`.

 Or for deserializing:
-`python -m examples.offline_inference.tensorize_vllm_model deserialize --help`.
+`python -m examples.other.tensorize_vllm_model deserialize --help`.

 Once a model is serialized, tensorizer can be invoked with the `LLM` class
 directly to load models:
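A minimal sketch of that `LLM`-class loading path, assuming a tensorizer-serialized model; the S3 URI is a placeholder, while `load_format="tensorizer"` and `TensorizerConfig` are vLLM's tensorizer loading options:

    from vllm import LLM
    from vllm.model_executor.model_loader.tensorizer import TensorizerConfig

    # Load a tensorizer-serialized model directly with the LLM class.
    llm = LLM(
        model="facebook/opt-125m",
        load_format="tensorizer",
        model_loader_extra_config=TensorizerConfig(
            # Placeholder URI: point this at the serialized tensors.
            tensorizer_uri="s3://my-bucket/vllm/facebook/opt-125m/v1/model.tensors",
        ),
    )
    output = llm.generate("Hello, my name is")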
@@ -90,7 +90,7 @@ TensorizerConfig arguments desired.
 In order to see all of the available arguments usable to configure
 loading with tensorizer that are given to `TensorizerConfig`, run:

-`python -m examples.offline_inference.tensorize_vllm_model deserialize --help`
+`python -m examples.other.tensorize_vllm_model deserialize --help`

 under the `tensorizer options` section. These can also be used for
 deserialization in this example script, although `--tensorizer-uri` and