[Doc] fix the incorrect module path of tensorize_vllm_model (#13863)
This commit is contained in: parent 145944cb94 · commit e656f638de
@@ -27,7 +27,7 @@ https://github.com/coreweave/tensorizer
 To serialize a model, install vLLM from source, then run something
 like this from the root level of this repository:
 
-python -m examples.offline_inference.tensorize_vllm_model \
+python -m examples.other.tensorize_vllm_model \
    --model facebook/opt-125m \
    serialize \
    --serialized-directory s3://my-bucket \
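The hunk above ends mid-command; as a sketch, a complete serialize invocation with the corrected module path might look like this (the `--suffix v1` argument is an assumption used to tag the serialized artifacts and is not shown in the diff):

    python -m examples.other.tensorize_vllm_model \
       --model facebook/opt-125m \
       serialize \
       --serialized-directory s3://my-bucket \
       --suffix v1  # assumed version tag for the serialized files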
@@ -47,7 +47,7 @@ providing a `--keyfile` argument.
 To deserialize a model, you can run something like this from the root
 level of this repository:
 
-python -m examples.offline_inference.tensorize_vllm_model \
+python -m examples.other.tensorize_vllm_model \
    --model EleutherAI/gpt-j-6B \
    --dtype float16 \
    deserialize \
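Likewise, a complete deserialize invocation might look like the following sketch; the `--path-to-tensors` argument and the S3 path are assumptions mirroring the serialize example above, not part of the diff:

    python -m examples.other.tensorize_vllm_model \
       --model EleutherAI/gpt-j-6B \
       --dtype float16 \
       deserialize \
       --path-to-tensors s3://my-bucket/vllm/EleutherAI/gpt-j-6B/v1/model.tensors  # assumed flag and path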
@@ -65,11 +65,11 @@ shard's rank. Sharded models serialized with this script will be named as
 model-rank-%03d.tensors
 
 For more information on the available arguments for serializing, run
-`python -m examples.offline_inference.tensorize_vllm_model serialize --help`.
+`python -m examples.other.tensorize_vllm_model serialize --help`.
 
 Or for deserializing:
 
-`python -m examples.offline_inference.tensorize_vllm_model deserialize --help`.
+`python -m examples.other.tensorize_vllm_model deserialize --help`.
 
 Once a model is serialized, tensorizer can be invoked with the `LLM` class
 directly to load models:
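The `LLM` loading example itself falls outside this hunk; a minimal sketch of that usage, assuming vLLM's `load_format="tensorizer"` loading path and a placeholder S3 URI:

    from vllm import LLM
    from vllm.model_executor.model_loader.tensorizer import TensorizerConfig

    # Placeholder URI pointing at a previously serialized model.
    tensorizer_uri = "s3://my-bucket/vllm/facebook/opt-125m/v1/model.tensors"

    llm = LLM(
        model="facebook/opt-125m",
        load_format="tensorizer",
        model_loader_extra_config=TensorizerConfig(tensorizer_uri=tensorizer_uri),
    )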
@@ -90,7 +90,7 @@ TensorizerConfig arguments desired.
 In order to see all of the available arguments usable to configure
 loading with tensorizer that are given to `TensorizerConfig`, run:
 
-`python -m examples.offline_inference.tensorize_vllm_model deserialize --help`
+`python -m examples.other.tensorize_vllm_model deserialize --help`
 
 under the `tensorizer options` section. These can also be used for
 deserialization in this example script, although `--tensorizer-uri` and
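As a sketch of passing those tensorizer options through `TensorizerConfig`, assuming `num_readers` and `encryption_keyfile` are among the supported fields (verify against the `--help` output above):

    from vllm.model_executor.model_loader.tensorizer import TensorizerConfig

    # Assumed fields; check the `tensorizer options` section of
    # `deserialize --help` for the authoritative list.
    config = TensorizerConfig(
        tensorizer_uri="s3://my-bucket/vllm/facebook/opt-125m/v1/model.tensors",
        num_readers=4,                   # assumed: parallel read streams
        encryption_keyfile="mykey.bin",  # assumed: decrypt a model serialized with --keyfile
    )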