vllm/docs/source/models/extensions/fastsafetensor.md

Loading model weights with fastsafetensors
===================================================================

The fastsafetensors library loads model weights into GPU memory by leveraging GPU Direct Storage. See https://github.com/foundation-model-stack/fastsafetensors for more details.
To enable this feature, set the environment variable ``USE_FASTSAFETENSOR`` to ``true``.
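
As a minimal sketch of one way to set the variable from Python, the snippet below builds an environment mapping with ``USE_FASTSAFETENSOR`` set to ``true``. The helper name ``enable_fastsafetensors`` is hypothetical and not part of vLLM. It also assumes the variable should be in place before the vLLM process starts, since this page does not say when it is read.

```python
import os


def enable_fastsafetensors(env=None):
    """Return a copy of `env` (default: the current environment) with
    USE_FASTSAFETENSOR set to "true".

    Hypothetical helper for illustration only; it is not a vLLM API.
    """
    new_env = dict(os.environ if env is None else env)
    new_env["USE_FASTSAFETENSOR"] = "true"
    return new_env


# Build an environment for a child process without mutating the input mapping.
base_env = {"PATH": "/usr/bin"}
launch_env = enable_fastsafetensors(base_env)
print(launch_env["USE_FASTSAFETENSOR"])
```

The resulting mapping can be passed as ``env=launch_env`` to ``subprocess.run`` for whichever command starts vLLM. Exporting the variable in the shell before launching has the same effect.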

[Core] Integrate `fastsafetensors` loader for loading model weights (#10647) Signed-off-by: Manish Sethi <Manish.sethi1@ibm.com> 2025-03-24 11:08:02 -04:00