(multi-modality)=

# Multi-Modality
```{eval-rst}
.. currentmodule:: vllm.multimodal
```
vLLM provides experimental support for multi-modal models through the {mod}`vllm.multimodal` package.

Multi-modal inputs can be passed alongside text and token prompts to supported models
via the `multi_modal_data` field in {class}`vllm.inputs.PromptType`.
Currently, vLLM only has built-in support for image data. You can extend vLLM to process additional modalities by following this guide.
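For example, an image can be attached to a text prompt through the `multi_modal_data` field of the prompt passed to the offline `LLM` API. The sketch below is illustrative: the model name, prompt template, and image path are placeholders, and the exact prompt format depends on the model you use.

```python
from PIL import Image

from vllm import LLM

# Illustrative model choice; any vLLM-supported multi-modal model works similarly.
llm = LLM(model="llava-hf/llava-1.5-7b-hf")

# Hypothetical local image file.
image = Image.open("example.jpg")

# Attach the image via the `multi_modal_data` field of the prompt.
outputs = llm.generate({
    "prompt": "USER: <image>\nWhat is shown in this image?\nASSISTANT:",
    "multi_modal_data": {"image": image},
})

for output in outputs:
    print(output.outputs[0].text)
```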
Looking to add your own multi-modal model? Please follow the instructions listed here.
## Guides

```{toctree}
:maxdepth: 1

adding_multimodal_plugin
```
## Module Contents

```{eval-rst}
.. automodule:: vllm.multimodal
```
### Registry

```{eval-rst}
.. autodata:: vllm.multimodal.MULTIMODAL_REGISTRY
```

```{eval-rst}
.. autoclass:: vllm.multimodal.MultiModalRegistry
    :members:
    :show-inheritance:
```
### Base Classes

```{eval-rst}
.. autodata:: vllm.multimodal.NestedTensors
```

```{eval-rst}
.. autodata:: vllm.multimodal.BatchedTensorInputs
```

```{eval-rst}
.. autoclass:: vllm.multimodal.MultiModalDataBuiltins
    :members:
    :show-inheritance:
```

```{eval-rst}
.. autodata:: vllm.multimodal.MultiModalDataDict
```

```{eval-rst}
.. autoclass:: vllm.multimodal.MultiModalKwargs
    :members:
    :show-inheritance:
```

```{eval-rst}
.. autoclass:: vllm.multimodal.MultiModalPlugin
    :members:
    :show-inheritance:
```
### Image Classes

```{eval-rst}
.. automodule:: vllm.multimodal.image
    :members:
    :show-inheritance:
```