[Doc] Reorganize Supported Models by Type (#6167)
commit 175c43eca4 (parent bc96d5c330)
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -7,6 +7,8 @@ vLLM supports a variety of generative Transformer models in `HuggingFace Transformers
 The following is the list of model architectures that are currently supported by vLLM.
 Alongside each architecture, we include some popular models that use it.

+Decoder-only Language Models
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 .. list-table::
   :widths: 25 25 50 5
   :header-rows: 1
@@ -95,14 +97,6 @@ Alongside each architecture, we include some popular models that use it.
     - LLaMA, Llama 2, Meta Llama 3, Vicuna, Alpaca, Yi
     - :code:`meta-llama/Meta-Llama-3-8B-Instruct`, :code:`meta-llama/Meta-Llama-3-70B-Instruct`, :code:`meta-llama/Llama-2-13b-hf`, :code:`meta-llama/Llama-2-70b-hf`, :code:`openlm-research/open_llama_13b`, :code:`lmsys/vicuna-13b-v1.3`, :code:`01-ai/Yi-6B`, :code:`01-ai/Yi-34B`, etc.
     - ✅︎
-  * - :code:`LlavaForConditionalGeneration`
-    - LLaVA-1.5
-    - :code:`llava-hf/llava-1.5-7b-hf`, :code:`llava-hf/llava-1.5-13b-hf`, etc.
-    -
-  * - :code:`LlavaNextForConditionalGeneration`
-    - LLaVA-NeXT
-    - :code:`llava-hf/llava-v1.6-mistral-7b-hf`, :code:`llava-hf/llava-v1.6-vicuna-7b-hf`, etc.
-    -
   * - :code:`MiniCPMForCausalLM`
     - MiniCPM
     - :code:`openbmb/MiniCPM-2B-sft-bf16`, :code:`openbmb/MiniCPM-2B-dpo-bf16`, etc.
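The ✅︎ in the last column of the LLaMA row above is the :ref:`LoRA <lora>` flag, meaning LoRA adapters can be applied to these checkpoints at inference time. As a minimal sketch of what that enables (the adapter name and path below are placeholders, not taken from this diff):

.. code-block:: python

    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    # Build the engine with LoRA support enabled; the base model is one of
    # the LlamaForCausalLM checkpoints listed in the table above.
    llm = LLM(model="meta-llama/Llama-2-13b-hf", enable_lora=True)

    # Hypothetical adapter: substitute any local HF-format LoRA checkpoint.
    adapter = LoRARequest("my-adapter", 1, "/path/to/my-lora-adapter")

    outputs = llm.generate(
        "Hello, my name is",
        SamplingParams(max_tokens=32),
        lora_request=adapter,
    )
    print(outputs[0].outputs[0].text)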
@@ -143,10 +137,6 @@ Alongside each architecture, we include some popular models that use it.
     - Phi-3-Small
     - :code:`microsoft/Phi-3-small-8k-instruct`, :code:`microsoft/Phi-3-small-128k-instruct`, etc.
     -
-  * - :code:`Phi3VForCausalLM`
-    - Phi-3-Vision
-    - :code:`microsoft/Phi-3-vision-128k-instruct`, etc.
-    -
   * - :code:`QWenLMHeadModel`
     - Qwen
     - :code:`Qwen/Qwen-7B`, :code:`Qwen/Qwen-7B-Chat`, etc.
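One practical note on entries like :code:`QWenLMHeadModel`: the original Qwen checkpoints ship their modeling code on the Hugging Face Hub rather than inside Transformers itself, so loading them requires opting in to remote code. A short sketch, assuming the :code:`Qwen/Qwen-7B` checkpoint from the table:

.. code-block:: python

    from vllm import LLM

    # Qwen's architecture lives in custom code on the Hub, so vLLM (like
    # HF Transformers) needs explicit permission to execute it.
    llm = LLM(model="Qwen/Qwen-7B", trust_remote_code=True)
    print(llm.generate("Hello, my name is"))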
@@ -172,14 +162,40 @@ Alongside each architecture, we include some popular models that use it.
     - :code:`xverse/XVERSE-7B-Chat`, :code:`xverse/XVERSE-13B-Chat`, :code:`xverse/XVERSE-65B-Chat`, etc.
     -

-If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
-Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for instructions on how to implement support for your model.
-Alternatively, you can raise an issue on our `GitHub <https://github.com/vllm-project/vllm/issues>`_ project.

 .. note::
     Currently, the ROCm version of vLLM supports Mistral and Mixtral only for context lengths up to 4096.

+.. _supported_vlms:
+
+Vision Language Models
+^^^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table::
+  :widths: 25 25 50 5
+  :header-rows: 1
+
+  * - Architecture
+    - Models
+    - Example HuggingFace Models
+    - :ref:`LoRA <lora>`
+  * - :code:`LlavaForConditionalGeneration`
+    - LLaVA-1.5
+    - :code:`llava-hf/llava-1.5-7b-hf`, :code:`llava-hf/llava-1.5-13b-hf`, etc.
+    -
+  * - :code:`LlavaNextForConditionalGeneration`
+    - LLaVA-NeXT
+    - :code:`llava-hf/llava-v1.6-mistral-7b-hf`, :code:`llava-hf/llava-v1.6-vicuna-7b-hf`, etc.
+    -
+  * - :code:`Phi3VForCausalLM`
+    - Phi-3-Vision
+    - :code:`microsoft/Phi-3-vision-128k-instruct`, etc.
+    -
+
+If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
+Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` and :ref:`Adding a New Multimodal Model <adding_a_new_multimodal_model>`
+for instructions on how to implement support for your model.
+Alternatively, you can raise an issue on our `GitHub <https://github.com/vllm-project/vllm/issues>`_ project.

 .. tip::
     The easiest way to check if your model is supported is to run the program below:

@@ -210,8 +226,9 @@ Alternatively, you can raise an issue on our `GitHub <https://github.com/vllm-project/vllm/issues>`_ project.
         output = llm.generate("Hello, my name is")
         print(output)


 Model Support Policy
----------------------
+=====================

 At vLLM, we are committed to facilitating the integration and support of third-party models within our ecosystem. Our approach is designed to balance the need for robustness and the practical limitations of supporting a wide range of models. Here’s how we manage third-party model support:
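The diff context above shows only the tail of the tip's snippet; the complete check program is, roughly (the model name is a placeholder for whatever checkpoint you want to test):

.. code-block:: python

    from vllm import LLM

    # Construction fails fast if the checkpoint's architecture is not in
    # the tables above, so this doubles as a support check.
    llm = LLM(model="your-org/your-model")  # name or path of your model

    output = llm.generate("Hello, my name is")
    print(output)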
--- a/docs/source/models/vlm.rst
+++ b/docs/source/models/vlm.rst
@@ -3,7 +3,8 @@
 Using VLMs
 ==========

-vLLM provides experimental support for Vision Language Models (VLMs). This document shows you how to run and serve these models using vLLM.
+vLLM provides experimental support for Vision Language Models (VLMs). See the :ref:`list of supported VLMs here <supported_vlms>`.
+This document shows you how to run and serve these models using vLLM.

 .. important::
     We are actively iterating on VLM support. Expect breaking changes to VLM usage and development in upcoming releases without prior deprecation.
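Since the page itself warns that VLM support is experimental and subject to breaking changes, the following is only a rough sketch of offline inference with one of the newly listed VLMs. The prompt template and the :code:`multi_modal_data` field are assumptions based on the dict-style prompt interface vLLM has used for image inputs, not something fixed by this diff:

.. code-block:: python

    from PIL import Image
    from vllm import LLM

    llm = LLM(model="llava-hf/llava-1.5-7b-hf")

    # LLaVA-1.5 expects an <image> placeholder inside its chat-style prompt.
    prompt = "USER: <image>\nWhat is shown in this image?\nASSISTANT:"
    image = Image.open("example.jpg")

    outputs = llm.generate({
        "prompt": prompt,
        "multi_modal_data": {"image": image},
    })
    print(outputs[0].outputs[0].text)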