[Doc] Convert list tables to MyST (#11594)

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Cyrus Leung 2024-12-29 15:56:22 +08:00 committed by GitHub
parent 4fb8e329fd
commit 32b4c63f02
GPG Key ID: B5690EEEBB952194
6 changed files with 951 additions and 965 deletions


@@ -197,4 +197,4 @@ if __name__ == '__main__':
## Known Issues

- In `v0.5.2`, `v0.5.3`, and `v0.5.3.post1`, there is a bug caused by [zmq](https://github.com/zeromq/pyzmq/issues/2000), which can occasionally cause vLLM to hang depending on the machine configuration. The solution is to upgrade to the latest version of `vllm` to include the [fix](gh-pr:6759).
- To circumvent an NCCL [bug](https://github.com/NVIDIA/nccl/issues/1234), all vLLM processes set the environment variable `NCCL_CUMEM_ENABLE=0` to disable NCCL's `cuMem` allocator. This does not affect performance; the allocator only provides memory savings. When external processes want to set up an NCCL connection with vLLM's processes, they should also set this environment variable; otherwise, the inconsistent environment setup will cause NCCL to hang or crash, as observed in the [RLHF integration](https://github.com/OpenRLHF/OpenRLHF/pull/604) and the [discussion](gh-issue:5723#issuecomment-2554389656).
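For an external process that will join an NCCL group with vLLM, the variable must be set before any CUDA/NCCL initialization. A minimal sketch (the surrounding trainer code is hypothetical):

```python
import os

# Match vLLM's environment before any CUDA/NCCL initialization happens:
# vLLM's own processes run with NCCL's cuMem allocator disabled, and a
# mismatch here can make NCCL hang or crash during connection setup.
os.environ["NCCL_CUMEM_ENABLE"] = "0"

# Only after this point would the external process import torch and join
# the group, e.g. torch.distributed.init_process_group(backend="nccl").
```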


@@ -141,13 +141,12 @@ Gaudi2 devices. Configurations that are not listed may or may not work.
Currently, vLLM for HPU supports four execution modes, depending on the selected HPU PyTorch Bridge backend (via the `PT_HPU_LAZY_MODE` environment variable) and the `--enforce-eager` flag.
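As a sketch, a launcher script might pin the bridge backend before vLLM (and PyTorch) are imported, then pair it with the eager flag; the model name below is only a placeholder:

```python
import os

# PT_HPU_LAZY_MODE must be set before torch/vLLM are imported, since the
# HPU PyTorch Bridge reads it at initialization time.
os.environ["PT_HPU_LAZY_MODE"] = "0"

# The second half of the mode selection is vLLM's own flag; an equivalent
# server invocation would pass --enforce-eager on the command line.
launch_cmd = ["vllm", "serve", "facebook/opt-125m", "--enforce-eager"]
```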
```{list-table} vLLM execution modes
:widths: 25 25 50
:header-rows: 1

* - `PT_HPU_LAZY_MODE`
  - `enforce_eager`
  - execution mode
* - 0
  - 0


@@ -68,8 +68,7 @@ gcloud alpha compute tpus queued-resources create QUEUED_RESOURCE_ID \
--service-account SERVICE_ACCOUNT
```
```{list-table} Parameter descriptions
:header-rows: 1

* - Parameter name


@@ -72,289 +72,288 @@ See [this page](#generative-models) for more information on how to use generativ
#### Text Generation (`--task generate`)
```{list-table}
:widths: 25 25 50 5 5
:header-rows: 1

* - Architecture
  - Models
  - Example HF Models
  - [LoRA](#lora-adapter)
  - [PP](#distributed-serving)
* - `AquilaForCausalLM`
  - Aquila, Aquila2
  - `BAAI/Aquila-7B`, `BAAI/AquilaChat-7B`, etc.
  - ✅︎
  - ✅︎
* - `ArcticForCausalLM`
  - Arctic
  - `Snowflake/snowflake-arctic-base`, `Snowflake/snowflake-arctic-instruct`, etc.
  -
  - ✅︎
* - `BaiChuanForCausalLM`
  - Baichuan2, Baichuan
  - `baichuan-inc/Baichuan2-13B-Chat`, `baichuan-inc/Baichuan-7B`, etc.
  - ✅︎
  - ✅︎
* - `BloomForCausalLM`
  - BLOOM, BLOOMZ, BLOOMChat
  - `bigscience/bloom`, `bigscience/bloomz`, etc.
  -
  - ✅︎
* - `BartForConditionalGeneration`
  - BART
  - `facebook/bart-base`, `facebook/bart-large-cnn`, etc.
  -
  -
* - `ChatGLMModel`
  - ChatGLM
  - `THUDM/chatglm2-6b`, `THUDM/chatglm3-6b`, etc.
  - ✅︎
  - ✅︎
* - `CohereForCausalLM`, `Cohere2ForCausalLM`
  - Command-R
  - `CohereForAI/c4ai-command-r-v01`, `CohereForAI/c4ai-command-r7b-12-2024`, etc.
  - ✅︎
  - ✅︎
* - `DbrxForCausalLM`
  - DBRX
  - `databricks/dbrx-base`, `databricks/dbrx-instruct`, etc.
  -
  - ✅︎
* - `DeciLMForCausalLM`
  - DeciLM
  - `Deci/DeciLM-7B`, `Deci/DeciLM-7B-instruct`, etc.
  -
  - ✅︎
* - `DeepseekForCausalLM`
  - DeepSeek
  - `deepseek-ai/deepseek-llm-67b-base`, `deepseek-ai/deepseek-llm-7b-chat`, etc.
  -
  - ✅︎
* - `DeepseekV2ForCausalLM`
  - DeepSeek-V2
  - `deepseek-ai/DeepSeek-V2`, `deepseek-ai/DeepSeek-V2-Chat`, etc.
  -
  - ✅︎
* - `DeepseekV3ForCausalLM`
  - DeepSeek-V3
  - `deepseek-ai/DeepSeek-V3-Base`, `deepseek-ai/DeepSeek-V3`, etc.
  -
  - ✅︎
* - `ExaoneForCausalLM`
  - EXAONE-3
  - `LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct`, etc.
  - ✅︎
  - ✅︎
* - `FalconForCausalLM`
  - Falcon
  - `tiiuae/falcon-7b`, `tiiuae/falcon-40b`, `tiiuae/falcon-rw-7b`, etc.
  -
  - ✅︎
* - `FalconMambaForCausalLM`
  - FalconMamba
  - `tiiuae/falcon-mamba-7b`, `tiiuae/falcon-mamba-7b-instruct`, etc.
  - ✅︎
  - ✅︎
* - `GemmaForCausalLM`
  - Gemma
  - `google/gemma-2b`, `google/gemma-7b`, etc.
  - ✅︎
  - ✅︎
* - `Gemma2ForCausalLM`
  - Gemma2
  - `google/gemma-2-9b`, `google/gemma-2-27b`, etc.
  - ✅︎
  - ✅︎
* - `GlmForCausalLM`
  - GLM-4
  - `THUDM/glm-4-9b-chat-hf`, etc.
  - ✅︎
  - ✅︎
* - `GPT2LMHeadModel`
  - GPT-2
  - `gpt2`, `gpt2-xl`, etc.
  -
  - ✅︎
* - `GPTBigCodeForCausalLM`
  - StarCoder, SantaCoder, WizardCoder
  - `bigcode/starcoder`, `bigcode/gpt_bigcode-santacoder`, `WizardLM/WizardCoder-15B-V1.0`, etc.
  - ✅︎
  - ✅︎
* - `GPTJForCausalLM`
  - GPT-J
  - `EleutherAI/gpt-j-6b`, `nomic-ai/gpt4all-j`, etc.
  -
  - ✅︎
* - `GPTNeoXForCausalLM`
  - GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
  - `EleutherAI/gpt-neox-20b`, `EleutherAI/pythia-12b`, `OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, `databricks/dolly-v2-12b`, `stabilityai/stablelm-tuned-alpha-7b`, etc.
  -
  - ✅︎
* - `GraniteForCausalLM`
  - Granite 3.0, Granite 3.1, PowerLM
  - `ibm-granite/granite-3.0-2b-base`, `ibm-granite/granite-3.1-8b-instruct`, `ibm/PowerLM-3b`, etc.
  - ✅︎
  - ✅︎
* - `GraniteMoeForCausalLM`
  - Granite 3.0 MoE, PowerMoE
  - `ibm-granite/granite-3.0-1b-a400m-base`, `ibm-granite/granite-3.0-3b-a800m-instruct`, `ibm/PowerMoE-3b`, etc.
  - ✅︎
  - ✅︎
* - `GritLM`
  - GritLM
  - `parasail-ai/GritLM-7B-vllm`.
  - ✅︎
  - ✅︎
* - `InternLMForCausalLM`
  - InternLM
  - `internlm/internlm-7b`, `internlm/internlm-chat-7b`, etc.
  - ✅︎
  - ✅︎
* - `InternLM2ForCausalLM`
  - InternLM2
  - `internlm/internlm2-7b`, `internlm/internlm2-chat-7b`, etc.
  - ✅︎
  - ✅︎
* - `JAISLMHeadModel`
  - Jais
  - `inceptionai/jais-13b`, `inceptionai/jais-13b-chat`, `inceptionai/jais-30b-v3`, `inceptionai/jais-30b-chat-v3`, etc.
  -
  - ✅︎
* - `JambaForCausalLM`
  - Jamba
  - `ai21labs/AI21-Jamba-1.5-Large`, `ai21labs/AI21-Jamba-1.5-Mini`, `ai21labs/Jamba-v0.1`, etc.
  - ✅︎
  - ✅︎
* - `LlamaForCausalLM`
  - Llama 3.1, Llama 3, Llama 2, LLaMA, Yi
  - `meta-llama/Meta-Llama-3.1-405B-Instruct`, `meta-llama/Meta-Llama-3.1-70B`, `meta-llama/Meta-Llama-3-70B-Instruct`, `meta-llama/Llama-2-70b-hf`, `01-ai/Yi-34B`, etc.
  - ✅︎
  - ✅︎
* - `MambaForCausalLM`
  - Mamba
  - `state-spaces/mamba-130m-hf`, `state-spaces/mamba-790m-hf`, `state-spaces/mamba-2.8b-hf`, etc.
  -
  - ✅︎
* - `MiniCPMForCausalLM`
  - MiniCPM
  - `openbmb/MiniCPM-2B-sft-bf16`, `openbmb/MiniCPM-2B-dpo-bf16`, `openbmb/MiniCPM-S-1B-sft`, etc.
  - ✅︎
  - ✅︎
* - `MiniCPM3ForCausalLM`
  - MiniCPM3
  - `openbmb/MiniCPM3-4B`, etc.
  - ✅︎
  - ✅︎
* - `MistralForCausalLM`
  - Mistral, Mistral-Instruct
  - `mistralai/Mistral-7B-v0.1`, `mistralai/Mistral-7B-Instruct-v0.1`, etc.
  - ✅︎
  - ✅︎
* - `MixtralForCausalLM`
  - Mixtral-8x7B, Mixtral-8x7B-Instruct
  - `mistralai/Mixtral-8x7B-v0.1`, `mistralai/Mixtral-8x7B-Instruct-v0.1`, `mistral-community/Mixtral-8x22B-v0.1`, etc.
  - ✅︎
  - ✅︎
* - `MPTForCausalLM`
  - MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter
  - `mosaicml/mpt-7b`, `mosaicml/mpt-7b-storywriter`, `mosaicml/mpt-30b`, etc.
  -
  - ✅︎
* - `NemotronForCausalLM`
  - Nemotron-3, Nemotron-4, Minitron
  - `nvidia/Minitron-8B-Base`, `mgoin/Nemotron-4-340B-Base-hf-FP8`, etc.
  - ✅︎
  - ✅︎
* - `OLMoForCausalLM`
  - OLMo
  - `allenai/OLMo-1B-hf`, `allenai/OLMo-7B-hf`, etc.
  -
  - ✅︎
* - `OLMo2ForCausalLM`
  - OLMo2
  - `allenai/OLMo2-7B-1124`, etc.
  -
  - ✅︎
* - `OLMoEForCausalLM`
  - OLMoE
  - `allenai/OLMoE-1B-7B-0924`, `allenai/OLMoE-1B-7B-0924-Instruct`, etc.
  - ✅︎
  - ✅︎
* - `OPTForCausalLM`
  - OPT, OPT-IML
  - `facebook/opt-66b`, `facebook/opt-iml-max-30b`, etc.
  -
  - ✅︎
* - `OrionForCausalLM`
  - Orion
  - `OrionStarAI/Orion-14B-Base`, `OrionStarAI/Orion-14B-Chat`, etc.
  -
  - ✅︎
* - `PhiForCausalLM`
  - Phi
  - `microsoft/phi-1_5`, `microsoft/phi-2`, etc.
  - ✅︎
  - ✅︎
* - `Phi3ForCausalLM`
  - Phi-3
  - `microsoft/Phi-3-mini-4k-instruct`, `microsoft/Phi-3-mini-128k-instruct`, `microsoft/Phi-3-medium-128k-instruct`, etc.
  - ✅︎
  - ✅︎
* - `Phi3SmallForCausalLM`
  - Phi-3-Small
  - `microsoft/Phi-3-small-8k-instruct`, `microsoft/Phi-3-small-128k-instruct`, etc.
  -
  - ✅︎
* - `PhiMoEForCausalLM`
  - Phi-3.5-MoE
  - `microsoft/Phi-3.5-MoE-instruct`, etc.
  - ✅︎
  - ✅︎
* - `PersimmonForCausalLM`
  - Persimmon
  - `adept/persimmon-8b-base`, `adept/persimmon-8b-chat`, etc.
  -
  - ✅︎
* - `QWenLMHeadModel`
  - Qwen
  - `Qwen/Qwen-7B`, `Qwen/Qwen-7B-Chat`, etc.
  - ✅︎
  - ✅︎
* - `Qwen2ForCausalLM`
  - Qwen2
  - `Qwen/QwQ-32B-Preview`, `Qwen/Qwen2-7B-Instruct`, `Qwen/Qwen2-7B`, etc.
  - ✅︎
  - ✅︎
* - `Qwen2MoeForCausalLM`
  - Qwen2MoE
  - `Qwen/Qwen1.5-MoE-A2.7B`, `Qwen/Qwen1.5-MoE-A2.7B-Chat`, etc.
  -
  - ✅︎
* - `StableLmForCausalLM`
  - StableLM
  - `stabilityai/stablelm-3b-4e1t`, `stabilityai/stablelm-base-alpha-7b-v2`, etc.
  -
  - ✅︎
* - `Starcoder2ForCausalLM`
  - Starcoder2
  - `bigcode/starcoder2-3b`, `bigcode/starcoder2-7b`, `bigcode/starcoder2-15b`, etc.
  -
  - ✅︎
* - `SolarForCausalLM`
  - Solar Pro
  - `upstage/solar-pro-preview-instruct`, etc.
  - ✅︎
  - ✅︎
* - `TeleChat2ForCausalLM`
  - TeleChat2
  - `TeleAI/TeleChat2-3B`, `TeleAI/TeleChat2-7B`, `TeleAI/TeleChat2-35B`, etc.
  - ✅︎
  - ✅︎
* - `XverseForCausalLM`
  - XVERSE
  - `xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc.
  - ✅︎
  - ✅︎
```
@@ -374,49 +373,48 @@ you should explicitly specify the task type to ensure that the model is used in
#### Text Embedding (`--task embed`)
```{list-table}
:widths: 25 25 50 5 5
:header-rows: 1

* - Architecture
  - Models
  - Example HF Models
  - [LoRA](#lora-adapter)
  - [PP](#distributed-serving)
* - `BertModel`
  - BERT-based
  - `BAAI/bge-base-en-v1.5`, etc.
  -
  -
* - `Gemma2Model`
  - Gemma2-based
  - `BAAI/bge-multilingual-gemma2`, etc.
  -
  - ✅︎
* - `GritLM`
  - GritLM
  - `parasail-ai/GritLM-7B-vllm`.
  - ✅︎
  - ✅︎
* - `LlamaModel`, `LlamaForCausalLM`, `MistralModel`, etc.
  - Llama-based
  - `intfloat/e5-mistral-7b-instruct`, etc.
  - ✅︎
  - ✅︎
* - `Qwen2Model`, `Qwen2ForCausalLM`
  - Qwen2-based
  - `ssmits/Qwen2-7B-Instruct-embed-base` (see note), `Alibaba-NLP/gte-Qwen2-7B-instruct` (see note), etc.
  - ✅︎
  - ✅︎
* - `RobertaModel`, `RobertaForMaskedLM`
  - RoBERTa-based
  - `sentence-transformers/all-roberta-large-v1`, etc.
  -
  -
* - `XLMRobertaModel`
  - XLM-RoBERTa-based
  - `intfloat/multilingual-e5-large`, etc.
  -
  -
```
@@ -440,29 +438,28 @@ of the whole prompt are extracted from the normalized hidden state corresponding
#### Reward Modeling (`--task reward`)
```{list-table}
:widths: 25 25 50 5 5
:header-rows: 1

* - Architecture
  - Models
  - Example HF Models
  - [LoRA](#lora-adapter)
  - [PP](#distributed-serving)
* - `InternLM2ForRewardModel`
  - InternLM2-based
  - `internlm/internlm2-1_8b-reward`, `internlm/internlm2-7b-reward`, etc.
  - ✅︎
  - ✅︎
* - `LlamaForCausalLM`
  - Llama-based
  - `peiyi9979/math-shepherd-mistral-7b-prm`, etc.
  - ✅︎
  - ✅︎
* - `Qwen2ForRewardModel`
  - Qwen2-based
  - `Qwen/Qwen2.5-Math-RM-72B`, etc.
  - ✅︎
  - ✅︎
```
@@ -477,24 +474,23 @@ e.g.: {code}`--override-pooler-config '{"pooling_type": "STEP", "step_tag_id": 1
#### Classification (`--task classify`)
```{list-table}
:widths: 25 25 50 5 5
:header-rows: 1

* - Architecture
  - Models
  - Example HF Models
  - [LoRA](#lora-adapter)
  - [PP](#distributed-serving)
* - `JambaForSequenceClassification`
  - Jamba
  - `ai21labs/Jamba-tiny-reward-dev`, etc.
  - ✅︎
  - ✅︎
* - `Qwen2ForSequenceClassification`
  - Qwen2-based
  - `jason9693/Qwen2.5-1.5B-apeach`, etc.
  - ✅︎
  - ✅︎
```
@@ -504,29 +500,28 @@ If your model is not in the above list, we will try to automatically convert the
#### Sentence Pair Scoring (`--task score`)
```{list-table}
:widths: 25 25 50 5 5
:header-rows: 1

* - Architecture
  - Models
  - Example HF Models
  - [LoRA](#lora-adapter)
  - [PP](#distributed-serving)
* - `BertForSequenceClassification`
  - BERT-based
  - `cross-encoder/ms-marco-MiniLM-L-6-v2`, etc.
  -
  -
* - `RobertaForSequenceClassification`
  - RoBERTa-based
  - `cross-encoder/quora-roberta-base`, etc.
  -
  -
* - `XLMRobertaForSequenceClassification`
  - XLM-RoBERTa-based
  - `BAAI/bge-reranker-v2-m3`, etc.
  -
  -
```
@@ -558,8 +553,7 @@ See [this page](#generative-models) for more information on how to use generativ
#### Text Generation (`--task generate`)
```{list-table}
:widths: 25 25 15 20 5 5 5
:header-rows: 1
@@ -567,177 +561,174 @@ See [this page](#generative-models) for more information on how to use generativ
  - Models
  - Inputs
  - Example HF Models
  - [LoRA](#lora-adapter)
  - [PP](#distributed-serving)
  - [V1](gh-issue:8779)
* - `AriaForConditionalGeneration`
  - Aria
  - T + I
  - `rhymes-ai/Aria`
  -
  - ✅︎
  -
* - `Blip2ForConditionalGeneration`
  - BLIP-2
  - T + I<sup>E</sup>
  - `Salesforce/blip2-opt-2.7b`, `Salesforce/blip2-opt-6.7b`, etc.
  -
  - ✅︎
  -
* - `ChameleonForConditionalGeneration`
  - Chameleon
  - T + I
  - `facebook/chameleon-7b`, etc.
  -
  - ✅︎
  -
* - `FuyuForCausalLM`
  - Fuyu
  - T + I
  - `adept/fuyu-8b`, etc.
  -
  - ✅︎
  -
* - `ChatGLMModel`
  - GLM-4V
  - T + I
  - `THUDM/glm-4v-9b`, etc.
  - ✅︎
  - ✅︎
  -
* - `H2OVLChatModel`
  - H2OVL
  - T + I<sup>E+</sup>
  - `h2oai/h2ovl-mississippi-800m`, `h2oai/h2ovl-mississippi-2b`, etc.
  -
  - ✅︎
  -
* - `Idefics3ForConditionalGeneration`
  - Idefics3
  - T + I
  - `HuggingFaceM4/Idefics3-8B-Llama3`, etc.
  - ✅︎
  -
  -
* - `InternVLChatModel`
  - InternVL 2.5, Mono-InternVL, InternVL 2.0
  - T + I<sup>E+</sup>
  - `OpenGVLab/InternVL2_5-4B`, `OpenGVLab/Mono-InternVL-2B`, `OpenGVLab/InternVL2-4B`, etc.
  -
  - ✅︎
  - ✅︎
* - `LlavaForConditionalGeneration`
  - LLaVA-1.5
  - T + I<sup>E+</sup>
  - `llava-hf/llava-1.5-7b-hf`, `TIGER-Lab/Mantis-8B-siglip-llama3` (see note), etc.
  -
  - ✅︎
  - ✅︎
* - `LlavaNextForConditionalGeneration`
  - LLaVA-NeXT
  - T + I<sup>E+</sup>
  - `llava-hf/llava-v1.6-mistral-7b-hf`, `llava-hf/llava-v1.6-vicuna-7b-hf`, etc.
  -
  - ✅︎
  -
* - `LlavaNextVideoForConditionalGeneration`
  - LLaVA-NeXT-Video
  - T + V
  - `llava-hf/LLaVA-NeXT-Video-7B-hf`, etc.
  -
  - ✅︎
  -
* - `LlavaOnevisionForConditionalGeneration`
  - LLaVA-Onevision
  - T + I<sup>+</sup> + V<sup>+</sup>
  - `llava-hf/llava-onevision-qwen2-7b-ov-hf`, `llava-hf/llava-onevision-qwen2-0.5b-ov-hf`, etc.
  -
  - ✅︎
  -
* - `MiniCPMV`
  - MiniCPM-V
  - T + I<sup>E+</sup>
  - `openbmb/MiniCPM-V-2` (see note), `openbmb/MiniCPM-Llama3-V-2_5`, `openbmb/MiniCPM-V-2_6`, etc.
  - ✅︎
  - ✅︎
  -
* - `MllamaForConditionalGeneration`
  - Llama 3.2
  - T + I<sup>+</sup>
  - `meta-llama/Llama-3.2-90B-Vision-Instruct`, `meta-llama/Llama-3.2-11B-Vision`, etc.
  -
  -
  -
* - `MolmoForCausalLM`
  - Molmo
  - T + I
  - `allenai/Molmo-7B-D-0924`, `allenai/Molmo-72B-0924`, etc.
  -
  - ✅︎
  - ✅︎
* - `NVLM_D_Model`
  - NVLM-D 1.0
  - T + I<sup>E+</sup>
  - `nvidia/NVLM-D-72B`, etc.
  -
  - ✅︎
  - ✅︎
* - `PaliGemmaForConditionalGeneration`
  - PaliGemma, PaliGemma 2
  - T + I<sup>E</sup>
  - `google/paligemma-3b-pt-224`, `google/paligemma-3b-mix-224`, `google/paligemma2-3b-ft-docci-448`, etc.
  -
  - ✅︎
  -
* - `Phi3VForCausalLM`
  - Phi-3-Vision, Phi-3.5-Vision
  - T + I<sup>E+</sup>
  - `microsoft/Phi-3-vision-128k-instruct`, `microsoft/Phi-3.5-vision-instruct`, etc.
  -
  - ✅︎
  - ✅︎
* - `PixtralForConditionalGeneration`
  - Pixtral
  - T + I<sup>+</sup>
  - `mistralai/Pixtral-12B-2409`, `mistral-community/pixtral-12b`, etc.
  -
  - ✅︎
  - ✅︎
* - `QWenLMHeadModel`
  - Qwen-VL
  - T + I<sup>E+</sup>
  - `Qwen/Qwen-VL`, `Qwen/Qwen-VL-Chat`, etc.
  - ✅︎
  - ✅︎
  -
* - `Qwen2AudioForConditionalGeneration`
  - Qwen2-Audio
  - T + A<sup>+</sup>
  - `Qwen/Qwen2-Audio-7B-Instruct`
  -
  - ✅︎
  -
* - `Qwen2VLForConditionalGeneration`
  - Qwen2-VL
  - T + I<sup>E+</sup> + V<sup>E+</sup>
  - `Qwen/QVQ-72B-Preview`, `Qwen/Qwen2-VL-7B-Instruct`, `Qwen/Qwen2-VL-72B-Instruct`, etc.
  - ✅︎
  - ✅︎
  -
* - `UltravoxModel`
  - Ultravox
  - T + A<sup>E+</sup>
  - `fixie-ai/ultravox-v0_3`
  -
  - ✅︎
  -
```
<sup>E</sup> Pre-computed embeddings can be passed as input for this modality.
<sup>+</sup> Multiple items can be passed per text prompt for this modality.
````{important}
To enable multiple multi-modal items per text prompt, you have to set {code}`limit_mm_per_prompt` (offline inference)
@@ -787,8 +778,7 @@ To get the best results, you should use pooling models that are specifically tra
The following table lists those that are tested in vLLM.
```{list-table}
:widths: 25 25 15 25 5 5
:header-rows: 1
@@ -796,29 +786,29 @@ The following table lists those that are tested in vLLM.
  - Models
  - Inputs
  - Example HF Models
  - [LoRA](#lora-adapter)
  - [PP](#distributed-serving)
* - `LlavaNextForConditionalGeneration`
  - LLaVA-NeXT-based
  - T / I
  - `royokong/e5-v`
  -
  - ✅︎
* - `Phi3VForCausalLM`
  - Phi-3-Vision-based
  - T + I
  - `TIGER-Lab/VLM2Vec-Full`
  - 🚧
  - ✅︎
* - `Qwen2VLForConditionalGeneration`
  - Qwen2-VL-based
  - T + I
  - `MrLight/dse-qwen2-2b-mrl-v1`
  -
  - ✅︎
```
_________________
# Model Support Policy


@@ -4,8 +4,7 @@
The table below shows the compatibility of various quantization implementations with different hardware platforms in vLLM:
```{list-table}
:header-rows: 1
:widths: 20 8 8 8 8 8 8 8 8 8 8


@@ -43,8 +43,7 @@ chart **including persistent volumes** and deletes the release.
## Values
```{list-table} Values
:widths: 25 25 25 25
:header-rows: 1