[Doc] Convert list tables to MyST (#11594)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
This commit is contained in: parent 4fb8e329fd, commit 32b4c63f02
@@ -197,4 +197,4 @@ if __name__ == '__main__':
 ## Known Issues
 
 - In `v0.5.2`, `v0.5.3`, and `v0.5.3.post1`, there is a bug caused by [zmq](https://github.com/zeromq/pyzmq/issues/2000), which can occasionally cause vLLM to hang depending on the machine configuration. The solution is to upgrade to the latest version of `vllm` to include the [fix](gh-pr:6759).
-- To circumvent a NCCL [bug](https://github.com/NVIDIA/nccl/issues/1234), all vLLM processes will set an environment variable ``NCCL_CUMEM_ENABLE=0`` to disable NCCL's ``cuMem`` allocator. It does not affect performance but only gives memory benefits. When external processes want to set up a NCCL connection with vLLM's processes, they should also set this environment variable, otherwise, inconsistent environment setup will cause NCCL to hang or crash, as observed in the [RLHF integration](https://github.com/OpenRLHF/OpenRLHF/pull/604) and the [discussion](gh-issue:5723#issuecomment-2554389656).
+- To circumvent a NCCL [bug](https://github.com/NVIDIA/nccl/issues/1234), all vLLM processes will set an environment variable `NCCL_CUMEM_ENABLE=0` to disable NCCL's `cuMem` allocator. It does not affect performance but only gives memory benefits. When external processes want to set up a NCCL connection with vLLM's processes, they should also set this environment variable, otherwise, inconsistent environment setup will cause NCCL to hang or crash, as observed in the [RLHF integration](https://github.com/OpenRLHF/OpenRLHF/pull/604) and the [discussion](gh-issue:5723#issuecomment-2554389656).
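The environment-setup requirement in the NCCL note above is easy to get wrong from an external process; a minimal Python sketch, assuming the peer process prepares its environment before creating any NCCL communicator (`match_vllm_nccl_env` is a hypothetical helper, not part of vLLM):

```python
import os

def match_vllm_nccl_env(env=None):
    # Hypothetical helper: mirror the setting vLLM processes apply, so an
    # external peer joining an NCCL group sees a consistent environment.
    env = dict(os.environ if env is None else env)
    # vLLM exports NCCL_CUMEM_ENABLE=0 to disable NCCL's cuMem allocator;
    # a peer must set it *before* initializing NCCL, or NCCL may hang/crash.
    # setdefault keeps any value the user has already set explicitly.
    env.setdefault("NCCL_CUMEM_ENABLE", "0")
    return env

print(match_vllm_nccl_env({})["NCCL_CUMEM_ENABLE"])  # 0
```

The key point is ordering: the variable must be in the environment before the first NCCL call, not after.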
@@ -141,13 +141,12 @@ Gaudi2 devices. Configurations that are not listed may or may not work.
 
 Currently in vLLM for HPU we support four execution modes, depending on selected HPU PyTorch Bridge backend (via `PT_HPU_LAZY_MODE` environment variable), and `--enforce-eager` flag.
 
-```{eval-rst}
-.. list-table:: vLLM execution modes
+```{list-table} vLLM execution modes
 :widths: 25 25 50
 :header-rows: 1
 
-* - ``PT_HPU_LAZY_MODE``
-  - ``enforce_eager``
+* - `PT_HPU_LAZY_MODE`
+  - `enforce_eager`
   - execution mode
 * - 0
   - 0
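The two knobs in the paragraph above jointly select the execution mode; a rough sketch of reading that pair, where `select_hpu_mode` and the default of `"1"` for an unset `PT_HPU_LAZY_MODE` are assumptions for illustration, not vLLM's actual dispatch code:

```python
import os

def select_hpu_mode(enforce_eager, environ=None):
    # The execution mode is keyed on the pair
    # (PT_HPU_LAZY_MODE, enforce_eager); see the table in the hunk above.
    # Defaulting PT_HPU_LAZY_MODE to "1" is an assumption for this sketch.
    environ = os.environ if environ is None else environ
    lazy_mode = int(environ.get("PT_HPU_LAZY_MODE", "1"))
    return (lazy_mode, bool(enforce_eager))

print(select_hpu_mode(False, {"PT_HPU_LAZY_MODE": "0"}))  # (0, False)
```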
@@ -68,8 +68,7 @@ gcloud alpha compute tpus queued-resources create QUEUED_RESOURCE_ID \
   --service-account SERVICE_ACCOUNT
 ```
 
-```{eval-rst}
-.. list-table:: Parameter descriptions
+```{list-table} Parameter descriptions
 :header-rows: 1
 
 * - Parameter name
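The mechanical part of the conversion in these hunks (RST `:code:` roles becoming plain backticks, `:ref:` roles becoming Markdown links) can be approximated with two regular expressions; a rough sketch of the pattern, not the script actually used for this commit:

```python
import re

def rst_roles_to_myst(line):
    # :code:`x` -> `x`
    line = re.sub(r":code:`([^`]+)`", r"`\1`", line)
    # :ref:`Text <target>` -> [Text](#target)
    line = re.sub(r":ref:`([^<`]+?)\s*<([^>`]+)>`", r"[\1](#\2)", line)
    return line

print(rst_roles_to_myst("* - :code:`AquilaForCausalLM`"))
print(rst_roles_to_myst("- :ref:`LoRA <lora-adapter>`"))
```

A real conversion would also rewrite the `{eval-rst}` fences themselves, as the surrounding hunks show.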
@@ -72,289 +72,288 @@ See [this page](#generative-models) for more information on how to use generativ
 
 #### Text Generation (`--task generate`)
 
-```{eval-rst}
-.. list-table::
+```{list-table}
 :widths: 25 25 50 5 5
 :header-rows: 1
 
 * - Architecture
   - Models
   - Example HF Models
-  - :ref:`LoRA <lora-adapter>`
-  - :ref:`PP <distributed-serving>`
+  - [LoRA](#lora-adapter)
+  - [PP](#distributed-serving)
-* - :code:`AquilaForCausalLM`
+* - `AquilaForCausalLM`
   - Aquila, Aquila2
-  - :code:`BAAI/Aquila-7B`, :code:`BAAI/AquilaChat-7B`, etc.
+  - `BAAI/Aquila-7B`, `BAAI/AquilaChat-7B`, etc.
   - ✅︎
   - ✅︎
-* - :code:`ArcticForCausalLM`
+* - `ArcticForCausalLM`
   - Arctic
-  - :code:`Snowflake/snowflake-arctic-base`, :code:`Snowflake/snowflake-arctic-instruct`, etc.
+  - `Snowflake/snowflake-arctic-base`, `Snowflake/snowflake-arctic-instruct`, etc.
   -
   - ✅︎
-* - :code:`BaiChuanForCausalLM`
+* - `BaiChuanForCausalLM`
   - Baichuan2, Baichuan
-  - :code:`baichuan-inc/Baichuan2-13B-Chat`, :code:`baichuan-inc/Baichuan-7B`, etc.
+  - `baichuan-inc/Baichuan2-13B-Chat`, `baichuan-inc/Baichuan-7B`, etc.
   - ✅︎
   - ✅︎
-* - :code:`BloomForCausalLM`
+* - `BloomForCausalLM`
   - BLOOM, BLOOMZ, BLOOMChat
-  - :code:`bigscience/bloom`, :code:`bigscience/bloomz`, etc.
+  - `bigscience/bloom`, `bigscience/bloomz`, etc.
   -
   - ✅︎
-* - :code:`BartForConditionalGeneration`
+* - `BartForConditionalGeneration`
   - BART
-  - :code:`facebook/bart-base`, :code:`facebook/bart-large-cnn`, etc.
+  - `facebook/bart-base`, `facebook/bart-large-cnn`, etc.
   -
   -
-* - :code:`ChatGLMModel`
+* - `ChatGLMModel`
   - ChatGLM
-  - :code:`THUDM/chatglm2-6b`, :code:`THUDM/chatglm3-6b`, etc.
+  - `THUDM/chatglm2-6b`, `THUDM/chatglm3-6b`, etc.
   - ✅︎
   - ✅︎
-* - :code:`CohereForCausalLM`, :code:`Cohere2ForCausalLM`
+* - `CohereForCausalLM`, `Cohere2ForCausalLM`
   - Command-R
-  - :code:`CohereForAI/c4ai-command-r-v01`, :code:`CohereForAI/c4ai-command-r7b-12-2024`, etc.
+  - `CohereForAI/c4ai-command-r-v01`, `CohereForAI/c4ai-command-r7b-12-2024`, etc.
   - ✅︎
   - ✅︎
-* - :code:`DbrxForCausalLM`
+* - `DbrxForCausalLM`
   - DBRX
-  - :code:`databricks/dbrx-base`, :code:`databricks/dbrx-instruct`, etc.
+  - `databricks/dbrx-base`, `databricks/dbrx-instruct`, etc.
   -
   - ✅︎
-* - :code:`DeciLMForCausalLM`
+* - `DeciLMForCausalLM`
   - DeciLM
-  - :code:`Deci/DeciLM-7B`, :code:`Deci/DeciLM-7B-instruct`, etc.
+  - `Deci/DeciLM-7B`, `Deci/DeciLM-7B-instruct`, etc.
   -
   - ✅︎
-* - :code:`DeepseekForCausalLM`
+* - `DeepseekForCausalLM`
   - DeepSeek
-  - :code:`deepseek-ai/deepseek-llm-67b-base`, :code:`deepseek-ai/deepseek-llm-7b-chat` etc.
+  - `deepseek-ai/deepseek-llm-67b-base`, `deepseek-ai/deepseek-llm-7b-chat` etc.
   -
   - ✅︎
-* - :code:`DeepseekV2ForCausalLM`
+* - `DeepseekV2ForCausalLM`
   - DeepSeek-V2
-  - :code:`deepseek-ai/DeepSeek-V2`, :code:`deepseek-ai/DeepSeek-V2-Chat` etc.
+  - `deepseek-ai/DeepSeek-V2`, `deepseek-ai/DeepSeek-V2-Chat` etc.
   -
   - ✅︎
-* - :code:`DeepseekV3ForCausalLM`
+* - `DeepseekV3ForCausalLM`
   - DeepSeek-V3
-  - :code:`deepseek-ai/DeepSeek-V3-Base`, :code:`deepseek-ai/DeepSeek-V3` etc.
+  - `deepseek-ai/DeepSeek-V3-Base`, `deepseek-ai/DeepSeek-V3` etc.
   -
   - ✅︎
-* - :code:`ExaoneForCausalLM`
+* - `ExaoneForCausalLM`
   - EXAONE-3
-  - :code:`LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct`, etc.
+  - `LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct`, etc.
   - ✅︎
   - ✅︎
-* - :code:`FalconForCausalLM`
+* - `FalconForCausalLM`
   - Falcon
-  - :code:`tiiuae/falcon-7b`, :code:`tiiuae/falcon-40b`, :code:`tiiuae/falcon-rw-7b`, etc.
+  - `tiiuae/falcon-7b`, `tiiuae/falcon-40b`, `tiiuae/falcon-rw-7b`, etc.
   -
   - ✅︎
-* - :code:`FalconMambaForCausalLM`
+* - `FalconMambaForCausalLM`
   - FalconMamba
-  - :code:`tiiuae/falcon-mamba-7b`, :code:`tiiuae/falcon-mamba-7b-instruct`, etc.
+  - `tiiuae/falcon-mamba-7b`, `tiiuae/falcon-mamba-7b-instruct`, etc.
   - ✅︎
   - ✅︎
-* - :code:`GemmaForCausalLM`
+* - `GemmaForCausalLM`
   - Gemma
-  - :code:`google/gemma-2b`, :code:`google/gemma-7b`, etc.
+  - `google/gemma-2b`, `google/gemma-7b`, etc.
   - ✅︎
   - ✅︎
-* - :code:`Gemma2ForCausalLM`
+* - `Gemma2ForCausalLM`
   - Gemma2
-  - :code:`google/gemma-2-9b`, :code:`google/gemma-2-27b`, etc.
+  - `google/gemma-2-9b`, `google/gemma-2-27b`, etc.
   - ✅︎
   - ✅︎
-* - :code:`GlmForCausalLM`
+* - `GlmForCausalLM`
   - GLM-4
-  - :code:`THUDM/glm-4-9b-chat-hf`, etc.
+  - `THUDM/glm-4-9b-chat-hf`, etc.
   - ✅︎
   - ✅︎
-* - :code:`GPT2LMHeadModel`
+* - `GPT2LMHeadModel`
   - GPT-2
-  - :code:`gpt2`, :code:`gpt2-xl`, etc.
+  - `gpt2`, `gpt2-xl`, etc.
   -
   - ✅︎
-* - :code:`GPTBigCodeForCausalLM`
+* - `GPTBigCodeForCausalLM`
   - StarCoder, SantaCoder, WizardCoder
-  - :code:`bigcode/starcoder`, :code:`bigcode/gpt_bigcode-santacoder`, :code:`WizardLM/WizardCoder-15B-V1.0`, etc.
+  - `bigcode/starcoder`, `bigcode/gpt_bigcode-santacoder`, `WizardLM/WizardCoder-15B-V1.0`, etc.
   - ✅︎
   - ✅︎
-* - :code:`GPTJForCausalLM`
+* - `GPTJForCausalLM`
   - GPT-J
-  - :code:`EleutherAI/gpt-j-6b`, :code:`nomic-ai/gpt4all-j`, etc.
+  - `EleutherAI/gpt-j-6b`, `nomic-ai/gpt4all-j`, etc.
   -
   - ✅︎
-* - :code:`GPTNeoXForCausalLM`
+* - `GPTNeoXForCausalLM`
   - GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
-  - :code:`EleutherAI/gpt-neox-20b`, :code:`EleutherAI/pythia-12b`, :code:`OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, :code:`databricks/dolly-v2-12b`, :code:`stabilityai/stablelm-tuned-alpha-7b`, etc.
+  - `EleutherAI/gpt-neox-20b`, `EleutherAI/pythia-12b`, `OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, `databricks/dolly-v2-12b`, `stabilityai/stablelm-tuned-alpha-7b`, etc.
   -
   - ✅︎
-* - :code:`GraniteForCausalLM`
+* - `GraniteForCausalLM`
   - Granite 3.0, Granite 3.1, PowerLM
-  - :code:`ibm-granite/granite-3.0-2b-base`, :code:`ibm-granite/granite-3.1-8b-instruct`, :code:`ibm/PowerLM-3b`, etc.
+  - `ibm-granite/granite-3.0-2b-base`, `ibm-granite/granite-3.1-8b-instruct`, `ibm/PowerLM-3b`, etc.
   - ✅︎
   - ✅︎
-* - :code:`GraniteMoeForCausalLM`
+* - `GraniteMoeForCausalLM`
   - Granite 3.0 MoE, PowerMoE
-  - :code:`ibm-granite/granite-3.0-1b-a400m-base`, :code:`ibm-granite/granite-3.0-3b-a800m-instruct`, :code:`ibm/PowerMoE-3b`, etc.
+  - `ibm-granite/granite-3.0-1b-a400m-base`, `ibm-granite/granite-3.0-3b-a800m-instruct`, `ibm/PowerMoE-3b`, etc.
   - ✅︎
   - ✅︎
-* - :code:`GritLM`
+* - `GritLM`
   - GritLM
-  - :code:`parasail-ai/GritLM-7B-vllm`.
+  - `parasail-ai/GritLM-7B-vllm`.
   - ✅︎
   - ✅︎
-* - :code:`InternLMForCausalLM`
+* - `InternLMForCausalLM`
   - InternLM
-  - :code:`internlm/internlm-7b`, :code:`internlm/internlm-chat-7b`, etc.
+  - `internlm/internlm-7b`, `internlm/internlm-chat-7b`, etc.
   - ✅︎
   - ✅︎
-* - :code:`InternLM2ForCausalLM`
+* - `InternLM2ForCausalLM`
   - InternLM2
-  - :code:`internlm/internlm2-7b`, :code:`internlm/internlm2-chat-7b`, etc.
+  - `internlm/internlm2-7b`, `internlm/internlm2-chat-7b`, etc.
   - ✅︎
   - ✅︎
-* - :code:`JAISLMHeadModel`
+* - `JAISLMHeadModel`
   - Jais
-  - :code:`inceptionai/jais-13b`, :code:`inceptionai/jais-13b-chat`, :code:`inceptionai/jais-30b-v3`, :code:`inceptionai/jais-30b-chat-v3`, etc.
+  - `inceptionai/jais-13b`, `inceptionai/jais-13b-chat`, `inceptionai/jais-30b-v3`, `inceptionai/jais-30b-chat-v3`, etc.
   -
   - ✅︎
-* - :code:`JambaForCausalLM`
+* - `JambaForCausalLM`
   - Jamba
-  - :code:`ai21labs/AI21-Jamba-1.5-Large`, :code:`ai21labs/AI21-Jamba-1.5-Mini`, :code:`ai21labs/Jamba-v0.1`, etc.
+  - `ai21labs/AI21-Jamba-1.5-Large`, `ai21labs/AI21-Jamba-1.5-Mini`, `ai21labs/Jamba-v0.1`, etc.
   - ✅︎
   - ✅︎
-* - :code:`LlamaForCausalLM`
+* - `LlamaForCausalLM`
   - Llama 3.1, Llama 3, Llama 2, LLaMA, Yi
-  - :code:`meta-llama/Meta-Llama-3.1-405B-Instruct`, :code:`meta-llama/Meta-Llama-3.1-70B`, :code:`meta-llama/Meta-Llama-3-70B-Instruct`, :code:`meta-llama/Llama-2-70b-hf`, :code:`01-ai/Yi-34B`, etc.
+  - `meta-llama/Meta-Llama-3.1-405B-Instruct`, `meta-llama/Meta-Llama-3.1-70B`, `meta-llama/Meta-Llama-3-70B-Instruct`, `meta-llama/Llama-2-70b-hf`, `01-ai/Yi-34B`, etc.
   - ✅︎
   - ✅︎
-* - :code:`MambaForCausalLM`
+* - `MambaForCausalLM`
   - Mamba
-  - :code:`state-spaces/mamba-130m-hf`, :code:`state-spaces/mamba-790m-hf`, :code:`state-spaces/mamba-2.8b-hf`, etc.
+  - `state-spaces/mamba-130m-hf`, `state-spaces/mamba-790m-hf`, `state-spaces/mamba-2.8b-hf`, etc.
   -
   - ✅︎
-* - :code:`MiniCPMForCausalLM`
+* - `MiniCPMForCausalLM`
   - MiniCPM
-  - :code:`openbmb/MiniCPM-2B-sft-bf16`, :code:`openbmb/MiniCPM-2B-dpo-bf16`, :code:`openbmb/MiniCPM-S-1B-sft`, etc.
+  - `openbmb/MiniCPM-2B-sft-bf16`, `openbmb/MiniCPM-2B-dpo-bf16`, `openbmb/MiniCPM-S-1B-sft`, etc.
   - ✅︎
   - ✅︎
-* - :code:`MiniCPM3ForCausalLM`
+* - `MiniCPM3ForCausalLM`
   - MiniCPM3
-  - :code:`openbmb/MiniCPM3-4B`, etc.
+  - `openbmb/MiniCPM3-4B`, etc.
   - ✅︎
   - ✅︎
-* - :code:`MistralForCausalLM`
+* - `MistralForCausalLM`
   - Mistral, Mistral-Instruct
-  - :code:`mistralai/Mistral-7B-v0.1`, :code:`mistralai/Mistral-7B-Instruct-v0.1`, etc.
+  - `mistralai/Mistral-7B-v0.1`, `mistralai/Mistral-7B-Instruct-v0.1`, etc.
   - ✅︎
   - ✅︎
-* - :code:`MixtralForCausalLM`
+* - `MixtralForCausalLM`
   - Mixtral-8x7B, Mixtral-8x7B-Instruct
-  - :code:`mistralai/Mixtral-8x7B-v0.1`, :code:`mistralai/Mixtral-8x7B-Instruct-v0.1`, :code:`mistral-community/Mixtral-8x22B-v0.1`, etc.
+  - `mistralai/Mixtral-8x7B-v0.1`, `mistralai/Mixtral-8x7B-Instruct-v0.1`, `mistral-community/Mixtral-8x22B-v0.1`, etc.
   - ✅︎
   - ✅︎
-* - :code:`MPTForCausalLM`
+* - `MPTForCausalLM`
   - MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter
-  - :code:`mosaicml/mpt-7b`, :code:`mosaicml/mpt-7b-storywriter`, :code:`mosaicml/mpt-30b`, etc.
+  - `mosaicml/mpt-7b`, `mosaicml/mpt-7b-storywriter`, `mosaicml/mpt-30b`, etc.
   -
   - ✅︎
-* - :code:`NemotronForCausalLM`
+* - `NemotronForCausalLM`
   - Nemotron-3, Nemotron-4, Minitron
-  - :code:`nvidia/Minitron-8B-Base`, :code:`mgoin/Nemotron-4-340B-Base-hf-FP8`, etc.
+  - `nvidia/Minitron-8B-Base`, `mgoin/Nemotron-4-340B-Base-hf-FP8`, etc.
   - ✅︎
   - ✅︎
-* - :code:`OLMoForCausalLM`
+* - `OLMoForCausalLM`
   - OLMo
-  - :code:`allenai/OLMo-1B-hf`, :code:`allenai/OLMo-7B-hf`, etc.
+  - `allenai/OLMo-1B-hf`, `allenai/OLMo-7B-hf`, etc.
   -
   - ✅︎
-* - :code:`OLMo2ForCausalLM`
+* - `OLMo2ForCausalLM`
   - OLMo2
-  - :code:`allenai/OLMo2-7B-1124`, etc.
+  - `allenai/OLMo2-7B-1124`, etc.
   -
   - ✅︎
-* - :code:`OLMoEForCausalLM`
+* - `OLMoEForCausalLM`
   - OLMoE
-  - :code:`allenai/OLMoE-1B-7B-0924`, :code:`allenai/OLMoE-1B-7B-0924-Instruct`, etc.
+  - `allenai/OLMoE-1B-7B-0924`, `allenai/OLMoE-1B-7B-0924-Instruct`, etc.
   - ✅︎
   - ✅︎
-* - :code:`OPTForCausalLM`
+* - `OPTForCausalLM`
   - OPT, OPT-IML
-  - :code:`facebook/opt-66b`, :code:`facebook/opt-iml-max-30b`, etc.
+  - `facebook/opt-66b`, `facebook/opt-iml-max-30b`, etc.
   -
   - ✅︎
-* - :code:`OrionForCausalLM`
+* - `OrionForCausalLM`
   - Orion
-  - :code:`OrionStarAI/Orion-14B-Base`, :code:`OrionStarAI/Orion-14B-Chat`, etc.
+  - `OrionStarAI/Orion-14B-Base`, `OrionStarAI/Orion-14B-Chat`, etc.
   -
   - ✅︎
-* - :code:`PhiForCausalLM`
+* - `PhiForCausalLM`
   - Phi
-  - :code:`microsoft/phi-1_5`, :code:`microsoft/phi-2`, etc.
+  - `microsoft/phi-1_5`, `microsoft/phi-2`, etc.
   - ✅︎
   - ✅︎
-* - :code:`Phi3ForCausalLM`
+* - `Phi3ForCausalLM`
   - Phi-3
-  - :code:`microsoft/Phi-3-mini-4k-instruct`, :code:`microsoft/Phi-3-mini-128k-instruct`, :code:`microsoft/Phi-3-medium-128k-instruct`, etc.
+  - `microsoft/Phi-3-mini-4k-instruct`, `microsoft/Phi-3-mini-128k-instruct`, `microsoft/Phi-3-medium-128k-instruct`, etc.
   - ✅︎
   - ✅︎
-* - :code:`Phi3SmallForCausalLM`
+* - `Phi3SmallForCausalLM`
   - Phi-3-Small
-  - :code:`microsoft/Phi-3-small-8k-instruct`, :code:`microsoft/Phi-3-small-128k-instruct`, etc.
+  - `microsoft/Phi-3-small-8k-instruct`, `microsoft/Phi-3-small-128k-instruct`, etc.
   -
   - ✅︎
-* - :code:`PhiMoEForCausalLM`
+* - `PhiMoEForCausalLM`
   - Phi-3.5-MoE
-  - :code:`microsoft/Phi-3.5-MoE-instruct`, etc.
+  - `microsoft/Phi-3.5-MoE-instruct`, etc.
   - ✅︎
   - ✅︎
-* - :code:`PersimmonForCausalLM`
+* - `PersimmonForCausalLM`
   - Persimmon
-  - :code:`adept/persimmon-8b-base`, :code:`adept/persimmon-8b-chat`, etc.
+  - `adept/persimmon-8b-base`, `adept/persimmon-8b-chat`, etc.
   -
   - ✅︎
-* - :code:`QWenLMHeadModel`
+* - `QWenLMHeadModel`
   - Qwen
-  - :code:`Qwen/Qwen-7B`, :code:`Qwen/Qwen-7B-Chat`, etc.
+  - `Qwen/Qwen-7B`, `Qwen/Qwen-7B-Chat`, etc.
   - ✅︎
   - ✅︎
-* - :code:`Qwen2ForCausalLM`
+* - `Qwen2ForCausalLM`
   - Qwen2
-  - :code:`Qwen/QwQ-32B-Preview`, :code:`Qwen/Qwen2-7B-Instruct`, :code:`Qwen/Qwen2-7B`, etc.
+  - `Qwen/QwQ-32B-Preview`, `Qwen/Qwen2-7B-Instruct`, `Qwen/Qwen2-7B`, etc.
   - ✅︎
   - ✅︎
-* - :code:`Qwen2MoeForCausalLM`
+* - `Qwen2MoeForCausalLM`
   - Qwen2MoE
-  - :code:`Qwen/Qwen1.5-MoE-A2.7B`, :code:`Qwen/Qwen1.5-MoE-A2.7B-Chat`, etc.
+  - `Qwen/Qwen1.5-MoE-A2.7B`, `Qwen/Qwen1.5-MoE-A2.7B-Chat`, etc.
   -
   - ✅︎
-* - :code:`StableLmForCausalLM`
+* - `StableLmForCausalLM`
   - StableLM
-  - :code:`stabilityai/stablelm-3b-4e1t`, :code:`stabilityai/stablelm-base-alpha-7b-v2`, etc.
+  - `stabilityai/stablelm-3b-4e1t`, `stabilityai/stablelm-base-alpha-7b-v2`, etc.
   -
   - ✅︎
-* - :code:`Starcoder2ForCausalLM`
+* - `Starcoder2ForCausalLM`
   - Starcoder2
-  - :code:`bigcode/starcoder2-3b`, :code:`bigcode/starcoder2-7b`, :code:`bigcode/starcoder2-15b`, etc.
+  - `bigcode/starcoder2-3b`, `bigcode/starcoder2-7b`, `bigcode/starcoder2-15b`, etc.
   -
   - ✅︎
-* - :code:`SolarForCausalLM`
+* - `SolarForCausalLM`
   - Solar Pro
-  - :code:`upstage/solar-pro-preview-instruct`, etc.
+  - `upstage/solar-pro-preview-instruct`, etc.
   - ✅︎
   - ✅︎
-* - :code:`TeleChat2ForCausalLM`
+* - `TeleChat2ForCausalLM`
   - TeleChat2
-  - :code:`TeleAI/TeleChat2-3B`, :code:`TeleAI/TeleChat2-7B`, :code:`TeleAI/TeleChat2-35B`, etc.
+  - `TeleAI/TeleChat2-3B`, `TeleAI/TeleChat2-7B`, `TeleAI/TeleChat2-35B`, etc.
   - ✅︎
   - ✅︎
-* - :code:`XverseForCausalLM`
+* - `XverseForCausalLM`
   - XVERSE
-  - :code:`xverse/XVERSE-7B-Chat`, :code:`xverse/XVERSE-13B-Chat`, :code:`xverse/XVERSE-65B-Chat`, etc.
+  - `xverse/XVERSE-7B-Chat`, `xverse/XVERSE-13B-Chat`, `xverse/XVERSE-65B-Chat`, etc.
   - ✅︎
   - ✅︎
 ```
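The LoRA/PP columns in the table above are the kind of data a reader may want to query programmatically; a sketch with a few rows transcribed from the table (this dict is illustrative only, not vLLM's model registry):

```python
# A few rows transcribed from the text-generation table above:
# architecture -> (LoRA support, pipeline-parallel support).
TEXT_GEN_SUPPORT = {
    "AquilaForCausalLM": (True, True),
    "ArcticForCausalLM": (False, True),
    "BartForConditionalGeneration": (False, False),
    "LlamaForCausalLM": (True, True),
    "MambaForCausalLM": (False, True),
}

def supports(arch, feature):
    # feature is "lora" or "pp", matching the two checkmark columns.
    lora, pp = TEXT_GEN_SUPPORT[arch]
    return {"lora": lora, "pp": pp}[feature]

print(supports("LlamaForCausalLM", "lora"))  # True
```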
@@ -374,49 +373,48 @@ you should explicitly specify the task type to ensure that the model is used in
 
 #### Text Embedding (`--task embed`)
 
-```{eval-rst}
-.. list-table::
+```{list-table}
 :widths: 25 25 50 5 5
 :header-rows: 1
 
 * - Architecture
   - Models
   - Example HF Models
-  - :ref:`LoRA <lora-adapter>`
-  - :ref:`PP <distributed-serving>`
+  - [LoRA](#lora-adapter)
+  - [PP](#distributed-serving)
-* - :code:`BertModel`
+* - `BertModel`
   - BERT-based
-  - :code:`BAAI/bge-base-en-v1.5`, etc.
+  - `BAAI/bge-base-en-v1.5`, etc.
   -
   -
-* - :code:`Gemma2Model`
+* - `Gemma2Model`
   - Gemma2-based
-  - :code:`BAAI/bge-multilingual-gemma2`, etc.
+  - `BAAI/bge-multilingual-gemma2`, etc.
   -
   - ✅︎
-* - :code:`GritLM`
+* - `GritLM`
   - GritLM
-  - :code:`parasail-ai/GritLM-7B-vllm`.
+  - `parasail-ai/GritLM-7B-vllm`.
   - ✅︎
   - ✅︎
-* - :code:`LlamaModel`, :code:`LlamaForCausalLM`, :code:`MistralModel`, etc.
+* - `LlamaModel`, `LlamaForCausalLM`, `MistralModel`, etc.
   - Llama-based
-  - :code:`intfloat/e5-mistral-7b-instruct`, etc.
+  - `intfloat/e5-mistral-7b-instruct`, etc.
   - ✅︎
   - ✅︎
-* - :code:`Qwen2Model`, :code:`Qwen2ForCausalLM`
+* - `Qwen2Model`, `Qwen2ForCausalLM`
   - Qwen2-based
-  - :code:`ssmits/Qwen2-7B-Instruct-embed-base` (see note), :code:`Alibaba-NLP/gte-Qwen2-7B-instruct` (see note), etc.
+  - `ssmits/Qwen2-7B-Instruct-embed-base` (see note), `Alibaba-NLP/gte-Qwen2-7B-instruct` (see note), etc.
   - ✅︎
   - ✅︎
-* - :code:`RobertaModel`, :code:`RobertaForMaskedLM`
+* - `RobertaModel`, `RobertaForMaskedLM`
   - RoBERTa-based
-  - :code:`sentence-transformers/all-roberta-large-v1`, :code:`sentence-transformers/all-roberta-large-v1`, etc.
+  - `sentence-transformers/all-roberta-large-v1`, `sentence-transformers/all-roberta-large-v1`, etc.
   -
   -
-* - :code:`XLMRobertaModel`
+* - `XLMRobertaModel`
   - XLM-RoBERTa-based
-  - :code:`intfloat/multilingual-e5-large`, etc.
+  - `intfloat/multilingual-e5-large`, etc.
   -
   -
 ```
@@ -440,29 +438,28 @@ of the whole prompt are extracted from the normalized hidden state corresponding
 
 #### Reward Modeling (`--task reward`)
 
-```{eval-rst}
-.. list-table::
+```{list-table}
 :widths: 25 25 50 5 5
 :header-rows: 1
 
 * - Architecture
   - Models
   - Example HF Models
-  - :ref:`LoRA <lora-adapter>`
-  - :ref:`PP <distributed-serving>`
+  - [LoRA](#lora-adapter)
+  - [PP](#distributed-serving)
-* - :code:`InternLM2ForRewardModel`
+* - `InternLM2ForRewardModel`
   - InternLM2-based
-  - :code:`internlm/internlm2-1_8b-reward`, :code:`internlm/internlm2-7b-reward`, etc.
+  - `internlm/internlm2-1_8b-reward`, `internlm/internlm2-7b-reward`, etc.
   - ✅︎
   - ✅︎
-* - :code:`LlamaForCausalLM`
+* - `LlamaForCausalLM`
   - Llama-based
-  - :code:`peiyi9979/math-shepherd-mistral-7b-prm`, etc.
+  - `peiyi9979/math-shepherd-mistral-7b-prm`, etc.
   - ✅︎
   - ✅︎
-* - :code:`Qwen2ForRewardModel`
+* - `Qwen2ForRewardModel`
   - Qwen2-based
-  - :code:`Qwen/Qwen2.5-Math-RM-72B`, etc.
+  - `Qwen/Qwen2.5-Math-RM-72B`, etc.
   - ✅︎
   - ✅︎
 ```
@@ -477,24 +474,23 @@ e.g.: {code}`--override-pooler-config '{"pooling_type": "STEP", "step_tag_id": 1
 
 #### Classification (`--task classify`)
 
-```{eval-rst}
-.. list-table::
+```{list-table}
 :widths: 25 25 50 5 5
 :header-rows: 1
 
 * - Architecture
   - Models
   - Example HF Models
-  - :ref:`LoRA <lora-adapter>`
-  - :ref:`PP <distributed-serving>`
+  - [LoRA](#lora-adapter)
+  - [PP](#distributed-serving)
-* - :code:`JambaForSequenceClassification`
+* - `JambaForSequenceClassification`
   - Jamba
-  - :code:`ai21labs/Jamba-tiny-reward-dev`, etc.
+  - `ai21labs/Jamba-tiny-reward-dev`, etc.
   - ✅︎
   - ✅︎
-* - :code:`Qwen2ForSequenceClassification`
+* - `Qwen2ForSequenceClassification`
   - Qwen2-based
-  - :code:`jason9693/Qwen2.5-1.5B-apeach`, etc.
+  - `jason9693/Qwen2.5-1.5B-apeach`, etc.
   - ✅︎
   - ✅︎
 ```
@@ -504,29 +500,28 @@ If your model is not in the above list, we will try to automatically convert the
 
 #### Sentence Pair Scoring (`--task score`)
 
-```{eval-rst}
-.. list-table::
+```{list-table}
 :widths: 25 25 50 5 5
 :header-rows: 1
 
 * - Architecture
   - Models
   - Example HF Models
-  - :ref:`LoRA <lora-adapter>`
-  - :ref:`PP <distributed-serving>`
+  - [LoRA](#lora-adapter)
+  - [PP](#distributed-serving)
-* - :code:`BertForSequenceClassification`
+* - `BertForSequenceClassification`
   - BERT-based
-  - :code:`cross-encoder/ms-marco-MiniLM-L-6-v2`, etc.
+  - `cross-encoder/ms-marco-MiniLM-L-6-v2`, etc.
   -
   -
-* - :code:`RobertaForSequenceClassification`
+* - `RobertaForSequenceClassification`
   - RoBERTa-based
-  - :code:`cross-encoder/quora-roberta-base`, etc.
+  - `cross-encoder/quora-roberta-base`, etc.
   -
   -
-* - :code:`XLMRobertaForSequenceClassification`
+* - `XLMRobertaForSequenceClassification`
   - XLM-RoBERTa-based
-  - :code:`BAAI/bge-reranker-v2-m3`, etc.
+  - `BAAI/bge-reranker-v2-m3`, etc.
   -
   -
 ```
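Taken together, the four pooling-task tables above give one ready example checkpoint per `--task` value; a small lookup transcribed from those tables (illustrative, e.g. for smoke-testing task flags):

```python
# One example checkpoint per pooling task, transcribed from the
# embed/reward/classify/score tables above. Illustrative only.
EXAMPLE_POOLING_MODELS = {
    "embed": "intfloat/e5-mistral-7b-instruct",
    "reward": "Qwen/Qwen2.5-Math-RM-72B",
    "classify": "jason9693/Qwen2.5-1.5B-apeach",
    "score": "BAAI/bge-reranker-v2-m3",
}

def example_for(task):
    return EXAMPLE_POOLING_MODELS[task]

print(example_for("score"))  # BAAI/bge-reranker-v2-m3
```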
@@ -558,8 +553,7 @@ See [this page](#generative-models) for more information on how to use generativ
 
 #### Text Generation (`--task generate`)
 
-```{eval-rst}
-.. list-table::
+```{list-table}
 :widths: 25 25 15 20 5 5 5
 :header-rows: 1
 
@@ -567,177 +561,174 @@ See [this page](#generative-models) for more information on how to use generativ
   - Models
   - Inputs
   - Example HF Models
-  - :ref:`LoRA <lora-adapter>`
-  - :ref:`PP <distributed-serving>`
-  - V1
+  - [LoRA](#lora-adapter)
+  - [PP](#distributed-serving)
+  - [V1](gh-issue:8779)
-* - :code:`AriaForConditionalGeneration`
+* - `AriaForConditionalGeneration`
   - Aria
   - T + I
-  - :code:`rhymes-ai/Aria`
+  - `rhymes-ai/Aria`
   -
   - ✅︎
   -
-* - :code:`Blip2ForConditionalGeneration`
+* - `Blip2ForConditionalGeneration`
   - BLIP-2
-  - T + I\ :sup:`E`
+  - T + I<sup>E</sup>
-  - :code:`Salesforce/blip2-opt-2.7b`, :code:`Salesforce/blip2-opt-6.7b`, etc.
+  - `Salesforce/blip2-opt-2.7b`, `Salesforce/blip2-opt-6.7b`, etc.
   -
   - ✅︎
   -
-* - :code:`ChameleonForConditionalGeneration`
+* - `ChameleonForConditionalGeneration`
   - Chameleon
   - T + I
-  - :code:`facebook/chameleon-7b` etc.
+  - `facebook/chameleon-7b` etc.
   -
   - ✅︎
   -
-* - :code:`FuyuForCausalLM`
+* - `FuyuForCausalLM`
   - Fuyu
   - T + I
-  - :code:`adept/fuyu-8b` etc.
+  - `adept/fuyu-8b` etc.
   -
   - ✅︎
   -
-* - :code:`ChatGLMModel`
+* - `ChatGLMModel`
   - GLM-4V
   - T + I
-  - :code:`THUDM/glm-4v-9b` etc.
+  - `THUDM/glm-4v-9b` etc.
   - ✅︎
   - ✅︎
   -
-* - :code:`H2OVLChatModel`
+* - `H2OVLChatModel`
   - H2OVL
-  - T + I\ :sup:`E+`
+  - T + I<sup>E+</sup>
-  - :code:`h2oai/h2ovl-mississippi-800m`, :code:`h2oai/h2ovl-mississippi-2b`, etc.
+  - `h2oai/h2ovl-mississippi-800m`, `h2oai/h2ovl-mississippi-2b`, etc.
   -
   - ✅︎
   -
-* - :code:`Idefics3ForConditionalGeneration`
+* - `Idefics3ForConditionalGeneration`
   - Idefics3
   - T + I
-  - :code:`HuggingFaceM4/Idefics3-8B-Llama3` etc.
+  - `HuggingFaceM4/Idefics3-8B-Llama3` etc.
   - ✅︎
   -
   -
-* - :code:`InternVLChatModel`
+* - `InternVLChatModel`
   - InternVL 2.5, Mono-InternVL, InternVL 2.0
-  - T + I\ :sup:`E+`
+  - T + I<sup>E+</sup>
-  - :code:`OpenGVLab/InternVL2_5-4B`, :code:`OpenGVLab/Mono-InternVL-2B`, :code:`OpenGVLab/InternVL2-4B`, etc.
+  - `OpenGVLab/InternVL2_5-4B`, `OpenGVLab/Mono-InternVL-2B`, `OpenGVLab/InternVL2-4B`, etc.
   -
   - ✅︎
   - ✅︎
-* - :code:`LlavaForConditionalGeneration`
+* - `LlavaForConditionalGeneration`
   - LLaVA-1.5
-  - T + I\ :sup:`E+`
+  - T + I<sup>E+</sup>
-  - :code:`llava-hf/llava-1.5-7b-hf`, :code:`TIGER-Lab/Mantis-8B-siglip-llama3` (see note), etc.
+  - `llava-hf/llava-1.5-7b-hf`, `TIGER-Lab/Mantis-8B-siglip-llama3` (see note), etc.
   -
   - ✅︎
   - ✅︎
-* - :code:`LlavaNextForConditionalGeneration`
+* - `LlavaNextForConditionalGeneration`
   - LLaVA-NeXT
-  - T + I\ :sup:`E+`
+  - T + I<sup>E+</sup>
-  - :code:`llava-hf/llava-v1.6-mistral-7b-hf`, :code:`llava-hf/llava-v1.6-vicuna-7b-hf`, etc.
+  - `llava-hf/llava-v1.6-mistral-7b-hf`, `llava-hf/llava-v1.6-vicuna-7b-hf`, etc.
   -
   - ✅︎
   -
-* - :code:`LlavaNextVideoForConditionalGeneration`
+* - `LlavaNextVideoForConditionalGeneration`
   - LLaVA-NeXT-Video
   - T + V
-  - :code:`llava-hf/LLaVA-NeXT-Video-7B-hf`, etc.
+  - `llava-hf/LLaVA-NeXT-Video-7B-hf`, etc.
   -
   - ✅︎
   -
-* - :code:`LlavaOnevisionForConditionalGeneration`
+* - `LlavaOnevisionForConditionalGeneration`
   - LLaVA-Onevision
-  - T + I\ :sup:`+` + V\ :sup:`+`
+  - T + I<sup>+</sup> + V<sup>+</sup>
-  - :code:`llava-hf/llava-onevision-qwen2-7b-ov-hf`, :code:`llava-hf/llava-onevision-qwen2-0.5b-ov-hf`, etc.
+  - `llava-hf/llava-onevision-qwen2-7b-ov-hf`, `llava-hf/llava-onevision-qwen2-0.5b-ov-hf`, etc.
   -
   - ✅︎
   -
-* - :code:`MiniCPMV`
+* - `MiniCPMV`
   - MiniCPM-V
-  - T + I\ :sup:`E+`
+  - T + I<sup>E+</sup>
-  - :code:`openbmb/MiniCPM-V-2` (see note), :code:`openbmb/MiniCPM-Llama3-V-2_5`, :code:`openbmb/MiniCPM-V-2_6`, etc.
+  - `openbmb/MiniCPM-V-2` (see note), `openbmb/MiniCPM-Llama3-V-2_5`, `openbmb/MiniCPM-V-2_6`, etc.
   - ✅︎
   - ✅︎
   -
-* - :code:`MllamaForConditionalGeneration`
+* - `MllamaForConditionalGeneration`
   - Llama 3.2
-  - T + I\ :sup:`+`
+  - T + I<sup>+</sup>
-  - :code:`meta-llama/Llama-3.2-90B-Vision-Instruct`, :code:`meta-llama/Llama-3.2-11B-Vision`, etc.
+  - `meta-llama/Llama-3.2-90B-Vision-Instruct`, `meta-llama/Llama-3.2-11B-Vision`, etc.
|
||||||
-
|
-
|
||||||
-
|
-
|
||||||
-
|
-
|
||||||
* - :code:`MolmoForCausalLM`
|
* - `MolmoForCausalLM`
|
||||||
- Molmo
|
- Molmo
|
||||||
- T + I
|
- T + I
|
||||||
- :code:`allenai/Molmo-7B-D-0924`, :code:`allenai/Molmo-72B-0924`, etc.
|
- `allenai/Molmo-7B-D-0924`, `allenai/Molmo-72B-0924`, etc.
|
||||||
-
|
-
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
* - :code:`NVLM_D_Model`
|
* - `NVLM_D_Model`
|
||||||
- NVLM-D 1.0
|
- NVLM-D 1.0
|
||||||
- T + I\ :sup:`E+`
|
- T + I<sup>E+</sup>
|
||||||
- :code:`nvidia/NVLM-D-72B`, etc.
|
- `nvidia/NVLM-D-72B`, etc.
|
||||||
-
|
-
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
* - :code:`PaliGemmaForConditionalGeneration`
|
* - `PaliGemmaForConditionalGeneration`
|
||||||
- PaliGemma, PaliGemma 2
|
- PaliGemma, PaliGemma 2
|
||||||
- T + I\ :sup:`E`
|
- T + I<sup>E</sup>
|
||||||
- :code:`google/paligemma-3b-pt-224`, :code:`google/paligemma-3b-mix-224`, :code:`google/paligemma2-3b-ft-docci-448`, etc.
|
- `google/paligemma-3b-pt-224`, `google/paligemma-3b-mix-224`, `google/paligemma2-3b-ft-docci-448`, etc.
|
||||||
-
|
-
|
||||||
- ✅︎
|
- ✅︎
|
||||||
-
|
-
|
||||||
* - :code:`Phi3VForCausalLM`
|
* - `Phi3VForCausalLM`
|
||||||
- Phi-3-Vision, Phi-3.5-Vision
|
- Phi-3-Vision, Phi-3.5-Vision
|
||||||
- T + I\ :sup:`E+`
|
- T + I<sup>E+</sup>
|
||||||
- :code:`microsoft/Phi-3-vision-128k-instruct`, :code:`microsoft/Phi-3.5-vision-instruct` etc.
|
- `microsoft/Phi-3-vision-128k-instruct`, `microsoft/Phi-3.5-vision-instruct` etc.
|
||||||
-
|
-
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
* - :code:`PixtralForConditionalGeneration`
|
* - `PixtralForConditionalGeneration`
|
||||||
- Pixtral
|
- Pixtral
|
||||||
- T + I\ :sup:`+`
|
- T + I<sup>+</sup>
|
||||||
- :code:`mistralai/Pixtral-12B-2409`, :code:`mistral-community/pixtral-12b` etc.
|
- `mistralai/Pixtral-12B-2409`, `mistral-community/pixtral-12b` etc.
|
||||||
-
|
-
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
* - :code:`QWenLMHeadModel`
|
* - `QWenLMHeadModel`
|
||||||
- Qwen-VL
|
- Qwen-VL
|
||||||
- T + I\ :sup:`E+`
|
- T + I<sup>E+</sup>
|
||||||
- :code:`Qwen/Qwen-VL`, :code:`Qwen/Qwen-VL-Chat`, etc.
|
- `Qwen/Qwen-VL`, `Qwen/Qwen-VL-Chat`, etc.
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
-
|
-
|
||||||
* - :code:`Qwen2AudioForConditionalGeneration`
|
* - `Qwen2AudioForConditionalGeneration`
|
||||||
- Qwen2-Audio
|
- Qwen2-Audio
|
||||||
- T + A\ :sup:`+`
|
- T + A<sup>+</sup>
|
||||||
- :code:`Qwen/Qwen2-Audio-7B-Instruct`
|
- `Qwen/Qwen2-Audio-7B-Instruct`
|
||||||
-
|
-
|
||||||
- ✅︎
|
- ✅︎
|
||||||
-
|
-
|
||||||
* - :code:`Qwen2VLForConditionalGeneration`
|
* - `Qwen2VLForConditionalGeneration`
|
||||||
- Qwen2-VL
|
- Qwen2-VL
|
||||||
- T + I\ :sup:`E+` + V\ :sup:`E+`
|
- T + I<sup>E+</sup> + V<sup>E+</sup>
|
||||||
- :code:`Qwen/QVQ-72B-Preview`, :code:`Qwen/Qwen2-VL-7B-Instruct`, :code:`Qwen/Qwen2-VL-72B-Instruct`, etc.
|
- `Qwen/QVQ-72B-Preview`, `Qwen/Qwen2-VL-7B-Instruct`, `Qwen/Qwen2-VL-72B-Instruct`, etc.
|
||||||
- ✅︎
|
- ✅︎
|
||||||
- ✅︎
|
- ✅︎
|
||||||
-
|
-
|
||||||
* - :code:`UltravoxModel`
|
* - `UltravoxModel`
|
||||||
- Ultravox
|
- Ultravox
|
||||||
- T + A\ :sup:`E+`
|
- T + A<sup>E+</sup>
|
||||||
- :code:`fixie-ai/ultravox-v0_3`
|
- `fixie-ai/ultravox-v0_3`
|
||||||
-
|
-
|
||||||
- ✅︎
|
- ✅︎
|
||||||
-
|
-
|
||||||
```
|
```
|
||||||
|
|
||||||
<sup>E</sup> Pre-computed embeddings can be passed in for this modality.

<sup>+</sup> Multiple items can be passed per text prompt for this modality.

````{important}
To enable multiple multi-modal items per text prompt, you have to set {code}`limit_mm_per_prompt` (offline inference) or `--limit-mm-per-prompt` (online inference).
````
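As a minimal sketch of the `limit_mm_per_prompt` setting (the checkpoint name is just one example from the table above; the engine construction is left commented out because it needs a GPU and a model download):

```python
# Sketch: allow up to 4 images per text prompt in offline inference.
# `limit_mm_per_prompt` maps each modality name to its per-prompt cap.

def make_engine_kwargs(model: str, max_images: int) -> dict:
    """Keyword arguments for vllm.LLM enabling multi-image prompts."""
    return {
        "model": model,
        "limit_mm_per_prompt": {"image": max_images},
    }

kwargs = make_engine_kwargs("llava-hf/llava-1.5-7b-hf", 4)

# from vllm import LLM
# llm = LLM(**kwargs)  # each prompt may now reference up to 4 images

print(kwargs["limit_mm_per_prompt"])  # → {'image': 4}
```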

To get the best results, you should use pooling models that are specifically trained as such.

The following table lists those that are tested in vLLM.

```{list-table}
:widths: 25 25 15 25 5 5
:header-rows: 1

* - Architecture
  - Models
  - Inputs
  - Example HF Models
  - [LoRA](#lora-adapter)
  - [PP](#distributed-serving)
* - `LlavaNextForConditionalGeneration`
  - LLaVA-NeXT-based
  - T / I
  - `royokong/e5-v`
  -
  - ✅︎
* - `Phi3VForCausalLM`
  - Phi-3-Vision-based
  - T + I
  - `TIGER-Lab/VLM2Vec-Full`
  - 🚧
  - ✅︎
* - `Qwen2VLForConditionalGeneration`
  - Qwen2-VL-based
  - T + I
  - `MrLight/dse-qwen2-2b-mrl-v1`
  -
  - ✅︎
```

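Downstream, the embedding vectors these pooling models return are typically compared with cosine similarity. A dependency-free sketch of that comparison (the commented vLLM calls are assumptions about the API and require a GPU to run):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# With vLLM, the vectors would come from a pooling model in the table, e.g.:
# from vllm import LLM
# llm = LLM(model="royokong/e5-v", task="embedding")
# vec = llm.encode(["a photo of a cat"])[0].outputs.embedding

print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 3))  # → 1.0
```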
_________________

# Model Support Policy

The table below shows the compatibility of various quantization implementations with different hardware platforms in vLLM:

|
## Values
