.. _supported_models:

Supported Models
================

vLLM supports a variety of generative Transformer models in `HuggingFace Transformers <https://huggingface.co/models>`_.
The following is the list of model architectures that are currently supported by vLLM.
Alongside each architecture, we include some popular models that use it.

.. list-table::
  :widths: 25 25 50
  :header-rows: 1

  * - Architecture
    - Models
    - Example HuggingFace Models
  * - :code:`AquilaForCausalLM`
    - Aquila
    - :code:`BAAI/Aquila-7B`, :code:`BAAI/AquilaChat-7B`, etc.
  * - :code:`BaiChuanForCausalLM`
    - Baichuan
    - :code:`baichuan-inc/Baichuan-7B`, :code:`baichuan-inc/Baichuan-13B-Chat`, etc.
  * - :code:`BloomForCausalLM`
    - BLOOM, BLOOMZ, BLOOMChat
    - :code:`bigscience/bloom`, :code:`bigscience/bloomz`, etc.
  * - :code:`FalconForCausalLM`
    - Falcon
    - :code:`tiiuae/falcon-7b`, :code:`tiiuae/falcon-40b`, :code:`tiiuae/falcon-rw-7b`, etc.
  * - :code:`GPT2LMHeadModel`
    - GPT-2
    - :code:`gpt2`, :code:`gpt2-xl`, etc.
  * - :code:`GPTBigCodeForCausalLM`
    - StarCoder, SantaCoder, WizardCoder
    - :code:`bigcode/starcoder`, :code:`bigcode/gpt_bigcode-santacoder`, :code:`WizardLM/WizardCoder-15B-V1.0`, etc.
  * - :code:`GPTJForCausalLM`
    - GPT-J
    - :code:`EleutherAI/gpt-j-6b`, :code:`nomic-ai/gpt4all-j`, etc.
  * - :code:`GPTNeoXForCausalLM`
    - GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
    - :code:`EleutherAI/gpt-neox-20b`, :code:`EleutherAI/pythia-12b`, :code:`OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, :code:`databricks/dolly-v2-12b`, :code:`stabilityai/stablelm-tuned-alpha-7b`, etc.
  * - :code:`InternLMForCausalLM`
    - InternLM
    - :code:`internlm/internlm-7b`, :code:`internlm/internlm-chat-7b`, etc.
  * - :code:`LlamaForCausalLM`
    - LLaMA, LLaMA-2, Vicuna, Alpaca, Koala, Guanaco
    - :code:`meta-llama/Llama-2-13b-hf`, :code:`meta-llama/Llama-2-70b-hf`, :code:`openlm-research/open_llama_13b`, :code:`lmsys/vicuna-13b-v1.3`, :code:`young-geng/koala`, etc.
  * - :code:`MPTForCausalLM`
    - MPT, MPT-Instruct, MPT-Chat, MPT-StoryWriter
    - :code:`mosaicml/mpt-7b`, :code:`mosaicml/mpt-7b-storywriter`, :code:`mosaicml/mpt-30b`, etc.
  * - :code:`OPTForCausalLM`
    - OPT, OPT-IML
    - :code:`facebook/opt-66b`, :code:`facebook/opt-iml-max-30b`, etc.
  * - :code:`QWenLMHeadModel`
    - Qwen
    - :code:`Qwen/Qwen-7B`, :code:`Qwen/Qwen-7B-Chat`, etc.

If your model uses one of the above model architectures, you can seamlessly run your model with vLLM (see the example below).
Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for instructions on how to implement support for your model.
Alternatively, you can raise an issue on our `GitHub <https://github.com/vllm-project/vllm/issues>`_ project.
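
For example, the following is a minimal sketch of offline batched inference with a supported model. The model name and sampling settings are illustrative only; any model from the table above can be substituted:

.. code-block:: python

    from vllm import LLM, SamplingParams

    # The model name is only an example; any architecture listed above can
    # be used here, given as a HuggingFace name or a local path.
    llm = LLM(model="facebook/opt-125m")

    # SamplingParams controls decoding, e.g. temperature and nucleus sampling.
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    # Generate completions for a batch of prompts.
    outputs = llm.generate(["Hello, my name is", "The future of AI is"], sampling_params)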

.. tip::
    The easiest way to check if your model is supported is to run the program below:

    .. code-block:: python

        from vllm import LLM

        llm = LLM(model=...)  # Name or path of your model
        output = llm.generate("Hello, my name is")
        print(output)

    If vLLM successfully generates text, it indicates that your model is supported.
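
Note that :code:`generate` returns a list of :code:`RequestOutput` objects rather than plain strings. A minimal sketch for printing only the generated text from the :code:`output` above:

.. code-block:: python

    for request_output in output:
        # Each RequestOutput carries the original prompt and one or more
        # generated completions; print the text of the first completion.
        print(request_output.outputs[0].text)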