.. _supported_models:

Supported Models
================

vLLM supports a variety of generative Transformer models in `HuggingFace Transformers <https://huggingface.co/models>`_.
The table below lists the model architectures that vLLM currently supports.
Alongside each architecture, we include some popular models that use it.

.. list-table::
  :widths: 25 25 50
  :header-rows: 1

  * - Architecture
    - Models
    - Example HuggingFace Models
  * - :code:`GPT2LMHeadModel`
    - GPT-2
    - :code:`gpt2`, :code:`gpt2-xl`, etc.
  * - :code:`GPTNeoXForCausalLM`
    - GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
    - :code:`EleutherAI/gpt-neox-20b`, :code:`EleutherAI/pythia-12b`, :code:`OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, :code:`databricks/dolly-v2-12b`, :code:`stabilityai/stablelm-tuned-alpha-7b`, etc.
  * - :code:`LlamaForCausalLM`
    - LLaMA, Vicuna, Alpaca, Koala, Guanaco
    - :code:`openlm-research/open_llama_13b`, :code:`lmsys/vicuna-13b-v1.3`, :code:`young-geng/koala`, :code:`JosephusCheung/Guanaco`, etc.
  * - :code:`OPTForCausalLM`
    - OPT, OPT-IML
    - :code:`facebook/opt-66b`, :code:`facebook/opt-iml-max-30b`, etc.

If your model uses one of the architectures above, you can run it with vLLM as is.
Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for instructions on how to implement support for your model.
Alternatively, you can open an issue on our `GitHub <https://github.com/WoosukKwon/vllm/issues>`_ project.
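
As a quick preliminary check before loading any weights, you can compare the :code:`architectures` field of a model's :code:`config.json` against the table above.
The following is a minimal sketch, not vLLM's own check: :code:`SUPPORTED_ARCHITECTURES` and :code:`is_listed_architecture` are hypothetical names introduced here for illustration, and the set simply mirrors the table, so it has to be kept in sync by hand.

.. code-block:: python

    import json

    # Assumption: this set is copied by hand from the table above;
    # it is not read from vLLM itself.
    SUPPORTED_ARCHITECTURES = {
        "GPT2LMHeadModel",
        "GPTNeoXForCausalLM",
        "LlamaForCausalLM",
        "OPTForCausalLM",
    }


    def is_listed_architecture(config_text: str) -> bool:
        """Return True if any architecture named in a config.json string is in the table."""
        config = json.loads(config_text)
        # HuggingFace configs store the model class names under "architectures".
        architectures = config.get("architectures") or []
        return any(name in SUPPORTED_ARCHITECTURES for name in architectures)


    # Illustrative config contents, not taken from a real checkpoint.
    example_config = '{"architectures": ["OPTForCausalLM"], "model_type": "opt"}'
    print(is_listed_architecture(example_config))

A match here only says that the architecture name appears in the table. Running the model with vLLM remains the definitive check.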

.. tip::
    The easiest way to check if your model is supported is to run the program below:

    .. code-block:: python

        from vllm import LLM

        llm = LLM(model=...)  # Name or path of your model
        output = llm.generate("Hello, my name is")
        print(output)

    If vLLM successfully generates text, your model is supported.