vllm/docs/source/serving/compatibility_matrix.rst
Joe Runde 031a7995f3
[Bugfix][Frontend] Reject guided decoding in multistep mode (#9892)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-11-01 01:09:46 +00:00

428 lines
6.5 KiB
ReStructuredText

.. _compatibility_matrix:
Compatibility Matrix
====================
The tables below show mutually exclusive features and the support on some hardware.
.. note::
Check the '✗' with links to see tracking issue for unsupported feature/hardware combination.
Feature x Feature
-----------------
.. raw:: html
<style>
/* Make smaller to try to improve readability */
td {
font-size: 0.8rem;
text-align: center;
}
th {
text-align: center;
font-size: 0.8rem;
}
</style>
.. list-table::
:header-rows: 1
:widths: auto
* - Feature
- :ref:`CP <chunked-prefill>`
- :ref:`APC <apc>`
- :ref:`LoRA <lora>`
- :abbr:`prmpt adptr (Prompt Adapter)`
- :ref:`SD <spec_decode>`
- CUDA graph
- :abbr:`enc-dec (Encoder-Decoder Models)`
- :abbr:`logP (Logprobs)`
- :abbr:`prmpt logP (Prompt Logprobs)`
- :abbr:`async output (Async Output Processing)`
- multi-step
- :abbr:`MM (Multimodal)`
- best-of
- beam-search
- :abbr:`guided dec (Guided Decoding)`
* - :ref:`CP <chunked-prefill>`
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
* - :ref:`APC <apc>`
- ✅
-
-
-
-
-
-
-
-
-
-
-
-
-
-
* - :ref:`LoRA <lora>`
- `✗ <https://github.com/vllm-project/vllm/pull/9057>`__
- ✅
-
-
-
-
-
-
-
-
-
-
-
-
-
* - :abbr:`prmpt adptr (Prompt Adapter)`
- ✅
- ✅
- ✅
-
-
-
-
-
-
-
-
-
-
-
-
* - :ref:`SD <spec_decode>`
- ✗
- ✅
- ✗
- ✅
-
-
-
-
-
-
-
-
-
-
-
* - CUDA graph
- ✅
- ✅
- ✅
- ✅
- ✅
-
-
-
-
-
-
-
-
-
-
* - :abbr:`enc-dec (Encoder-Decoder Models)`
- ✗
- `✗ <https://github.com/vllm-project/vllm/issues/7366>`__
- ✗
- ✗
- `✗ <https://github.com/vllm-project/vllm/issues/7366>`__
- ✅
-
-
-
-
-
-
-
-
-
* - :abbr:`logP (Logprobs)`
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
-
-
-
-
-
-
-
-
* - :abbr:`prmpt logP (Prompt Logprobs)`
- ✅
- ✅
- ✅
- ✅
- `✗ <https://github.com/vllm-project/vllm/pull/8199>`__
- ✅
- ✅
- ✅
-
-
-
-
-
-
-
* - :abbr:`async output (Async Output Processing)`
- ✅
- ✅
- ✅
- ✅
- ✗
- ✅
- ✗
- ✅
- ✅
-
-
-
-
-
-
* - multi-step
- ✗
- ✅
- ✗
- ✅
- ✗
- ✅
- ✗
- ✅
- `✗ <https://github.com/vllm-project/vllm/issues/8198>`__
- ✅
-
-
-
-
-
* - :abbr:`MM (Multimodal)`
- `✗ <https://github.com/vllm-project/vllm/pull/8346>`__
- `✗ <https://github.com/vllm-project/vllm/pull/8348>`__
- `✗ <https://github.com/vllm-project/vllm/pull/7199>`__
- ?
- ?
- ✅
- ✗
- ✅
- ✅
- ✅
- ?
-
-
-
-
* - best-of
- ✅
- ✅
- ✅
- ✅
- `✗ <https://github.com/vllm-project/vllm/issues/6137>`__
- ✅
- ✅
- ✅
- ✅
- ?
- `✗ <https://github.com/vllm-project/vllm/issues/7968>`__
- ✅
-
-
-
* - beam-search
- ✅
- ✅
- ✅
- ✅
- `✗ <https://github.com/vllm-project/vllm/issues/6137>`__
- ✅
- ✅
- ✅
- ✅
- ?
- `✗ <https://github.com/vllm-project/vllm/issues/7968>`__
- ?
- ✅
-
-
* - :abbr:`guided dec (Guided Decoding)`
- ✅
- ✅
- ?
- ?
- ✅
- ✅
- ?
- ✅
- ✅
- ✅
- `✗ <https://github.com/vllm-project/vllm/issues/9893>`__
- ?
- ✅
- ✅
-
Feature x Hardware
^^^^^^^^^^^^^^^^^^
.. list-table::
:header-rows: 1
:widths: auto
* - Feature
- Volta
- Turing
- Ampere
- Ada
- Hopper
- CPU
- AMD
* - :ref:`CP <chunked-prefill>`
- `✗ <https://github.com/vllm-project/vllm/issues/2729>`__
- ✅
- ✅
- ✅
- ✅
- ✗
- ✅
* - :ref:`APC <apc>`
- `✗ <https://github.com/vllm-project/vllm/issues/3687>`__
- ✅
- ✅
- ✅
- ✅
- ✗
- ✅
* - :ref:`LoRA <lora>`
- ✅
- ✅
- ✅
- ✅
- ✅
- `✗ <https://github.com/vllm-project/vllm/pull/4830>`__
- ✅
* - :abbr:`prmpt adptr (Prompt Adapter)`
- ✅
- ✅
- ✅
- ✅
- ✅
- `✗ <https://github.com/vllm-project/vllm/issues/8475>`__
- ✅
* - :ref:`SD <spec_decode>`
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
* - CUDA graph
- ✅
- ✅
- ✅
- ✅
- ✅
- ✗
- ✅
* - :abbr:`enc-dec (Encoder-Decoder Models)`
- ✅
- ✅
- ✅
- ✅
- ✅
- `✗ <https://github.com/vllm-project/vllm/blob/a84e598e2125960d3b4f716b78863f24ac562947/vllm/worker/cpu_model_runner.py#L125>`__
- ✗
* - :abbr:`logP (Logprobs)`
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
* - :abbr:`prmpt logP (Prompt Logprobs)`
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
* - :abbr:`async output (Async Output Processing)`
- ✅
- ✅
- ✅
- ✅
- ✅
- ✗
- ✗
* - multi-step
- ✅
- ✅
- ✅
- ✅
- ✅
- `✗ <https://github.com/vllm-project/vllm/issues/8477>`__
- ✅
* - :abbr:`MM (Multimodal)`
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
* - best-of
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
* - beam-search
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
* - :abbr:`guided dec (Guided Decoding)`
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅
- ✅