(compatibility-matrix)=

# Compatibility Matrix

The tables below show which features are mutually exclusive and on which hardware each feature is supported.

The symbols used have the following meanings:

- ✅ = Full compatibility
- 🟠 = Partial compatibility
- ❌ = No compatibility

:::{note}
Follow the link on a ❌ or 🟠 entry to see the tracking issue for that unsupported feature or hardware combination.
:::

## Feature x Feature

:::{raw} html
<style>
  /* Make smaller to try to improve readability */
  td {
    font-size: 0.8rem;
    text-align: center;
  }

  th {
    text-align: center;
    font-size: 0.8rem;
  }
</style>
:::

:::{list-table}
:header-rows: 1
:stub-columns: 1
:widths: auto
:class: vertical-table-header

- * Feature
  * [CP](#chunked-prefill)
  * [APC](#automatic-prefix-caching)
  * [LoRA](#lora-adapter)
  * <abbr title="Prompt Adapter">prmpt adptr</abbr>
  * [SD](#spec_decode)
  * CUDA graph
  * <abbr title="Pooling Models">pooling</abbr>
  * <abbr title="Encoder-Decoder Models">enc-dec</abbr>
  * <abbr title="Logprobs">logP</abbr>
  * <abbr title="Prompt Logprobs">prmpt logP</abbr>
  * <abbr title="Async Output Processing">async output</abbr>
  * multi-step
  * <abbr title="Multimodal Inputs">mm</abbr>
  * best-of
  * beam-search
  * <abbr title="Guided Decoding">guided dec</abbr>
- * [CP](#chunked-prefill)
  * ✅
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * [APC](#automatic-prefix-caching)
  * ✅
  * ✅
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * [LoRA](#lora-adapter)
  * ✅
  * ✅
  * ✅
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Prompt Adapter">prmpt adptr</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * [SD](#spec_decode)
  * ✅
  * ✅
  * ❌
  * ✅
  * ✅
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * CUDA graph
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  *
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Pooling Models">pooling</abbr>
  * ❌
  * ❌
  * ❌
  * ❌
  * ❌
  * ❌
  * ✅
  *
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Encoder-Decoder Models">enc-dec</abbr>
  * ❌
  * [❌](gh-issue:7366)
  * ❌
  * ❌
  * [❌](gh-issue:7366)
  * ✅
  * ✅
  * ✅
  *
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Logprobs">logP</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ❌
  * ✅
  * ✅
  *
  *
  *
  *
  *
  *
  *
- * <abbr title="Prompt Logprobs">prmpt logP</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ❌
  * ✅
  * ✅
  * ✅
  *
  *
  *
  *
  *
  *
- * <abbr title="Async Output Processing">async output</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ❌
  * ✅
  * ❌
  * ❌
  * ✅
  * ✅
  * ✅
  *
  *
  *
  *
  *
- * multi-step
  * ❌
  * ✅
  * ❌
  * ✅
  * ❌
  * ✅
  * ❌
  * ❌
  * ✅
  * ✅
  * ✅
  * ✅
  *
  *
  *
  *
- * <abbr title="Multimodal Inputs">mm</abbr>
  * ✅
  * [🟠](gh-pr:8348)
  * [🟠](gh-pr:4194)
  * ❔
  * ❔
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ❔
  * ✅
  *
  *
  *
- * best-of
  * ✅
  * ✅
  * ✅
  * ✅
  * [❌](gh-issue:6137)
  * ✅
  * ❌
  * ✅
  * ✅
  * ✅
  * ❔
  * [❌](gh-issue:7968)
  * ✅
  * ✅
  *
  *
- * beam-search
  * ✅
  * ✅
  * ✅
  * ✅
  * [❌](gh-issue:6137)
  * ✅
  * ❌
  * ✅
  * ✅
  * ✅
  * ❔
  * [❌](gh-issue:7968)
  * ❔
  * ✅
  * ✅
  *
- * <abbr title="Guided Decoding">guided dec</abbr>
  * ✅
  * ✅
  * ❔
  * ❔
  * [❌](gh-issue:11484)
  * ✅
  * ❌
  * ❔
  * ✅
  * ✅
  * ✅
  * [❌](gh-issue:9893)
  * ❔
  * ✅
  * ✅
  * ✅
:::
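As a rough reading aid, the sketch below hard-codes a handful of the ❌ cells from the table above and reports which pairs of requested features conflict. The names `INCOMPATIBLE` and `find_conflicts` are hypothetical and introduced only for this example; they are not part of any library, and the set covers only a few cells, so the table itself remains the reference. Feature pairs are stored as unordered sets because the table is symmetric.

```python
from itertools import combinations

# Hand-copied subset of the ❌ cells in the table above (NOT exhaustive).
# Each pair is unordered, so it is stored as a frozenset.
INCOMPATIBLE = {
    frozenset({"SD", "LoRA"}),
    frozenset({"pooling", "CP"}),
    frozenset({"enc-dec", "APC"}),
    frozenset({"multi-step", "CP"}),
    frozenset({"multi-step", "SD"}),
}


def find_conflicts(features):
    """Return the pairs of requested features that appear in INCOMPATIBLE."""
    conflicts = []
    # Sort so that the output order is deterministic.
    for a, b in combinations(sorted(set(features)), 2):
        if frozenset({a, b}) in INCOMPATIBLE:
            conflicts.append((a, b))
    return conflicts


if __name__ == "__main__":
    print(find_conflicts(["CP", "LoRA", "SD"]))   # [('LoRA', 'SD')]
    print(find_conflicts(["CP", "APC", "LoRA"]))  # []
```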

(feature-x-hardware)=

## Feature x Hardware

:::{list-table}
:header-rows: 1
:stub-columns: 1
:widths: auto

- * Feature
  * Volta
  * Turing
  * Ampere
  * Ada
  * Hopper
  * CPU
  * AMD
- * [CP](#chunked-prefill)
  * [❌](gh-issue:2729)
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
- * [APC](#automatic-prefix-caching)
  * [❌](gh-issue:3687)
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
- * [LoRA](#lora-adapter)
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
- * <abbr title="Prompt Adapter">prmpt adptr</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * [❌](gh-issue:8475)
  * ✅
- * [SD](#spec_decode)
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
- * CUDA graph
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ❌
  * ✅
- * <abbr title="Pooling Models">pooling</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ❔
- * <abbr title="Encoder-Decoder Models">enc-dec</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ❌
- * <abbr title="Multimodal Inputs">mm</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
- * <abbr title="Logprobs">logP</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
- * <abbr title="Prompt Logprobs">prmpt logP</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
- * <abbr title="Async Output Processing">async output</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ❌
  * ❌
- * multi-step
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * [❌](gh-issue:8477)
  * ✅
- * best-of
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
- * beam-search
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
- * <abbr title="Guided Decoding">guided dec</abbr>
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
  * ✅
:::
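The hardware table above can be read the same way. The following sketch, again only an illustration, maps a few features to the hardware columns where the table marks them ❌ and filters a list of requested features for a given column. `UNSUPPORTED_ON` and `unsupported_features` are hypothetical names made up for this example, and the mapping is a hand-copied subset of the table, so it may lag behind the table if the table changes.

```python
# Hand-copied subset of the ❌ cells in the table above (NOT exhaustive).
# Maps a feature to the hardware columns on which it is marked unsupported.
UNSUPPORTED_ON = {
    "CP": {"Volta"},
    "APC": {"Volta"},
    "CUDA graph": {"CPU"},
    "enc-dec": {"AMD"},
    "async output": {"CPU", "AMD"},
}


def unsupported_features(hardware, features):
    """Return the requested features marked ❌ for the given hardware column."""
    return [f for f in features if hardware in UNSUPPORTED_ON.get(f, set())]


if __name__ == "__main__":
    print(unsupported_features("CPU", ["CP", "CUDA graph", "async output"]))
    # ['CUDA graph', 'async output']
    print(unsupported_features("Hopper", ["CP", "APC", "LoRA"]))  # []
```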