(compatibility-matrix)=

# Compatibility Matrix

The tables below show which features are mutually exclusive and which features are supported on selected hardware.

```{note}
Follow the link on a '✗' to see the tracking issue for that unsupported feature/hardware combination.
```

## Feature x Feature

```{raw} html
<style>
  /* Make smaller to try to improve readability */
  td {
    font-size: 0.8rem;
    text-align: center;
  }

  th {
    text-align: center;
    font-size: 0.8rem;
  }
</style>
```

```{list-table}
:header-rows: 1
:stub-columns: 1
:widths: auto

* - Feature
  - [CP](#chunked-prefill)
  - [APC](#apc)
  - [LoRA](#lora-adapter)
  - <abbr title="Prompt Adapter">prmpt adptr</abbr>
  - [SD](#spec_decode)
  - CUDA graph
  - <abbr title="Pooling Models">pooling</abbr>
  - <abbr title="Encoder-Decoder Models">enc-dec</abbr>
  - <abbr title="Logprobs">logP</abbr>
  - <abbr title="Prompt Logprobs">prmpt logP</abbr>
  - <abbr title="Async Output Processing">async output</abbr>
  - multi-step
  - <abbr title="Multimodal Inputs">mm</abbr>
  - best-of
  - beam-search
  - <abbr title="Guided Decoding">guided dec</abbr>
* - [CP](#chunked-prefill)
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
* - [APC](#apc)
  - ✅
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
* - [LoRA](#lora-adapter)
  - [✗](gh-pr:9057)
  - ✅
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
* - <abbr title="Prompt Adapter">prmpt adptr</abbr>
  - ✅
  - ✅
  - ✅
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
* - [SD](#spec_decode)
  - ✅
  - ✅
  - ✗
  - ✅
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
* - CUDA graph
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
* - <abbr title="Pooling Models">pooling</abbr>
  - ✗
  - ✗
  - ✗
  - ✗
  - ✗
  - ✗
  -
  -
  -
  -
  -
  -
  -
  -
  -
  -
* - <abbr title="Encoder-Decoder Models">enc-dec</abbr>
  - ✗
  - [✗](gh-issue:7366)
  - ✗
  - ✗
  - [✗](gh-issue:7366)
  - ✅
  - ✅
  -
  -
  -
  -
  -
  -
  -
  -
  -
* - <abbr title="Logprobs">logP</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✗
  - ✅
  -
  -
  -
  -
  -
  -
  -
  -
* - <abbr title="Prompt Logprobs">prmpt logP</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - [✗](gh-pr:8199)
  - ✅
  - ✗
  - ✅
  - ✅
  -
  -
  -
  -
  -
  -
  -
* - <abbr title="Async Output Processing">async output</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✗
  - ✅
  - ✗
  - ✗
  - ✅
  - ✅
  -
  -
  -
  -
  -
  -
* - multi-step
  - ✗
  - ✅
  - ✗
  - ✅
  - ✗
  - ✅
  - ✗
  - ✗
  - ✅
  - [✗](gh-issue:8198)
  - ✅
  -
  -
  -
  -
  -
* - <abbr title="Multimodal Inputs">mm</abbr>
  - ✅
  - [✗](gh-pr:8348)
  - [✗](gh-pr:7199)
  - ?
  - ?
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ?
  -
  -
  -
  -
* - best-of
  - ✅
  - ✅
  - ✅
  - ✅
  - [✗](gh-issue:6137)
  - ✅
  - ✗
  - ✅
  - ✅
  - ✅
  - ?
  - [✗](gh-issue:7968)
  - ✅
  -
  -
  -
* - beam-search
  - ✅
  - ✅
  - ✅
  - ✅
  - [✗](gh-issue:6137)
  - ✅
  - ✗
  - ✅
  - ✅
  - ✅
  - ?
  - [✗](gh-issue:7968)
  - ?
  - ✅
  -
  -
* - <abbr title="Guided Decoding">guided dec</abbr>
  - ✅
  - ✅
  - ?
  - ?
  - ✅
  - ✅
  - ✗
  - ?
  - ✅
  - ✅
  - ✅
  - [✗](gh-issue:9893)
  - ?
  - ✅
  - ✅
  -
```
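
Because the Feature x Feature table fills only its lower triangle, a pair is found in the row of whichever feature is listed later and the column of the one listed earlier. The lookup can be sketched in Python as below. This is only an illustration: `COMPAT` and `compatible` are hypothetical names that are not part of vLLM, and only a handful of cells are copied from the table.

```python
from typing import Optional

# Hypothetical helper, not part of vLLM.
# Keys are (row feature, column feature) as they appear in the lower triangle
# of the Feature x Feature table; only a few cells are copied here.
COMPAT = {
    ("APC", "CP"): True,
    ("LoRA", "CP"): False,        # linked to gh-pr:9057 in the table
    ("LoRA", "APC"): True,
    ("SD", "LoRA"): False,
    ("multi-step", "CP"): False,
}


def compatible(a: str, b: str) -> Optional[bool]:
    """Look up a feature pair in either order.

    Returns True or False for a pair present in COMPAT, and None when the
    pair has not been copied into this sketch.
    """
    if (a, b) in COMPAT:
        return COMPAT[(a, b)]
    return COMPAT.get((b, a))


if __name__ == "__main__":
    print(compatible("CP", "LoRA"))
    print(compatible("APC", "CP"))
    print(compatible("CP", "mm"))
```

Trying both key orders keeps the data in the same shape as the table, so a cell only has to be recorded once.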

## Feature x Hardware

```{list-table}
:header-rows: 1
:stub-columns: 1
:widths: auto

* - Feature
  - Volta
  - Turing
  - Ampere
  - Ada
  - Hopper
  - CPU
  - AMD
* - [CP](#chunked-prefill)
  - [✗](gh-issue:2729)
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
* - [APC](#apc)
  - [✗](gh-issue:3687)
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
* - [LoRA](#lora-adapter)
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - [✗](gh-pr:4830)
  - ✅
* - <abbr title="Prompt Adapter">prmpt adptr</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - [✗](gh-issue:8475)
  - ✅
* - [SD](#spec_decode)
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
* - CUDA graph
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✗
  - ✅
* - <abbr title="Pooling Models">pooling</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ?
* - <abbr title="Encoder-Decoder Models">enc-dec</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✗
* - <abbr title="Multimodal Inputs">mm</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
* - <abbr title="Logprobs">logP</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
* - <abbr title="Prompt Logprobs">prmpt logP</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
* - <abbr title="Async Output Processing">async output</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✗
  - ✗
* - multi-step
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - [✗](gh-issue:8477)
  - ✅
* - best-of
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
* - beam-search
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
* - <abbr title="Guided Decoding">guided dec</abbr>
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
  - ✅
```