(engine-args)=
# Engine Arguments
Engine arguments control the behavior of the vLLM engine.

- For [offline inference](#offline-inference), they are part of the arguments to the `LLM` class.
- For [online serving](#openai-compatible-server), they are part of the arguments to `vllm serve`.

Below, you can find an explanation of every engine argument:

<!--- pyml disable-num-lines 7 no-space-in-emphasis -->
```{eval-rst}
.. argparse::
    :module: vllm.engine.arg_utils
    :func: _engine_args_parser
    :prog: vllm serve
    :nodefaultconst:
```
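Because the same engine arguments appear both as keyword arguments to `LLM` and as flags to `vllm serve`, it can help to see how one form might map to the other. The following is a minimal sketch that assumes the common convention that a keyword such as `max_model_len` corresponds to the flag `--max-model-len`. The helper `engine_kwargs_to_cli_flags` is hypothetical and not part of vLLM, and the generated listing above remains the reference for the exact flag names and accepted values.

```python
from typing import Any


def engine_kwargs_to_cli_flags(kwargs: dict[str, Any]) -> list[str]:
    """Convert keyword-style engine arguments into CLI-style flags.

    Hypothetical helper for illustration only; it is not part of vLLM and
    assumes that underscores in keyword names become hyphens in flag names.
    """
    flags: list[str] = []
    for name, value in kwargs.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            # Boolean switches are passed as a bare flag.
            flags.append(flag)
        elif value is False or value is None:
            # Disabled or unset options are omitted entirely.
            continue
        else:
            flags.extend([flag, str(value)])
    return flags


engine_kwargs = {
    "max_model_len": 4096,
    "gpu_memory_utilization": 0.9,
    "tensor_parallel_size": 2,
    "enforce_eager": True,
}

# Offline inference would pass these as keyword arguments, e.g.
#   LLM(model="<model name>", **engine_kwargs)
# Online serving would pass the equivalent flags, e.g.
#   vllm serve <model name> <flags printed below>
cli_flags = engine_kwargs_to_cli_flags(engine_kwargs)
print(" ".join(cli_flags))
```

Keeping the settings in one dictionary like this is one way to keep an offline script and a serving command in sync, though arguments that take structured values may need handling beyond this sketch.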
## Async Engine Arguments

Additional arguments are available to the asynchronous engine, which is used for online serving:

<!--- pyml disable-num-lines 7 no-space-in-emphasis -->
```{eval-rst}
.. argparse::
    :module: vllm.engine.arg_utils
    :func: _async_engine_args_parser
    :prog: vllm serve
    :nodefaultconst:
```