[Docs] Mention model_impl arg when explaining Transformers fallback (#14552)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Harry Mellor 2025-03-10 13:13:10 +01:00 committed by GitHub
parent 460f553a6d
commit 60a98b2de5


@@ -59,6 +59,10 @@ llm.apply_model(lambda model: print(type(model)))
If it is `TransformersModel`, then the model is based on Transformers!
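
For context, a minimal sketch of the check referenced in the hunk above; the model name is illustrative and any Transformers-compatible model should work:

```python
from vllm import LLM

# Illustrative model name; substitute the model you are checking.
llm = LLM(model="facebook/opt-125m")

# Print the class of the loaded model. If it prints `TransformersModel`,
# the Transformers fallback is in use.
llm.apply_model(lambda model: print(type(model)))
```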
:::{tip}
You can force the use of `TransformersModel` by setting `model_impl="transformers"` for <project:#offline-inference> or `--model-impl transformers` for the <project:#openai-compatible-server>.
:::
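
As a sketch of the tip above, forcing the fallback for offline inference might look like this (the model name is illustrative; `model_impl` is the argument this commit documents):

```python
from vllm import LLM

# Force the Transformers fallback instead of a native vLLM implementation.
# The equivalent for the OpenAI-compatible server is:
#   vllm serve facebook/opt-125m --model-impl transformers
llm = LLM(model="facebook/opt-125m", model_impl="transformers")
```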
:::{note}
vLLM may not fully optimise the Transformers implementation, so you may see degraded performance when comparing a native model to a Transformers model in vLLM.
:::