[Doc] Update description of vLLM support for CPUs (#6003)

Jie Fu (傅杰) 2024-07-11 12:15:29 +08:00 committed by GitHub
parent 99ded1e1c4
commit 439c84581a
2 changed files with 2 additions and 2 deletions

@@ -59,7 +59,7 @@ vLLM is flexible and easy to use with:
- Tensor parallelism support for distributed inference
- Streaming outputs
- OpenAI-compatible API server
-- Support NVIDIA GPUs, AMD GPUs, Intel CPUs and GPUs
+- Support NVIDIA GPUs, AMD CPUs and GPUs, Intel CPUs and GPUs, PowerPC CPUs
- (Experimental) Prefix caching support
- (Experimental) Multi-lora support

@@ -20,7 +20,7 @@ Requirements
* OS: Linux
* Compiler: gcc/g++>=12.3.0 (optional, recommended)
-* Instruction set architecture (ISA) requirement: AVX512 is required.
+* Instruction set architecture (ISA) requirement: AVX512 (optional, recommended)
.. _cpu_backend_quick_start_dockerfile: