[doc] update pipeline parallel in readme (#6347)
parent 1df43de9bb
commit 2d23b42d92
@@ -56,7 +56,7 @@ vLLM is flexible and easy to use with:
 - Seamless integration with popular Hugging Face models
 - High-throughput serving with various decoding algorithms, including *parallel sampling*, *beam search*, and more
-- Tensor parallelism support for distributed inference
+- Tensor parallelism and pipeline parallelism support for distributed inference
 - Streaming outputs
 - OpenAI-compatible API server
 - Support NVIDIA GPUs, AMD CPUs and GPUs, Intel CPUs and GPUs, PowerPC CPUs
@@ -38,7 +38,7 @@ vLLM is flexible and easy to use with:
 * Seamless integration with popular HuggingFace models
 * High-throughput serving with various decoding algorithms, including *parallel sampling*, *beam search*, and more
-* Tensor parallelism support for distributed inference
+* Tensor parallelism and pipeline parallelism support for distributed inference
 * Streaming outputs
 * OpenAI-compatible API server
 * Support NVIDIA GPUs and AMD GPUs
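The feature this commit documents — combining tensor parallelism with pipeline parallelism for distributed inference — is typically exercised through vLLM's serving CLI. A minimal sketch (the model name and parallel sizes below are illustrative assumptions, not part of this commit; actual hardware requirements depend on the model):

```shell
# Hedged example: shard each layer across 4 GPUs (tensor parallelism)
# and split the layer stack into 2 pipeline stages (pipeline parallelism),
# for a total of 8 GPUs. Flags are vLLM's documented CLI options.
vllm serve meta-llama/Meta-Llama-3-8B-Instruct \
    --tensor-parallel-size 4 \
    --pipeline-parallel-size 2
```

Tensor parallelism is generally used within a node (where interconnect bandwidth is high), while pipeline parallelism lets the model span multiple nodes; this invocation cannot run without the corresponding GPUs, so it is shown as a configuration sketch only.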