Welcome to vLLM!
================

**vLLM** is a fast and easy-to-use library for LLM inference and serving.

Its core features include:

- State-of-the-art performance in serving throughput
- Efficient management of attention key and value memory with **PagedAttention**
- Seamless integration with popular HuggingFace models
- Dynamic batching of incoming requests
- Optimized CUDA kernels
- High-throughput serving with various decoding algorithms, including *parallel sampling* and *beam search*
- Tensor parallelism support for distributed inference
- Streaming outputs
- OpenAI-compatible API server

For more information, please refer to our `blog post <>`_.
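
As a quick illustration of the Python API behind the feature list above, the sketch below runs offline batched inference with parallel sampling; the model name, prompts, and sampling values are placeholders rather than recommendations (the Quickstart gives the full walkthrough).

.. code-block:: python

   from vllm import LLM, SamplingParams

   prompts = [
       "Hello, my name is",
       "The capital of France is",
   ]

   # n=2 requests two parallel samples per prompt; temperature/top_p are illustrative values.
   sampling_params = SamplingParams(n=2, temperature=0.8, top_p=0.95)

   # Any supported HuggingFace model name can be substituted here.
   llm = LLM(model="facebook/opt-125m")

   # vLLM batches and schedules the prompts internally; PagedAttention manages the KV cache.
   outputs = llm.generate(prompts, sampling_params)

   for output in outputs:
       print(output.prompt)
       for sample in output.outputs:
           print("  " + sample.text)

Beam search and the other decoding algorithms listed above are selected through the same ``SamplingParams`` object. The OpenAI-compatible server wraps the same engine behind an HTTP API; assuming a server has already been launched locally (the Quickstart covers the exact command and options), a completion request could look like the following sketch, where the port and model name are assumptions:

.. code-block:: python

   import requests

   # Assumes a local vLLM OpenAI-compatible server listening on the default port 8000.
   response = requests.post(
       "http://localhost:8000/v1/completions",
       json={
           "model": "facebook/opt-125m",
           "prompt": "San Francisco is a",
           "max_tokens": 32,
       },
   )
   print(response.json())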

Documentation
-------------

.. toctree::
   :maxdepth: 1
   :caption: Getting Started

   getting_started/installation
   getting_started/quickstart

.. toctree::
   :maxdepth: 1
   :caption: Models

   models/supported_models
   models/adding_model