20231088/vllm

README.md

vLLM documents

Build the docs

# Install dependencies.
pip install -r requirements-docs.txt

# Build the docs.
make clean
make html

Open the docs in your browser

python -m http.server -d build/html/

Then open http://localhost:8000 in your browser; 8000 is the default port for http.server.
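
If the build failed or was never run, the server will start anyway and show an empty directory listing. As a minimal sketch, the check below confirms that the HTML output exists before you serve it. The function name docs_built is hypothetical, and the sketch assumes Sphinx wrote its landing page to build/html/index.html, which depends on the project's Makefile and Sphinx configuration.

```shell
# Hypothetical helper, not part of the vLLM repository.
# Succeeds if the given directory (default: build/html) contains index.html.
# Assumption: the Sphinx root document is named "index", so the HTML
# builder produces index.html in the output directory.
docs_built() {
    [ -f "${1:-build/html}/index.html" ]
}

# Example use (left commented out so that sourcing this file has no side effects):
# docs_built && python -m http.server 8080 -d build/html/
```

The commented command serves the docs on port 8080 instead of the default 8000, which helps when 8000 is already in use; http.server takes the port as an optional positional argument.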