vllm/quantization at e42389f9d7a5e04aee3463b3e08bafdc86a9457b - vllm - Luminance Code Repo

20231088/vllm

Jee Jee Li 3892e58ad7

[Misc] Upgrade BNB version (#15183)

2025-03-24 05:51:42 +00:00

auto_awq.md

[Docs] Add GPTQModel (#14056)

2025-03-03 21:59:09 +00:00

bnb.md

[Misc] Upgrade BNB version (#15183)

2025-03-24 05:51:42 +00:00

fp8.md

[Doc] Convert docs to use colon fences (#12471)

2025-01-29 11:38:29 +08:00

gguf.md

[Model] Deepseek GGUF support (#13167)

2025-02-27 02:08:35 -08:00

gptqmodel.md

[Docs] Add GPTQModel (#14056)

2025-03-03 21:59:09 +00:00

index.md

[Docs] Add GPTQModel (#14056)

2025-03-03 21:59:09 +00:00

int4.md

[Doc] int4 w4a16 example (#12585)

2025-01-31 15:38:48 -08:00

int8.md

[Doc] int4 w4a16 example (#12585)

2025-01-31 15:38:48 -08:00

quantized_kvcache.md

[FP8][Kernel] Dynamic kv cache scaling factors computation (#11906)

2025-01-23 18:04:03 +00:00

supported_hardware.md

[Doc]: Improve feature tables (#13224)

2025-02-18 18:52:39 +08:00