| Author | Commit | Message | Date |
|---|---|---|---|
| chenqianfzh | 9855b99502 | [Feature][kernel] tensor parallelism with bitsandbytes quantization (#8434) | 2024-09-17 08:09:12 -07:00 |
| youkaichao | a2469127db | [misc][ci] fix quant test (#8449) | 2024-09-13 17:20:14 +08:00 |
| chenqianfzh | 4664ceaad6 | support bitsandbytes 8-bit and FP4 quantized models (#7445) | 2024-08-29 19:09:08 -04:00 |
| dongmao zhang | 87525fab92 | [bitsandbytes]: support read bnb pre-quantized model (#5753) (Co-authored-by: Michael Goin <michael@neuralmagic.com>) | 2024-07-23 23:45:09 +00:00 |
| Michael Goin | 23ec72fa03 | [CI/Build][REDO] Add is_quant_method_supported to control quantization test configurations (#5466) | 2024-06-13 15:18:08 +00:00 |
| Simon Mo | e3c12bf6d2 | Revert "[CI/Build] Add is_quant_method_supported to control quantization test configurations" (#5463) | 2024-06-12 10:03:24 -07:00 |
| Michael Goin | 3dd6853bc8 | [CI/Build] Add is_quant_method_supported to control quantization test configurations (#5253) | 2024-06-12 09:58:02 -07:00 |
| youkaichao | 8ea5e44a43 | [CI/Test] improve robustness of test by replacing del with context manager (vllm_runner) (#5357) | 2024-06-08 08:59:20 +00:00 |
| chenqianfzh | b9c0605a8e | [Feature][Kernel] Support bitsandbytes quantization and QLoRA (#4776) | 2024-06-01 14:51:10 -06:00 |