vllm/entrypoints at 051eaf6db3d8feeb0779a4e942aadc85eda2f8b2 - vllm - Luminance Code Repo

20231088/vllm

Cyrus Leung 051eaf6db3

[Model] Add user-configurable task for models that support both generation and embedding (#9424)

2024-10-18 11:31:58 -07:00

[Model] Add user-configurable task for models that support both generation and embedding (#9424)

2024-10-18 11:31:58 -07:00

[Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352)

2024-10-17 22:47:27 -04:00

[Model] Add user-configurable task for models that support both generation and embedding (#9424)

2024-10-18 11:31:58 -07:00

__init__.py

[CI/Build] Move test_utils.py to tests/utils.py (#4425)

2024-05-13 23:50:09 +09:00

conftest.py

Support for guided decoding for offline LLM (#6878)

2024-08-04 03:12:09 +00:00

test_chat_utils.py

[Model] Add user-configurable task for models that support both generation and embedding (#9424)

2024-10-18 11:31:58 -07:00