Cyrus Leung | 5bf35a91e4 | [Doc][CI/Build] Update docs and tests to use vllm serve (#6431) | 2024-07-17 07:43:21 +00:00
Harry Mellor | fe7d648fe5 | Don't show default value for flags in EngineArgs (#4223); Co-authored-by: Harry Mellor <hmellor@oxts.com> | 2024-04-21 09:15:28 -07:00
Harry Mellor | 682789d402 | Fix missing docs and out of sync EngineArgs (#4219); Co-authored-by: Harry Mellor <hmellor@oxts.com> | 2024-04-19 20:51:33 -07:00
Sanger Steel | d619ae2d19 | [Doc] Add better clarity for tensorizer usage (#4090); Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> | 2024-04-15 13:28:25 -07:00
Sanger Steel | 711a000255 | [Frontend] [Core] feat: Add model loading using tensorizer (#3476) | 2024-04-13 17:13:01 -07:00
Sean Gallen | 78107fa091 | [Doc]Add asynchronous engine arguments to documentation. (#3810); Co-authored-by: Simon Mo <simon.mo@hey.com>; Co-authored-by: Roger Wang <136131678+ywang96@users.noreply.github.com> | 2024-04-04 21:52:01 -07:00
Sage Moore | ce4f5a29fb | Add Automatic Prefix Caching (#2762); Co-authored-by: ElizaWszola <eliza@neuralmagic.com>; Co-authored-by: Michael Goin <michael@neuralmagic.com> | 2024-03-02 00:50:01 -08:00
Suhong Moon | 3ec8c25cd0 | [Docs] Update documentation for gpu-memory-utilization option (#2162) | 2023-12-17 10:51:57 -08:00
Casper | a921d8be9d | [DOCS] Add engine args documentation (#1741) | 2023-11-22 12:31:27 -08:00