youkaichao | f842a7aff1 | [misc] remove engine_use_ray (#8126) | 2024-09-11 18:23:36 -07:00
Nick Hill | c75363fbc0 | [BugFix] Avoid premature async generator exit and raise all exception variations (#7698) | 2024-08-21 11:45:55 -04:00
Wallas Henrique | 70b746efcf | [Misc] Deprecation Warning when setting --engine-use-ray (#7424); Signed-off-by: Wallas Santos <wallashss@ibm.com>; Co-authored-by: youkaichao <youkaichao@gmail.com>; Co-authored-by: Nick Hill <nickhill@us.ibm.com>; Co-authored-by: youkaichao <youkaichao@126.com> | 2024-08-14 09:44:27 -07:00
Murali Andoorveedu | c5832d2ae9 | [Core] Pipeline Parallel Support (#4412); Signed-off-by: Muralidhar Andoorveedu <muralidhar.andoorveedu@centml.ai> | 2024-07-02 10:58:08 -07:00
zifeitong | 78687504f7 | [Bugfix] AsyncLLMEngine hangs with asyncio.run (#5654) | 2024-06-19 13:57:12 -07:00
Cyrus Leung | 5ae5ed1e60 | [Core] Consolidate prompt arguments to LLM engines (#4328); Co-authored-by: Roger Wang <ywang@roblox.com> | 2024-05-28 13:29:31 -07:00
Roy | 7134303cbb | [Bugfix][Core] Fix get decoding config from ray (#4335) | 2024-04-27 11:30:08 +00:00
Roy | 9e8744a545 | [BugFix] Fix get tokenizer when using ray (#3301) | 2024-03-10 19:17:16 -07:00
Antoni Baum | ff578cae54 | Add health check, make async Engine more robust (#3015); Co-authored-by: Zhuohan Li <zhuohan123@gmail.com> | 2024-03-04 22:01:40 +00:00
Antoni Baum | 9b945daaf1 | [Experimental] Add multi-LoRA support (#1804); Co-authored-by: Chen Shen <scv119@gmail.com>; Co-authored-by: Shreyas Krishnaswamy <shrekris@anyscale.com>; Co-authored-by: Avnish Narayan <avnish@anyscale.com> | 2024-01-23 15:26:37 -08:00
Zhuohan Li | ba0bfd40e2 | TP/quantization/weight loading refactor part 1 - Simplify parallel linear logic (#1181) | 2023-10-02 15:36:09 -07:00
Antoni Baum | ff36139ffc | Remove AsyncLLMEngine busy loop, shield background task (#1059) | 2023-09-17 00:29:08 -07:00