17 Commits

Author SHA1 Message Date
youkaichao
ad34c0df0f
[core] platform agnostic executor via collective_rpc (#11256)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2025-01-15 13:45:21 +08:00
Chen Zhang
cf5f000d21
[torch.compile] Hide KV cache behind torch.compile boundary (#11677)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
2025-01-10 13:14:42 +08:00
Cyrus Leung
0bd1ff4346
[Bugfix] Override dunder methods of placeholder modules (#11882)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-09 09:02:53 +00:00
youkaichao
889e662eae
[misc] improve memory profiling (#11809)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-01-08 06:36:03 +00:00
Joe Runde
2d1b9baa8f
[Bugfix] Fix request cancellation without polling (#11190)
2024-12-17 12:26:32 -08:00
youkaichao
551603feff
[core] overhaul memory profiling and fix backward compatibility (#10511)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-12-16 13:32:25 -08:00
madt2709
34a9941620
[Bugfix] Fix load config when using bools (#9533)
2024-10-27 13:46:41 -04:00
Cyrus Leung
051eaf6db3
[Model] Add user-configurable task for models that support both generation and embedding (#9424)
2024-10-18 11:31:58 -07:00
Alex Brooks
a3691b6b5e
[Core][Frontend] Add Support for Inference Time mm_processor_kwargs (#9131)
Signed-off-by: Alex-Brooks <Alex.Brooks@ibm.com>
2024-10-08 14:12:56 +00:00
Andy Dai
5df1834895
[Bugfix] Fix order of arguments matters in config.yaml (#8960)
2024-10-05 17:35:11 +00:00
Kaunil Dhruv
058344f89a
[Frontend]-config-cli-args (#7737)
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Kaunil Dhruv <kaunil_dhruv@intuit.com>
2024-08-30 08:21:02 -07:00
Cyrus Leung
9ba85bc152
[mypy] Misc. typing improvements (#7417)
2024-08-13 09:20:20 +08:00
Nick Hill
9a3f49ae07
[BugFix] Overhaul async request cancellation (#7111)
2024-08-07 13:21:41 +08:00
Michael Goin
d9b34baedd
[CI/Build] Add unit testing for FlexibleArgumentParser (#5798)
2024-06-25 12:18:03 -07:00
youkaichao
388596c914
[Misc][Utils] allow get_open_port to be called for multiple times (#5333)
2024-06-06 22:15:11 -07:00
Cyrus Leung
eecd864388
[Bugfix][CI/Build] Fix test and improve code for merge_async_iterators (#5096)
2024-05-29 16:02:25 -07:00
Cyrus Leung
5ae5ed1e60
[Core] Consolidate prompt arguments to LLM engines (#4328)
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-05-28 13:29:31 -07:00