youkaichao
|
241ad7b301
|
[ci] Fix sampler tests (#11922)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-10 20:45:33 +08:00 |
|
Harry Mellor
|
d85c47d6ad
|
Replace "online inference" with "online serving" (#11923)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-10 12:05:56 +00:00 |
|
wangxiyuan
|
ef725feafc
|
[platform] support pytorch custom op pluggable (#11328)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-01-10 10:02:38 +00:00 |
|
cennn
|
d907be7dc7
|
[misc] remove python function call for custom activation op (#11885)
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-01-10 17:18:25 +08:00 |
|
youkaichao
|
d53575a5f0
|
[ci] fix gh200 tests (#11919)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-10 16:25:17 +08:00 |
|
Kunshang Ji
|
61af633256
|
[BUGFIX] Fix UnspecifiedPlatform package name (#11916)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
|
2025-01-10 16:20:46 +08:00 |
|
Joe Runde
|
ac2f3f7fee
|
[Bugfix] Validate lora adapters to avoid crashing server (#11727)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-01-10 15:56:36 +08:00 |
|
Chen Zhang
|
cf5f000d21
|
[torch.compile] Hide KV cache behind torch.compile boundary (#11677)
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
|
2025-01-10 13:14:42 +08:00 |
|
Cyrus Leung
|
3de2b1eafb
|
[Doc] Show default pooling method in a table (#11904)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-10 11:25:20 +08:00 |
|
Cyrus Leung
|
b844b99ad3
|
[VLM] Enable tokenized inputs for merged multi-modal processor (#11900)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-10 03:24:00 +00:00 |
|
Cyrus Leung
|
c3cf54dda4
|
[Doc][5/N] Move Community and API Reference to the bottom (#11896)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Simon Mo <simon.mo@hey.com>
|
2025-01-10 03:10:12 +00:00 |
|
Charles Frye
|
36f5303578
|
[Docs] Add Modal to deployment frameworks (#11907)
|
2025-01-09 23:26:37 +00:00 |
|
Cyrus Leung
|
9a228348d2
|
[Misc] Provide correct Pixtral-HF chat template (#11891)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-09 10:19:37 -07:00 |
|
youkaichao
|
bd82872211
|
[ci]try to fix flaky multi-step tests (#11894)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-09 14:47:29 +00:00 |
|
wangxiyuan
|
405eb8e396
|
[platform] Allow platform specify attention backend (#11609)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>
|
2025-01-09 21:46:50 +08:00 |
|
Cyrus Leung
|
65097ca0af
|
[Doc] Add model development API Reference (#11884)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-09 09:43:40 +00:00 |
|
Ye (Charlotte) Qi
|
1d967acb45
|
[Bugfix] fix beam search input errors and latency benchmark script (#11875)
Signed-off-by: Ye Qi <yeq@meta.com>
Co-authored-by: yeq <yeq@devgpu004.lla3.facebook.com>
|
2025-01-09 17:36:39 +08:00 |
|
Cyrus Leung
|
0bd1ff4346
|
[Bugfix] Override dunder methods of placeholder modules (#11882)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-09 09:02:53 +00:00 |
|
youkaichao
|
310aca88c9
|
[perf]fix current stream (#11870)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-09 07:18:21 +00:00 |
|
Guspan Tanadi
|
a732900efc
|
[Doc] Intended links Python multiprocessing library (#11878)
|
2025-01-09 05:39:39 +00:00 |
|
Cyrus Leung
|
d848800e88
|
[Misc] Move print_*_once from utils to logger (#11298)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
Co-authored-by: Maxime Fournioux <55544262+mfournioux@users.noreply.github.com>
|
2025-01-09 12:48:12 +08:00 |
|
Michael Goin
|
730e9592e9
|
[Doc] Recommend uv and python 3.12 for quickstart guide (#11849)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2025-01-09 11:37:48 +08:00 |
|
Maximilien de Bayser
|
1fe554bac3
|
treat do_lower_case in the same way as the sentence-transformers library (#11815)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-01-09 11:05:43 +08:00 |
|
Tyler Michael Smith
|
615e4a5401
|
[CI] Turn on basic correctness tests for V1 (#10864)
|
2025-01-08 21:20:44 -05:00 |
|
Simon Mo
|
3db0cafdf1
|
[Docs] Add Google Cloud Meetup (#11864)
|
2025-01-08 12:38:28 -08:00 |
|
rasmith
|
526de822d5
|
[Kernel][Triton][AMD] Use block size heuristic for avg 2.8x speedup for int8 models (#11698)
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
|
2025-01-08 20:23:15 +00:00 |
|
Robert Shaw
|
56fe4c297c
|
[TPU][Quantization] TPU W8A8 (#11785)
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-01-08 19:33:29 +00:00 |
|
WangErXiao
|
47de8821d3
|
[Misc]add some explanations for BlockHashType (#11847)
|
2025-01-08 18:21:30 +00:00 |
|
Cyrus Leung
|
5984499e47
|
[Doc] Expand Multimodal API Reference (#11852)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 17:14:14 +00:00 |
|
Cyrus Leung
|
ca47e176af
|
[Misc] Move some model utils into vision file (#11848)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 17:04:46 +00:00 |
|
Yan Ma
|
78f4590b60
|
[Bugfix][XPU] fix silu_and_mul (#11823)
Signed-off-by: yan ma <yan.ma@intel.com>
|
2025-01-09 00:11:50 +08:00 |
|
Li, Jiang
|
2f7024987e
|
[CI/Build][Bugfix] Fix CPU CI image clean up (#11836)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-01-08 15:18:28 +00:00 |
|
Cyrus Leung
|
6cd40a5bfe
|
[Doc][4/N] Reorganize API Reference (#11843)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 21:34:44 +08:00 |
|
Harry Mellor
|
aba8d6ee00
|
[Doc] Move examples into categories (#11840)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 13:09:53 +00:00 |
|
Cyrus Leung
|
2a0596bc48
|
[VLM] Reorganize profiling/processing-related code (#11812)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 18:59:58 +08:00 |
|
youkaichao
|
f12141170a
|
[torch.compile] consider relevant code in compilation cache (#11614)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-08 10:46:43 +00:00 |
|
Wallas Henrique
|
cfd3219f58
|
[Hardware][Apple] Native support for macOS Apple Silicon (#11696)
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2025-01-08 16:35:49 +08:00 |
|
Simon Mo
|
a1b2b8606e
|
[Docs] Update sponsor name: 'Novita' to 'Novita AI' (#11833)
|
2025-01-07 23:05:46 -08:00 |
|
youkaichao
|
ad9f1aa679
|
[doc] update wheels url (#11830)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-08 14:36:49 +08:00 |
|
youkaichao
|
889e662eae
|
[misc] improve memory profiling (#11809)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-01-08 06:36:03 +00:00 |
|
Cyrus Leung
|
ef68eb28d8
|
[Bug] Fix pickling of ModelConfig when RunAI Model Streamer is used (#11825)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 13:40:09 +08:00 |
|
Simon Mo
|
259abd8953
|
[Docs] reorganize sponsorship page (#11639)
Signed-off-by: simon-mo <simon.mo@hey.com>
|
2025-01-07 21:16:08 -08:00 |
|
Jee Jee Li
|
f645eb6954
|
[Bugfix] Add checks for LoRA and CPU offload (#11810)
Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
|
2025-01-08 13:08:48 +08:00 |
|
Ilya Lavrenov
|
f4923cb8bc
|
[OpenVINO] Fixed Docker.openvino build (#11732)
Signed-off-by: Ilya Lavrenov <ilya.lavrenov@intel.com>
|
2025-01-08 13:08:30 +08:00 |
|
Nishidha
|
b640b19cc0
|
Fixed docker build for ppc64le (#11518)
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
|
2025-01-08 13:05:37 +08:00 |
|
WangErXiao
|
dc71af0a71
|
Remove the duplicate imports of MultiModalKwargs and PlaceholderRange… (#11824)
|
2025-01-08 04:09:25 +00:00 |
|
Divakar Verma
|
4d29e91be8
|
[Misc] sort torch profiler table by kernel timing (#11813)
|
2025-01-08 10:57:04 +08:00 |
|
Cyrus Leung
|
91445c7bc8
|
[Bugfix] Fix image input for Pixtral-HF (#11741)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 10:17:16 +08:00 |
|
Harry Mellor
|
5950f555a1
|
[Doc] Group examples into categories (#11782)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 09:20:12 +08:00 |
|
Jie Fu (傅杰)
|
a4e2b26856
|
[Bugfix] Significant performance drop on CPUs with --num-scheduler-steps > 1 (#11794)
|
2025-01-07 16:15:50 -08:00 |
|