4 Commits

Author SHA1 Message Date
Luka Govedič
30870b4f66
[torch.compile] Dynamic fp8 + rms_norm fusion (#10906)
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
2024-12-13 03:19:23 +00:00
Luka Govedič
8b0fe06c89
[torch.compile] Inductor code caching fix (#10273)
Signed-off-by: luka <luka@neuralmagic.com>
Signed-off-by: Luka Govedic <luka.govedic@gmail.com>
2024-11-20 21:44:57 -08:00
youkaichao
4fd9375028
[2/N][torch.compile] make compilation cfg part of vllm cfg (#10383)
Signed-off-by: youkaichao <youkaichao@gmail.com>
2024-11-16 18:02:14 -08:00
Luka Govedič
4f93dfe952
[torch.compile] Fuse RMSNorm with quant (#9138)
Signed-off-by: luka <luka@neuralmagic.com>
Co-authored-by: youkaichao <youkaichao@126.com>
2024-11-08 21:20:08 +00:00