Mark McLoughlin
|
46fb056749
|
[V1][Metrics] Add TTFT and TPOT histograms (#12530)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-01-29 04:11:16 +00:00 |
|
Mark McLoughlin
|
c386c43ca3
|
[V1][Metrics] Add per-request prompt/generation_tokens histograms (#12516)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-01-28 22:07:22 +00:00 |
|
Mark McLoughlin
|
3fd1fb63ef
|
[V1][Metrics] Hook up IterationStats for Prometheus metrics (#12478)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-01-28 16:38:38 +00:00 |
|
Mark McLoughlin
|
01ba927040
|
[V1][Metrics] Add initial Prometheus logger (#12416)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-01-27 12:26:28 -05:00 |
|
tomeras91
|
ac04a97a9f
|
[Frontend] Add max_tokens prometheus metric (#9881)
Signed-off-by: Tomer Asida <tomera@ai21.com>
|
2024-11-04 22:53:24 +00:00 |
|
Cyrus Leung
|
06386a64dd
|
[Frontend] Chat-based Embeddings API (#9759)
|
2024-11-01 08:13:35 +00:00 |
|
youkaichao
|
cbc2ef5529
|
[misc] hide best_of from engine (#9261)
Co-authored-by: Brendan Wong <bjwpokemon@gmail.com>
|
2024-10-10 21:30:44 -07:00 |
|
Nick Hill
|
39178c7fbc
|
[Tests] Disable retries and use context manager for openai client (#7565)
|
2024-08-26 21:33:17 -07:00 |
|
Pooya Davoodi
|
8da48e4d95
|
[Frontend] Publish Prometheus metrics in run_batch API (#7641)
|
2024-08-23 23:04:22 -07:00 |
|
Robert Shaw
|
e3b318216d
|
[ Bugfix ] Fix Prometheus Metrics With zeromq Frontend (#7279)
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
|
2024-08-18 20:19:48 +00:00 |
|