| Author | Commit | Message | Date |
|---|---|---|---|
| Kyle Mistele | e02ce498be | [Feature] OpenAI-Compatible Tools API + Streaming for Hermes & Mistral models (#5649); Co-authored-by: constellate <constellate@1-ai-appserver-staging.codereach.com>; Co-authored-by: Kyle Mistele <kyle@constellate.ai> | 2024-09-04 13:18:13 -07:00 |
| Roger Wang | 5b86b19954 | [Misc] Optional installation of audio related packages (#8063) | 2024-09-01 14:46:57 -07:00 |
| Kaunil Dhruv | 058344f89a | [Frontend]-config-cli-args (#7737); Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>; Co-authored-by: Kaunil Dhruv <kaunil_dhruv@intuit.com> | 2024-08-30 08:21:02 -07:00 |
| Patrick von Platen | 6fc4e6e07a | [Model] Add Mistral Tokenization to improve robustness and chat encoding (#7739) | 2024-08-27 12:40:02 +00:00 |
| Cyrus Leung | baaedfdb2d | [mypy] Enable following imports for entrypoints (#7248); Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>; Co-authored-by: Fei <dfdfcai4@gmail.com> | 2024-08-20 23:28:21 -07:00 |
| SangBin Cho | ff7ec82c4d | [Core] Optimize SPMD architecture with delta + serialization optimization (#7109) | 2024-08-18 17:57:20 -07:00 |
| Xander Johnson | 7c0b7ea214 | [Bugfix] add >= 1.0 constraint for openai dependency (#7612) | 2024-08-16 20:56:01 -07:00 |
| PHILO-HE | f4da5f7b6d | [Misc] Update dockerfile for CPU to cover protobuf installation (#7182) | 2024-08-15 10:03:01 -07:00 |
| Kyle Sayers | f55a9aea45 | [Misc] Revert compressed-tensors code reuse (#7521) | 2024-08-14 15:07:37 -07:00 |
| youkaichao | 16422ea76f | [misc][plugin] add plugin system implementation (#7426) | 2024-08-13 16:24:17 -07:00 |
| Kyle Sayers | 373538f973 | [Misc] compressed-tensors code reuse (#7277) | 2024-08-13 19:05:15 -04:00 |
| Peter Salas | 00c3d68e45 | [Frontend][Core] Add plumbing to support audio language models (#7446) | 2024-08-13 17:39:33 +00:00 |
| Daniele | 774cd1d3bf | [CI/Build] bump minimum cmake version (#6999) | 2024-08-12 16:29:20 -07:00 |
| Noam Gat | 4fb7b52a2c | Updating LM Format Enforcer version to v0.10.6 (#7189) | 2024-08-11 08:11:50 -04:00 |
| Cyrus Leung | 7eb4a51c5f | [Core] Support serving encoder/decoder models (#7258) | 2024-08-09 10:39:41 +08:00 |
| Isotr0py | 360bd67cf0 | [Core] Support loading GGUF model (#5191); Co-authored-by: Michael Goin <michael@neuralmagic.com> | 2024-08-05 17:54:23 -06:00 |
| Michael Goin | 421e218b37 | [Bugfix] Bump transformers to 4.43.2 (#6752) | 2024-07-24 13:22:16 -07:00 |
| Roger Wang | 1bedf210e3 | Bump transformers version for Llama 3.1 hotfix and patch Chameleon (#6690) | 2024-07-23 13:47:48 -07:00 |
| Noam Gat | 9da4aad44b | Updating LM Format Enforcer version to v10.3 (#6411) | 2024-07-13 10:09:12 +00:00 |
| Roger Wang | f7160d946a | [Misc][Bugfix] Update transformers for tokenizer issue (#6364) | 2024-07-12 08:40:07 +00:00 |
| Lim Xiang Yang | fc17110bbe | [BugFix]: set outlines pkg version (#6262) | 2024-07-11 04:37:11 +00:00 |
| youkaichao | da78caecfa | [core][distributed] zmq fallback for broadcasting large objects (#6183); [core][distributed] add zmq fallback for broadcasting large objects (#6183) | 2024-07-09 18:49:11 -07:00 |
| danieljannai21 | 2c37540aa6 | [Frontend] Add template related params to request (#5709) | 2024-07-01 23:01:57 -07:00 |
| Woosuk Kwon | 79c92c7c8a | [Model] Add Gemma 2 (#5908) | 2024-06-27 13:33:56 -07:00 |
| Cyrus Leung | e9c2732b97 | [CI/Build] Add tqdm to dependencies (#5680) | 2024-06-19 08:37:33 -06:00 |
| youkaichao | f07d513320 | [build][misc] limit numpy version (#5582) | 2024-06-16 16:07:01 -07:00 |
| Breno Faria | 7b0a0dfb22 | [Frontend][Core] Update Outlines Integration from FSM to Guide (#4109); Co-authored-by: Simon Mo <simon.mo@hey.com>; Co-authored-by: Breno Faria <breno.faria@intrafind.com> | 2024-06-05 16:49:12 -07:00 |
| Cyrus Leung | 7a64d24aad | [Core] Support image processor (#4197) | 2024-06-02 22:56:41 -07:00 |
| Michael Goin | 86b45ae065 | [Bugfix] Relax tiktoken to >= 0.6.0 (#4890) | 2024-05-17 12:58:52 -06:00 |
| Alex Wu | 52f8107cf2 | [Frontend] Support OpenAI batch file format (#4794); Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com> | 2024-05-15 19:13:36 -04:00 |
| Noam Gat | bd99d22629 | Update lm-format-enforcer to 0.10.1 (#4631) | 2024-05-06 23:51:59 +00:00 |
| Ronen Schaffer | bf480c5302 | Add more Prometheus metrics (#2764); Co-authored-by: Robert Shaw <114415538+robertgshaw2-neuralmagic@users.noreply.github.com>; Co-authored-by: Robert Shaw <rshaw@neuralmagic.com> | 2024-04-28 15:59:33 -07:00 |
| Cyrus Leung | 8947bc3c15 | [Frontend][Bugfix] Disallow extra fields in OpenAI API (#4355) | 2024-04-27 05:08:24 +00:00 |
| Noam Gat | cc74b2b232 | Updating lm-format-enforcer version and adding links to decoding libraries in docs (#4222) | 2024-04-20 08:33:16 +00:00 |
| Nick Hill | 87fa80c91f | [Misc] Bump transformers to latest version (#4176) | 2024-04-18 14:36:39 -07:00 |
| Noam Gat | 05434764cd | LM Format Enforcer Guided Decoding Support (#3868); Co-authored-by: Simon Mo <simon.mo@hey.com> | 2024-04-16 05:54:57 +00:00 |
| Zhuohan Li | e11e200736 | [Bugfix] Fix filelock version requirement (#4075) | 2024-04-14 21:50:08 -07:00 |
| SangBin Cho | 09473ee41c | [mypy] Add mypy type annotation part 1 (#4006) | 2024-04-12 14:35:50 -07:00 |
| SangBin Cho | 18de883489 | [Chunked Prefill][4/n] Chunked prefill scheduler. (#3853) | 2024-04-05 10:17:58 -07:00 |
| Woosuk Kwon | cfaf49a167 | [Misc] Define common requirements (#3841) | 2024-04-05 00:39:17 -07:00 |