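"""Offline inference example for the Phi-3-vision multimodal model.

Loads microsoft/Phi-3-vision-128k-instruct with vLLM, sends a single-image
prompt, and prints the generated answer.
"""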
from vllm import LLM, SamplingParams
from vllm.assets.image import ImageAsset


def run_phi3v():
    model_path = "microsoft/Phi-3-vision-128k-instruct"

    # Note: The default settings of max_num_seqs (256) and
    # max_model_len (128k) for this model may cause OOM.
    # You may lower either value to run this example on lower-end GPUs.

    # In this example, we override max_num_seqs to 5 while
    # keeping the original context length of 128k.
    llm = LLM(
        model=model_path,
        trust_remote_code=True,
        max_num_seqs=5,
    )
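
    # Get the "cherry_blossom" example image from vLLM's image assets
    # as a PIL image.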
    image = ImageAsset("cherry_blossom").pil_image

    # single-image prompt
    prompt = "<|user|>\n<|image_1|>\nWhat is the season?<|end|>\n<|assistant|>\n"  # noqa: E501
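    # Greedy decoding (temperature=0), generating at most 64 new tokens.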
    sampling_params = SamplingParams(temperature=0, max_tokens=64)
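
    # The text prompt and the PIL image are passed together in one input
    # dict; the image is attached through the "multi_modal_data" field.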
    outputs = llm.generate(
        {
            "prompt": prompt,
            "multi_modal_data": {
                "image": image
            },
        },
        sampling_params=sampling_params)
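
    # Each output corresponds to one request; print the text of its first
    # candidate completion.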
    for o in outputs:
        generated_text = o.outputs[0].text
        print(generated_text)


if __name__ == "__main__":
    run_phi3v()