How do I do few-shot prompting with images in the GPT-4 Vision API message structure? Can someone provide code for this?

_j · March 20, 2024, 6:05am

It would be essentially the same as sending other few-shot examples: you give an input, and you demonstrate the way the AI responds, so that it can begin following a pattern.

These latest models, such as the 1106 version of gpt-4-turbo that vision is based on, are heavily trained on chat responses, so previous input has far less impact on their behavior.

After the system message (which still benefits from further demonstration to the AI), you pass the example messages as if they were chat turns that had already occurred.

One can also experiment with how the AI interprets the “name” field, where you can use a name like “example”.

# base64_image1 and base64_image2 are base64-encoded image strings
multi_shot_messages = [
  {
    # example input: one text part plus two images
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "Perform the programmed vision task on two images"
      },
      {
        "type": "image_url",
        "image_url": {"url": f"data:image/jpeg;base64,{base64_image1}", "detail": "low"}
      },
      {
        "type": "image_url",
        "image_url": {"url": f"data:image/png;base64,{base64_image2}", "detail": "high"}
      }
    ]
  },
  {
    # example output: the response format the AI should imitate
    "role": "assistant",
    "content": '{"similarity": 75, "commonality": "dogs"}'
  }
]

If you read the assistant response in the example, you can probably also figure out what task I had in mind…
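The example messages need base64 strings for each image. A minimal sketch of one way to produce them from local files follows; the helper names `to_data_url` and `image_part` are hypothetical, and falling back to JPEG when the file type cannot be guessed is an assumption.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str) -> str:
    """Read an image file and return it as a base64 data URL."""
    # Guess the MIME type from the file extension; default to JPEG (assumption).
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        mime = "image/jpeg"
    encoded = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"


def image_part(path: str, detail: str = "low") -> dict:
    """Build one image content part in the same shape as the example messages."""
    return {
        "type": "image_url",
        "image_url": {"url": to_data_url(path), "detail": detail},
    }
```

Calling `image_part("dog1.jpg")` (a hypothetical file name) would give a dictionary you can drop straight into a message's "content" list.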

Then, to build your final API call:

messages = system + multi_shot_messages + history + user_input
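Here is a rough sketch of that concatenation as runnable code, with the optional "name" tag applied to the example turns. The function `build_messages`, the system prompt wording, and the placeholder base64 strings are all assumptions for illustration, and the commented-out request assumes the `openai` Python package with a vision-capable model of your choice.

```python
def build_messages(system_prompt, few_shot, history, user_content, example_name=None):
    """Concatenate system + few-shot examples + history + the new user input."""
    system = [{"role": "system", "content": system_prompt}]
    if example_name:
        # Optionally tag each demonstration turn with a "name" (copies, no mutation).
        few_shot = [{**m, "name": example_name} for m in few_shot]
    user_input = [{"role": "user", "content": user_content}]
    return system + few_shot + history + user_input


# Placeholder data URLs; real ones would hold your base64-encoded images.
few_shot = [
    {"role": "user", "content": [
        {"type": "text", "text": "Perform the programmed vision task on two images"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,AAAA", "detail": "low"}},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,BBBB", "detail": "low"}},
    ]},
    {"role": "assistant", "content": '{"similarity": 75, "commonality": "dogs"}'},
]

messages = build_messages(
    # Hypothetical system prompt matching the JSON reply in the example.
    system_prompt="Compare the two images. Reply with JSON: similarity (0-100) and commonality.",
    few_shot=few_shot,
    history=[],
    user_content=[
        {"type": "text", "text": "Perform the programmed vision task on two images"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,CCCC", "detail": "low"}},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,DDDD", "detail": "low"}},
    ],
    example_name="example",
)

# Sending the request (needs the openai package and an API key):
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(
#     model="gpt-4-vision-preview", messages=messages, max_tokens=300
# )
# print(response.choices[0].message.content)
```

Keeping the example turns in their own list makes it easy to add or drop demonstrations and compare how closely the replies follow the pattern.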
