How to do few-shot prompting interweaving text and images with gpt-4-vision-preview, as seen in "The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)"?

supershaneski · January 23, 2024, 11:46pm

For example, taking this as a reference:

Perhaps you can format the message like this:

messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "In the graph, which year has the highest average gas price for the month of June?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://.../national_gas_price_comparison_2016-2019.jpg",
          },
        },
        {"type": "text", "text": "This graph is a line plot for national gas price comparison from 2016 until 02/04/2019. The legend on top shows the line color of each year, red (2019), blue (2018), green (2017) and orange (2016). Since the data is reported until Feb. 2019, only 3 years have datapoints for the month of June, 2018 (blue), 2017 (green) and 2016 (orange). Among them, blue line for 2018 is at the top for the month of June. Hence, the year with the highest average gas price for the month of June is 2018."},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://.../national_gas_price_comparison_2015-2018.jpg",
          },
        },
        {"type": "text", "text": "This graph is a line plot for national gas price comparison from 2015 until 12/10/2018. The legend on top shows the line color of each year, red (2018), orange (2017), green (2016) and orange (2017). Since the data is reported until Dec. 2018, all 4 years have datapoints for the month of June. Among them, red line for 2018 is at the top for the month of June. Hence, the year with the highest average gas price for the month of June is 2018."},
        ...
      ],
    }
],

Previously, I thought that the order of the images and text within the same message content entry did not matter, but it seems to be relevant.
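If it helps, here is a minimal sketch of a helper that builds this kind of interleaved content list, so that each example image is immediately followed by its worked answer and the image you actually want answered comes last. The function name `build_few_shot_messages`, its parameters, the `example.com` URLs, and the short answer strings are all made up for illustration; only the `text` / `image_url` part shapes are taken from the snippet above.

```python
from typing import Dict, List, Tuple


def build_few_shot_messages(
    question: str,
    examples: List[Tuple[str, str]],
    query_image_url: str,
) -> List[Dict]:
    """Build one user message whose content alternates image and text parts.

    `examples` is a list of (image_url, worked_answer) pairs. The order of
    the parts is preserved exactly as appended, since ordering seems to matter.
    """
    # Start with the question that applies to every image.
    content: List[Dict] = [{"type": "text", "text": question}]

    # Each few-shot example: the image first, then its worked answer.
    for image_url, worked_answer in examples:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
        content.append({"type": "text", "text": worked_answer})

    # The final image has no answer; the model is expected to supply it.
    content.append({"type": "image_url", "image_url": {"url": query_image_url}})

    return [{"role": "user", "content": content}]


# Hypothetical placeholder URLs and answers, for illustration only.
messages = build_few_shot_messages(
    question="In the graph, which year has the highest average gas price for the month of June?",
    examples=[
        ("https://example.com/example_graph_1.jpg", "Worked answer for the first example graph."),
        ("https://example.com/example_graph_2.jpg", "Worked answer for the second example graph."),
    ],
    query_image_url="https://example.com/query_graph.jpg",
)

# Show the resulting order of content parts.
print([part["type"] for part in messages[0]["content"]])
```

The resulting `messages` list could then be passed as the `messages` argument of a chat completions request with `model="gpt-4-vision-preview"`. I have not measured how much this ordering helps, so treat it as a starting point rather than a confirmed recipe.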
