It is possible to have better performance by using few-shot prompting with image inputs and structured outputs?

lmaurelli · March 20, 2025, 4:14pm

I saw some samples with structured ouputs and few-shot prompting examples to enhance performances on text inputs. What about image inputs? Can I somehow tweak how/what the model see on a image by using few-shot prompting?

Also on the technical side, I searched the implementation of this use case, but I did not find a complete example yet. Digging into a sample implementation I found out that image inputs can be only user messages, while few-shot prompting makes use of system/developer messages. Is it possible technically?

Topic		Replies	Views
Gpt-4 vision few shot prompting with images API	3	3288	May 29, 2024
Few shots with multiple images API api , lost-user	1	134	January 28, 2025
How to add correct examples for image-to-text task Prompting gpt-4-vision	5	2163	December 29, 2023
Few-Shot Prompting with Structured Outputs Prompting gpt-4 , chatgpt , api	1	1008	December 7, 2024
How to do few-shot prompting interweaving text and images with Gpt-4-vision-preview as seen in "The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)"? API gpt-4 , api	6	2986	September 4, 2024

It is possible to have better performance by using few-shot prompting with image inputs and structured outputs?

Related topics