MCP: can a tool return an image (media) to the LLM?

Does the standard Responses API or Chat Completions API support returning the actual media content (like PNG/JPG image bytes/data) as the result of a tool/function call, instead of just a URL, so the LLM can directly see and analyze the file?

Not with internal MCP, but OpenAI recently, and only on the Responses API, started allowing images in the function return message that your own code sends back.
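A minimal sketch of what such a function return could look like, assuming the Responses API accepts a list of content parts (`input_text`, `input_image`) in the `output` field of a `function_call_output` item; check the current API reference before relying on the exact field names. The helper name, the call ID, and the placeholder bytes are made up for illustration. It only builds the item and makes no network call.

```python
import base64

# Hypothetical stand-in for real image bytes; in practice read your PNG/JPG file.
PLACEHOLDER_BYTES = b"not-a-real-image"


def build_image_tool_output(call_id: str, image_bytes: bytes, note: str,
                            mime_type: str = "image/png") -> dict:
    """Build a function_call_output item carrying text plus an inline image.

    Assumption: `output` may be a list of content parts rather than a plain
    string, and an image part takes a base64 data URL in `image_url`.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "function_call_output",
        "call_id": call_id,  # must match the call_id of the model's function call
        "output": [
            # AI-targeted text saying why the image is there
            {"type": "input_text", "text": note},
            {"type": "input_image",
             "image_url": f"data:{mime_type};base64,{encoded}"},
        ],
    }


item = build_image_tool_output(
    "call_123",  # hypothetical ID
    PLACEHOLDER_BYTES,
    "Screenshot produced by the tool; describe what it shows.",
)
```

You would append `item` to the `input` list of your next Responses request, after the model's own function call item.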

You can have functions, or function definitions picked up automatically by MCP subscription, that are actually serviced by an MCP server your own code makes the call to. Then you'd be able to send an image in the tool return (along with some AI-targeted text explaining why it's there) when the MCP service is configured to also transmit images.
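For reference, a rough sketch of the shape an MCP tool result with an image takes on the server side, assuming the MCP convention of a `content` list with `text` and `image` blocks (base64 `data` plus `mimeType`). The helper name and the placeholder bytes are hypothetical, and this builds only the result payload, not a working server.

```python
import base64


def build_mcp_image_result(image_bytes: bytes, caption: str,
                           mime_type: str = "image/jpeg") -> dict:
    """Build an MCP-style tool result with a text block and an image block."""
    return {
        "content": [
            # Text for the model explaining why the image is attached
            {"type": "text", "text": caption},
            {
                "type": "image",
                "data": base64.b64encode(image_bytes).decode("ascii"),
                "mimeType": mime_type,
            },
        ],
        "isError": False,
    }


# Placeholder bytes stand in for a real JPEG.
result = build_mcp_image_result(b"fake-jpeg-bytes", "Chart for the requested date range.")
```

Your own code can then take the `image` block from a result like this and repackage it into the function return it sends to the model.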


An example of images as tool input can be seen in ChatGPT, though it is impossible on the API because of the feature lockdown and container content lockdown: Code Interpreter can have images returned within reasoning, and the AI itself then loads, crops, and zooms to try to get a better view. That would be an absolute money-burner on the API; as an API developer you can deliver optimized, sliced images without any AI desperation and futility.