Image response from an MCP server with the Agents SDK

Hi, I have an agent based on the openai-agents framework and I’m using an MCP server that returns an image when called.

Below is a snippet from the MCP server. types.ImageContent comes from the mcp library, which I assume is the standard way to return an image.

png = base64.b64encode(png).decode("utf-8")
return [types.ImageContent(type="image", data=png, mimeType="image/png")]

The problem is that the agent/model does not interpret this as an image; it treats it as a plain text string. Because the PNG is passed along as a base64-encoded string, token usage jumps to around 100k.
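To make the size of the problem concrete, here is a small sketch of how much text the base64 step produces. The helper name base64_length and the 300,000-byte image size are made up for illustration; only the 4-characters-per-3-bytes rule is a property of base64 itself. How many tokens that text becomes depends on the tokenizer, so no token count is claimed here.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Number of characters base64 produces for n_bytes of input (with padding)."""
    # base64 emits 4 output characters for every group of up to 3 input bytes.
    return 4 * ((n_bytes + 2) // 3)


# Hypothetical image size, used only for illustration.
raw = bytes(300_000)
encoded = base64.b64encode(raw).decode("utf-8")

# The encoded string is about a third longer than the raw bytes,
# and all of it is tokenized if the model receives it as text.
print(len(raw), len(encoded), base64_length(len(raw)))
```

If the payload were passed as an actual image input instead, it would not go through the text tokenizer at all.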

I believe there should be an adapter that converts an image from the MCP-defined format, i.e. types.ImageContent, to the format OpenAI models accept. Is there anything like that? I searched extensively but couldn't find any lead.
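For illustration, here is a minimal sketch of what such an adapter could look like. It is not an existing SDK function: to_input_image is a hypothetical name, and the ImageContent dataclass is a stand-in that only mirrors the three fields used in the snippet above, so the example runs without the mcp package. The output shape assumes the Responses-API-style image part ({"type": "input_image", "image_url": <data URL>}); whether the Agents SDK accepts such a part as a tool result is exactly the open question.

```python
from dataclasses import dataclass


@dataclass
class ImageContent:
    """Stand-in mirroring the fields of mcp.types.ImageContent used above."""
    type: str
    data: str       # base64-encoded image bytes
    mimeType: str


def to_input_image(content: ImageContent) -> dict:
    """Hypothetical adapter: wrap the base64 payload in a data URL image part."""
    if content.type != "image":
        raise ValueError(f"expected image content, got {content.type!r}")
    # A data URL carries the MIME type and the base64 payload in one string.
    data_url = f"data:{content.mimeType};base64,{content.data}"
    # Assumed target shape: an image input part rather than a text part.
    return {"type": "input_image", "image_url": data_url}


part = to_input_image(
    ImageContent(type="image", data="aGVsbG8=", mimeType="image/png")
)
print(part["type"], part["image_url"][:22])
```

The point of the conversion is that the model would receive the payload as an image part instead of as a long text string.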

Below is a screenshot from the traces.

Does OpenAI support images as responses from tools/functions?
