Agents that can generate images

dexter.mancuso · April 27, 2025, 9:38pm

I’m using the Agents SDK and was wondering how I could get an agent to generate an image. You can’t set the model of an agent to an image gen model like dall-e-3 right? So I need to make a function tool like:

@function_tool
async def generate_image(prompt: str) -> str:
  # do image gen here using something like openai.image.generate
  return image_url

and then have the Agent use that tool. And would I also have a custom output class like:

class ImageResult(BaseModel):
    url: str

and then output_type=ImageResult? Would I need to nudge the agent in the instructions to return the image url to the user?

_j · April 27, 2025, 9:46pm

You could be supplying an image URL link (one of your own, not OpenAI’s expiring link of dall-e-3 URL response) and hope the AI repeats it successfully, or makes a useful markdown link.

Better would be to seamlessly display those images in a user interface, and have the interactivity there.

Functions provide a service to the AI; it must find them useful for a task. “make_AI_images(prompt)” - pretty easy to understand the utility.

The function cannot return images. You would just return “1 image successfully generated and displayed for user” or a message that produces the needed AI language.

“Make an Image” is usually the end of any agentic path. You can stop the flow at that point. Then include the tool response when you are adding the next user input, just to let the AI know it was successful.

dexter.mancuso · April 27, 2025, 10:05pm

You can’t return the generated image’s url in a function tool? It seems like one of the purposes of these function tools is to interact with external APIs. In the Agents SDK Tools documentation, one of the examples is fetching weather data from an external weather API and returning the result.

Am I missing something here? Is what I proposed in my original post not feasible, not the intention of function tools or just bad practice?

I would expect the agent’s output result would be {“url”: “https://theurlofthegenerateddalle3image.png”}

_j · April 27, 2025, 10:25pm

Sure, you can have the AI write “I just made you an image. Now you have to click the link. That is, if you are even using a web browser and not a sandboxed app. Hope that I repeated it back correctly for you.”

It simply isn’t a good user interface.

Here’s the playground doing that.

maggiegeorges254 · June 22, 2025, 7:49am

Hello, I’m having a challenge in image generation, the agent yes will generate using dalle3, but the image doesn’t relate to my data of images. Like I need it to use my images to generate posters and images that I want. How do I acheive this please?

Topic		Replies	Views
How do you make an assistant generate pictures after collecting information about the user? Prompting chatgpt , assistants	5	2043	August 23, 2024
How to Implement DALL-E Graphical Responses in Playground AI Assistant? API assistants-api	2	309	November 19, 2024
How to display an image generated by assistant? API image-generation , assistants , assistants-api	3	2031	June 21, 2025
Is it possible to get the Assistants API to generate images API	4	4309	June 22, 2024
How to upload generated image back into context using Agents SDK? API image-reading	1	219	June 27, 2025

Agents that can generate images

Related topics