How to make a custom GPT process an action whose response is an image

I am trying to make a custom GPT that will write poems about images returned by the actions available to it. I am using the API.

I tried to create an action with this schema:

{
  "openapi": "3.0.0",
  "info": {
    "title": "Random Cat Image API",
    "version": "1.0.0",
    "description": "API that returns a random cat image"
  },
  "servers": [
    {
      "url": ""
    }
  ],
  "paths": {
    "/cat": {
      "get": {
        "summary": "Get a random cat image",
        "description": "Returns a random cat image",
        "operationId": "getRandomCat",
        "responses": {
          "200": {
            "description": "A random cat image",
            "content": {
              "image/jpeg": {
                "schema": {
                  "type": "string",
                  "format": "binary"
                }
              }
            }
          }
        }
      }
    }
  }
}

The endpoint returns a random cat image.

But every time I try it, the action fails. Am I making some kind of mistake in the schema, or do actions currently just not support image/jpeg as a response?


I hope it’s on the roadmap, as this would open up GPTs to be what they should be: not custom one-shot prompt templates with RAG, but a natural language interface to all types of external API calls (including images).

I’m in the same place. I’m trying to fetch a dummy image as a test, but the debugging response shows an empty JSON object, “{}”.

Maybe actions can only receive other formats, such as JSON or text.
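If that guess is right, one thing to try would be declaring the 200 response as application/json instead of image/jpeg. Here is a rough sketch in Python that builds such a variant of the schema. Whether actions accept it is untested here, and the `image` property name, the `build_json_cat_schema` helper, and the example.com server URL are all made up for illustration.

```python
import json


def build_json_cat_schema(server_url: str) -> dict:
    """Build an OpenAPI 3.0 document whose /cat response is JSON, not image/jpeg.

    The "image" property is a hypothetical field name; it is not taken from
    any official actions documentation.
    """
    return {
        "openapi": "3.0.0",
        "info": {
            "title": "Random Cat Image API",
            "version": "1.0.0",
            "description": "API that returns a random cat image",
        },
        "servers": [{"url": server_url}],
        "paths": {
            "/cat": {
                "get": {
                    "summary": "Get a random cat image",
                    "operationId": "getRandomCat",
                    "responses": {
                        "200": {
                            "description": "A random cat image wrapped in JSON",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "object",
                                        "properties": {
                                            # Hypothetical field holding the image as a string
                                            "image": {"type": "string"}
                                        },
                                        "required": ["image"],
                                    }
                                }
                            },
                        }
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # Placeholder server URL; replace with the real endpoint.
    print(json.dumps(build_json_cat_schema("https://example.com"), indent=2))
```

The printed JSON could then be pasted into the action editor in place of the image/jpeg version.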

Has anyone tried returning base64 or a link inside JSON?
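To make the two options concrete, here is a minimal sketch of what the server side could return in each case, using only the Python standard library. The field names (`mime_type`, `image_base64`, `image_url`) and the helper functions are assumptions for illustration, and nothing in this thread confirms that a GPT action can actually use either payload.

```python
import base64
import json


def wrap_as_base64(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Return a JSON body carrying the image as a base64 string (hypothetical field names)."""
    payload = {
        "mime_type": mime_type,
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }
    return json.dumps(payload)


def wrap_as_link(image_url: str) -> str:
    """Return a JSON body that only points at the image (hypothetical field name)."""
    return json.dumps({"image_url": image_url})


def unwrap_base64(body: str) -> bytes:
    """Recover the raw image bytes from a body produced by wrap_as_base64."""
    return base64.b64decode(json.loads(body)["image_base64"])


if __name__ == "__main__":
    fake_jpeg = b"\xff\xd8\xff\xe0 not a real image"
    body = wrap_as_base64(fake_jpeg)
    # The round trip should give back exactly the bytes that went in.
    print(unwrap_base64(body) == fake_jpeg)
    print(wrap_as_link("https://example.com/cat.jpg"))
```

The link variant keeps the response small, while base64 inflates the payload by roughly a third, so the link is probably the cheaper thing to test first.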

I’ll do some testing myself now and see if I can crack it.

This got me further, but I hit another error loading the plugin, presumably PIL-related. Maybe there’s another way.