How to Get ChatGPT to Describe Images Served by My Server?

I’m using an Express.js server along with an OpenAPI (Swagger) YAML configuration to serve images to ChatGPT. My goal is for ChatGPT to provide descriptions of these images, similar to how it describes images attached in a chat. However, I’m facing difficulties in getting ChatGPT to recognize and describe the images served by my server. Here’s a brief overview of my setup:

  1. The Express.js server sends each image as binary data, as a URL, or as a Base64-encoded string.
  2. The OpenAPI YAML describes the endpoints that generate and serve the images.
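
For reference, here is a simplified sketch of the three variants from item 1. It is illustrative rather than my exact code: the route paths, the `imageBase64` field name, `BASE_URL`, and the placeholder bytes are all assumptions made for this example, and `app` is assumed to be an Express application.

```javascript
// Placeholder bytes: only the 8-byte PNG signature, not a complete image.
// A real handler would send an actual PNG file or generated image here.
const png = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical public address of the server.
const BASE_URL = "https://example.com";

// Registers one route per variant on an Express-style app object.
function registerImageRoutes(app) {
  // Variant 1: raw binary body with Content-Type image/png.
  app.get("/image/binary", (req, res) => {
    res.type("image/png").send(png);
  });

  // Variant 2: JSON body that only carries a URL pointing at the image.
  app.get("/image/url", (req, res) => {
    res.json({ imageUrl: `${BASE_URL}/image/binary` });
  });

  // Variant 3: JSON body with the image bytes encoded as Base64.
  app.get("/image/base64", (req, res) => {
    res.json({ imageBase64: png.toString("base64") });
  });
}
```

With Express installed, this would be wired up with something like `const app = require("express")(); registerImageRoutes(app); app.listen(3000);`.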

How can I ensure that ChatGPT correctly receives and describes the images provided by my server? Is there a specific format or method I should use to make the images recognizable to ChatGPT for description?

I have tried different response schemas, such as:

      responses:
        "200":
          content:
            image/png:
              schema:
                type: string
                format: binary

A JSON response that carries the image URL instead:

      responses:
        "200":
          content:
            application/json:
              schema:
                type: object
                properties:
                  imageUrl:
                    type: string
                    format: uri
                    description: URL of the image in PNG format

ChatGPT also fails to render some images, which is discussed at length in another topic, but that is not my main issue here.