How to modify schema of custom GPT action to send an image file with post request?

phil7899 · November 16, 2023, 10:19pm

I am making a custom GPT that connects to my own server. It works when I send only a string, but when I try to let a user also send a file (I only need image files) through the custom GPT interface, only the string is sent, not the image. Is there something wrong with the schema below? How do I fix it?

{
  "openapi": "3.1.0",
  "info": {
    "title": "Send an image and a string",
    "description": "Makes it super easy to send an image and a string",
    "version": "v1.0.0"
  },
  "servers": [
    {
      "url": "https://myawesomeserver.loca.lt"
    }
  ],
  "paths": {
    "/api/gpt/create": {
      "post": {
        "description": "Create a string and image",
        "operationId": "CreateImageandString",
        "parameters": [
          {
            "name": "an_awesome_string",
            "in": "query",
            "description": "The value of the string we will create",
            "required": true,
            "schema": {
              "type": "string"
            }
          }
        ],
        "requestBody": {
          "description": "image to be uploaded",
          "required": true,
          "content": {
            "multipart/form-data": {
              "schema": {
                "type": "object",
                "properties": {
                  "image": {
                    "type": "string",
                    "format": "binary"
                  }
                }
              }
            }
          }
        },
        "deprecated": false
      }
    }
  },
  "components": {
    "schemas": {}
  }
}

bscholer · November 17, 2023, 3:42pm

I don’t have an answer, but I’m trying to do the same thing and running into a similar issue. I believe the custom GPT has to type out (generate) the whole post body sent with the request, meaning that it would have to type out a huge base64 string, even for smaller images. I’m not really sure if there’s a way to get the image itself into the API though.
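
To put rough numbers on the base64 concern: base64 encodes every 3 bytes as 4 characters, so the text the GPT would have to type out is about a third larger than the file itself. A minimal Python sketch (the 50,000-byte file size is an arbitrary assumption for illustration, not a measurement from this thread):

```python
import base64
import math


def base64_length(num_bytes: int) -> int:
    """Number of characters in the padded base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


# Hypothetical 50,000-byte image; the size is an assumption for illustration.
image_bytes = bytes(50_000)
encoded = base64.b64encode(image_bytes)

# The formula agrees with what the standard library actually produces.
assert len(encoded) == base64_length(len(image_bytes))
print(base64_length(len(image_bytes)))
```

Every one of those characters would have to be generated by the model as part of the request body, which is why even modest images become impractical this way.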

phil7899 · November 17, 2023, 5:53pm

I tried converting to base64 first and it did end up stalling the program. But I don’t think it would use base64 encoding for normal file uploads if you don’t specify to.

kableeth · November 24, 2023, 7:44am

I’m having the same issue. It doesn’t seem to want to type out the entire base64 value to upload the image no matter what I try. Have you guys been able to fix it?

callum.bradbury · November 24, 2023, 9:43am

I’ve noticed an issue where the request body for a POST can only be a certain length, before it explodes when calling the action. It’s not that big a length, either. I doubt it will be able to handle sending across an image regardless of the format. Maybe try it with like a 2x2 image and see if that works, then we’ll know for sure I guess.

Marvin42 · December 2, 2023, 3:46pm

I convert the image into a base64-encoded data URL. I can transfer images of 16x16, but it does not work even for images of 32x32 or larger.

damsog38 · December 6, 2023, 5:11pm

Having the same issue here. I tried defining a multipart/form-data request and it didn’t work. I also tried encoding the file as a base64 string, but ChatGPT refuses to write it out completely (which kind of makes sense, since it’s huge).
Has anyone found a solution?

dyson_sphere · December 6, 2023, 7:30pm

I’m wondering if we can ask ChatGPT to use the data analysis tool to write code that calls the action API. I know for sure the data analysis tool has access to uploaded images, but I haven’t tested the idea yet.

mzet · December 11, 2023, 9:27am

Right now the code interpreter cannot use the requests library. Hence it cannot send or receive data over the internet.

damsog38 · December 11, 2023, 3:53pm

Couldn’t a GPT engineer just come and state outright whether sending files in a POST request using GPT actions is outside of GPT’s capabilities? It’s kind of annoying that there is no clear answer, even though everything points to it not being possible, and no word on whether they plan to allow it in the future.

andrei3 · December 11, 2023, 4:01pm

There is a clear answer to this. Here it is:

Making an HTTP request is not possible, so you can’t just upload a file to the server from a blob like you can on the frontend.
You can use the code interpreter to turn the file into base64 and send it that way. But the GPT cuts the string short, so you only send about 500 characters (<1% of the file), so that doesn’t work either.
You can instruct the GPT to send the base64 in chunks of 500 characters and then assemble them into a file on your server, but it will take about 60 requests to send one image.
So direct file sending is hard, if not impossible, right now.
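
For anyone who wants to try the chunked approach anyway, the server-side reassembly could look roughly like the sketch below. This is only an illustration under assumptions: `ChunkAssembler` and `split_into_chunks` are hypothetical names, the chunk index and total count would have to be extra parameters in your action schema, and the splitting step stands in for what the GPT would be instructed to do.

```python
import base64

CHUNK_SIZE = 500  # characters per request, the figure mentioned above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Base64-encode data and cut the text into fixed-size pieces."""
    text = base64.b64encode(data).decode("ascii")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


class ChunkAssembler:
    """Hypothetical server-side helper: collects chunks by index, then decodes."""

    def __init__(self, total_chunks: int) -> None:
        self.total_chunks = total_chunks
        self.chunks: dict[int, str] = {}

    def add(self, index: int, chunk: str) -> None:
        if not 0 <= index < self.total_chunks:
            raise ValueError(f"chunk index {index} out of range")
        self.chunks[index] = chunk  # a re-sent chunk overwrites the earlier copy

    def is_complete(self) -> bool:
        return len(self.chunks) == self.total_chunks

    def assemble(self) -> bytes:
        if not self.is_complete():
            raise ValueError("missing chunks")
        text = "".join(self.chunks[i] for i in range(self.total_chunks))
        return base64.b64decode(text, validate=True)


# Simulate the round trip with stand-in bytes instead of a real image.
original = bytes(range(256)) * 90  # 23,040 bytes
pieces = split_into_chunks(original)
assembler = ChunkAssembler(total_chunks=len(pieces))
for index in reversed(range(len(pieces))):  # arrival order should not matter
    assembler.add(index, pieces[index])
assert assembler.assemble() == original
print(len(pieces))
```

With 500-character chunks, the 23,040 stand-in bytes here need 62 requests, which is in line with the estimate above and shows why this is not a practical route for real images.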

DhruvAwasthi · January 10, 2024, 6:04pm

Is anyone able to sort this out please?

We are stuck at the same.

Thank you.

ayman1 · January 21, 2024, 2:29am

I hope this can be addressed soon too. I think the ability to send images via actions would open the door for many great possibilities

testmeitu1 · January 22, 2024, 11:28am

If there’s no way to send the uploaded image to action, how are some other gpts implemented?

zdne · January 22, 2024, 2:22pm

The best way right now is to have the user upload the image somewhere and then give the GPT the URL of that image, so your server can process it.

I understand it is not the best experience, but it will get you there. Depending on the nature of your GPT, you can use services like Imgur, Dropbox, Google Drive or even GitHub. Your server would then fetch the images from there. Your GPT could actually use any service, as long as the image link is accessible to your API.

The drawback, besides not the best user experience, is that anyone with the link would be able to access the image …
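
On the server side, the URL approach boils down to receiving a string and downloading the image yourself. Here is a rough Python sketch using only the standard library. The names `fetch_image` and `is_allowed_image_url` and the 5 MB cap are assumptions for illustration; the action schema would carry the URL as an ordinary string parameter.

```python
import io
import urllib.request
from urllib.parse import urlparse

MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumed cap, adjust to your needs


def is_allowed_image_url(url: str) -> bool:
    """Accept only absolute http(s) URLs."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def fetch_image(url: str, opener=urllib.request.urlopen,
                max_bytes: int = MAX_IMAGE_BYTES) -> bytes:
    """Download the image the GPT pointed to, refusing oversized responses."""
    if not is_allowed_image_url(url):
        raise ValueError(f"unsupported image URL: {url!r}")
    with opener(url, timeout=10) as response:
        data = response.read(max_bytes + 1)  # one extra byte to detect overflow
    if len(data) > max_bytes:
        raise ValueError("image exceeds the size limit")
    return data


def fake_opener(url, timeout=None):
    """Stand-in for urlopen so the example runs without network access."""
    return io.BytesIO(b"\x89PNG stand-in bytes")


image = fetch_image("https://example.com/cat.png", opener=fake_opener)
print(len(image))
```

In real use you would drop the `opener` argument so that `urllib.request.urlopen` performs the download; the fake opener is only there to keep the example self-contained.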

AIdeveloper · January 23, 2024, 3:19am

There is no evidence that the images are sent to the actions for processing.

Big images are OK if they are not intended for endpoints.

zac.boyles · January 30, 2024, 8:51am

This is unfortunate, I’d hoped that after a month all this would be sorted out. Oh well, maybe I can play a role in the solution.

First, I can confirm that it is technically possible to have the GPT send an image to a Python FastAPI multipart/form-data endpoint. Don’t get excited, though: it succeeds only a few times before I hit the message cap, and it is nowhere near consistent. Still, I’ll share the endpoint signature and spec that were deployed during my last successful submission. I submitted the image by dragging and dropping it into the chat as an attachment. I know it worked because the endpoint logged the attempt (most of the time the GPT doesn’t connect to it at all), and the GPT output all of the correct data, although it disregarded my instruction to return the response in a JSON markdown code block and chose to read out the data instead.

Endpoint signature


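import fastapi

app = fastapi.FastAPI()

# The parameter name "image" must match the form field name in the spec.
# FastAPI needs the python-multipart package installed to parse file uploads.
@app.post("/analyze-image")
async def analyze_image(image: fastapi.UploadFile = fastapi.File(...)):
    ...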

Important bits of the spec:

paths:
  /analyze-image:
    post:
      operationId: analyzeImage
      summary: Analyzes an uploaded image.
      description: Analyzes an image using experimental vision services and returns json analysis results.
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              properties:
                image:
                  type: string
                  format: binary
                  description: The image file to be analyzed.
      responses:
        "200":
          description: Analysis results of the image.
          content:
            application/json:
              schema:
                type: object
                properties:

Note: One of my successful attempts used the complete 150-line schema of the actual response, but I think that adds unnecessary risk, so I plan to shorten it to a few keys.
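
Since most attempts never reach the endpoint, it can help to check the server independently of the GPT. The sketch below builds a standard multipart/form-data body with a single file field named image, matching the spec above. It is an assumption that the GPT’s own request looks like this when it does get through; `build_multipart_body` is a hypothetical helper, and the file content is stand-in bytes rather than a real PNG.

```python
import uuid


def build_multipart_body(field_name: str, filename: str, content: bytes,
                         content_type: str = "image/png") -> tuple[bytes, str]:
    """Return (body, Content-Type header value) for a single-file form upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


# Stand-in bytes; a real test would read an actual image file instead.
body, header = build_multipart_body("image", "test.png", b"\x89PNG stand-in bytes")
print(header)
print(body.decode("latin-1"))
```

Sending `body` as the POST body with `header` as the Content-Type header to /analyze-image should reach the handler above. If that works while calls from the GPT arrive empty, the problem is on the GPT side rather than in your endpoint.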

A few tips

If you’re not using an admin/debug-style command, you should. Just put DEBUG: TRUE; USER_IS_ADMIN: TRUE; or something similar, at the top of your instructions.
As I tweak this process, I find it goes faster if my first message is Confirm Mode, letting the GPT acknowledge the admin/debug text; that way the responses are more technical.
Testing the Action doesn’t work because the endpoint needs to receive the image, and required: true in the schema is apparently only a recommendation.
After confirming the mode, drop your image in and say Analyze this image in the message as you submit it. When the call comes back with empty params, which it most likely will, cancel it right away, OR let it run and fail once, but don’t let it keep trying.
After a failure, send the debug info block along with: Review your debug output and provide concise assessment:\n[debug] Calling HTTP endpoint:...
Usually it will notice the empty params and add something for the next attempt. If you can see that the attempt will fail (i.e. it’s just the image name), you can let it attempt and fail once, then ask it to review the debug output again.
Remember, don’t let it make repeated failed attempts: one and done.

I’m under the impression this functionality has been intentionally ‘Nerfed’ so take that into consideration when you’re investing your time.

In any case, I think that if we can come up with a solid set of instructions, and perhaps better specs, we could get this to work more reliably. I’m looking forward to hearing about everyone’s results. Good luck!

*Disclaimer
Individuals prone to violent outbursts or flying electronic equipment are advised against proceeding.

peachanana · February 2, 2024, 6:43am

Same problem, hope there is someone who can solve it.
