I’m trying to make a custom ChatGPT that generates some web content, then puts it in a .zip to be uploaded to a server I’m running. I haven’t had any luck getting ChatGPT to cooperate: it keeps leaving the file data itself out of the request and sending an empty object instead.
I’ve tried this for my request body schema in the ChatGPT action:
requestBody:
  required: true
  content:
    application/zip:
      schema:
        type: string
        format: binary
        description: ZIP file containing website files
As well as this:
requestBody:
  required: true
  content:
    multipart/form-data:
      schema:
        type: object
        properties:
          file:
            type: string
            format: binary
            description: ZIP file containing website files
I am able to successfully upload example files via curl, both as multipart and as raw binary data in the request body, and my backend server is seeing the requests come in from ChatGPT, but it is getting empty data to process.
If anyone has had any luck with this, some pointers would be appreciated. Thanks!
I’ve also been trying to do this with plain PNG image files and can’t get it to send anything other than an empty object.
I may test this out with a test JSON request body, and if I can get ChatGPT to send that correctly, maybe work around this by putting a base64 string of the binary file in the JSON instead. But I would really rather not have to do that.
I tried the base64 thing but couldn’t get it to send. Let me know if you have any luck with it. I also tried using code interpreter to save to S3 bucket but no go
So I figured out a way to do this. Caveat: I’ve only tested this for small-ish files - this may be clunky or slow for large files, or just not work at all.
Basically I got around the model’s tendency to truncate base64 strings by making an endpoint on my backend that will accept sequences of base64-encoded strings and reassemble them and save the file.
I created an action with a schema that looks like this:
/upload-part:
  post:
    summary: Upload a part of a file
    operationId: uploadPart
    description: Accepts a part of a file encoded in base64, along with its metadata.
    requestBody:
      required: true
      content:
        application/json:
          schema:
            type: object
            properties:
              part:
                type: string
                description: Base64 encoded part of the file.
              index:
                type: integer
                description: Index of the current part in the sequence.
              total:
                type: integer
                description: Total number of parts in the file.
              id:
                type: string
                description: Unique identifier for the upload session.
              filename:
                type: string
                description: The filename of the file that will be saved once all parts are received.
            required:
              - part
              - index
              - total
              - id
              - filename
    responses:
      '200':
        description: Successfully received the file part.
        content:
          text/plain:
            schema:
              type: string
      '400':
        description: Invalid input received.
I’m using ‘id’ as sort of a namespace here… it can refer to a particular user, project, etc. that the assembled file should be associated with.
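For anyone wanting to try the backend side, here is a minimal sketch of how the reassembly could work. This is not my actual server code: the function name `receive_part`, the in-memory `_sessions` dict, and the choice of 0-based `index` values are all assumptions for illustration (the schema itself does not say where indexing starts). It keeps everything in memory and returns the assembled bytes instead of writing to disk.

```python
import base64

# Hypothetical in-memory store of pending uploads, keyed by (id, filename).
# A real backend would likely persist parts and expire stale sessions.
_sessions = {}


def receive_part(payload):
    """Handle one uploadPart JSON body.

    Returns (status_code, message, assembled_bytes_or_None).
    Assumes `index` is 0-based, which the schema does not specify.
    """
    try:
        session_id = payload["id"]
        filename = payload["filename"]
        index = int(payload["index"])
        total = int(payload["total"])
        # validate=True rejects characters outside the base64 alphabet
        chunk = base64.b64decode(payload["part"], validate=True)
    except (KeyError, TypeError, ValueError):
        return 400, "Invalid input received.", None

    if total < 1 or not 0 <= index < total:
        return 400, "Invalid input received.", None

    key = (session_id, filename)
    session = _sessions.setdefault(key, {"total": total, "parts": {}})
    if session["total"] != total:
        # The part count changed mid-upload, so something is off.
        return 400, "Invalid input received.", None

    # Parts may arrive out of order or be re-sent; the last copy wins.
    session["parts"][index] = chunk

    if len(session["parts"]) < total:
        return 200, f"Received part {index + 1} of {total}.", None

    # All parts are present: join them in index order and clear the session.
    data = b"".join(session["parts"][i] for i in range(total))
    del _sessions[key]
    return 200, f"Assembled {filename} ({len(data)} bytes).", data
```

Wiring this into a web framework is just a matter of passing the parsed JSON body in and writing the returned bytes to wherever `id` and `filename` should map.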
From here you can just craft your custom GPT with instructions on how to generate and split apart the file and use the uploadPart action to send each one in sequence.
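As a rough idea of what the splitting step could look like if the GPT runs it in code interpreter, here is a sketch. The function name `make_parts`, the default `chunk_size`, and 0-based indexing are my assumptions, not something the action requires; the output is a list of JSON-ready bodies matching the uploadPart schema.

```python
import base64


def make_parts(data, upload_id, filename, chunk_size=300):
    """Split raw bytes into request bodies for the uploadPart action.

    chunk_size is in raw bytes (before base64 encoding); the default is an
    arbitrary guess, not a known limit. Indexes are 0-based by assumption.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    # Slice the raw bytes first, then encode each slice on its own, so every
    # part decodes independently on the server.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        chunks = [b""]  # an empty file still needs one part

    total = len(chunks)
    return [
        {
            "part": base64.b64encode(chunk).decode("ascii"),
            "index": index,
            "total": total,
            "id": upload_id,
            "filename": filename,
        }
        for index, chunk in enumerate(chunks)
    ]
```

Encoding each slice separately (rather than encoding the whole file and cutting the base64 string) means a part boundary never falls in the middle of a base64 group.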
This is going to require a lot more sanity checking and error handling than what I’ve put together so far, but it at least works as a proof of concept.
Have you had any success uploading a 512x512 image in base64 with this splitting technique? Or is this already stretching the envelope?
I attempted this, and had it working. The issue is the GPT starts getting confused with keeping track of the chunks. I did use it to submit files that were beyond the capacity of a single request, but even when I got to like 6kb files it started corrupting the base64 chunks.
That could be something that can be dealt with via instructions, potentially, but nothing I tried worked. Larger files tend to crash the code analysis, then it tries again, and has a habit of just yoloing random sequences of like YuYuYuYuYuYu for all chunks beyond the first 2 or 3.
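One thing worth noting is that junk like YuYuYuYu is still valid base64, so a decode check on the server will not catch it. A possible way to at least detect the corruption is to compare a hash of the assembled file against one computed before splitting. This is only a sketch: the `matches_digest` helper and the idea of sending a SHA-256 hex digest are hypothetical (the schema in this thread has no such field), and it only helps if the GPT reports the digest that its code actually computed rather than making one up.

```python
import hashlib


def matches_digest(assembled, expected_sha256_hex):
    """Return True if the assembled bytes hash to the expected SHA-256 digest.

    expected_sha256_hex is a hypothetical extra value the client would send
    alongside the parts; it is not part of the schema posted above.
    """
    actual = hashlib.sha256(assembled).hexdigest()
    return actual == expected_sha256_hex.strip().lower()
```

If the check fails, the backend can reject the upload instead of saving a corrupted file, though it cannot tell which chunk went wrong.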
OMG, so it is not quite realistic to expect that we will be able to pass pictures in base64 to our APIs… I wonder whether the workaround is in the conversation, i.e. "click here to upload your picture" (to our server), and then the output is an id that people can refer to when they want to see the output of the API transformation.
Hmm, like the GPT gives a link to an external webpage where you upload your image? No reason why that wouldn’t work. At that point what’s the GPT doing for you though? It likely can’t receive the file via your API either, due to limits to the response size (I tried this with a 97kb file).
I really think we just need to hope that they increase the limits at some point, and accept small file limitations for now… much to my chagrin.
I use another model to apply advanced transformation to my picture… And I use chatgpt as the ux. Am I reading you correctly: even if that works, I won’t be able to send the picture back to chatgpt? Due to size limit?
I only tested it with the one 97kb image, so it might work with smaller ones, but yes you essentially run into the same issue as file uploading, but you can receive more in a response than you can send in a request. Not enough though, really. I guess it’d have to store the entire image in its context window, which is probably why there’s a limit.
Got it, so basically I’ll bypass all of that by passing links to the user.
Many thanks