Given a description and two reference images, I want to generate a new stylized image. Which OpenAI API should be called for this?
The `client.images.edit` interface often returns a `504 Gateway Timeout`.
If there is no reference image and I generate from a prompt alone, calling `client.images.generate` never produces a `504 Gateway Timeout`, so the network itself seems fine. Stranger still, when I start the project locally in PyCharm, both `client.images.generate` and `client.images.edit` work without errors. It is only when I call `client.images.edit` in the online test environment that I get the 504.
Is my API call incorrect?
For the task of inputting text plus multiple images and outputting a new image, which OpenAI API should I call?
Which API endpoint: image edits
Which model: gpt-image-1 (org ID verification required)
Timeout: at least 5 minutes (generation can be long)
Host: one that doesn’t timeout idle connections at 60 seconds
Status 504: a middleman is closing the connection on you (gateway/proxy)
If using the OpenAI SDK, it should have a high enough timeout by default.
ok thank you
When inputting text and multiple images, I used the `client.images.edit` interface, which you say is correct. For online timeouts, is asynchronous operation the common approach in production? Does the front end keep polling for the result, or is it just a matter of raising the system's timeout threshold?
The API connection is kept open the whole time. There is no traffic on it unless you pay for a few partial images as progress, and partials do not appear to be returned by the edits endpoint.
There is unfortunately no option to pick your images up from the photo lab later, such as with a background parameter. The idle network connection must be maintained until the response arrives with a base64 image.
The image generation endpoints are particular: they were architected for DALL·E 2. With image edits on the new model, where vision is actually used to create a new image, the time it takes is far longer than you might expect, easily averaging 2-3 minutes with the new input_fidelity parameter.
That means that many service worker hosts or cloud providers have network timeouts that will cut off your idle connection.
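If the 504 is coming from a reverse proxy you control rather than the cloud host itself (nginx is a common case; this is an assumption about your setup), the relevant knobs are the proxy timeouts, which default to 60 seconds:

```nginx
location /api/ {
    proxy_pass http://backend;
    # Allow long-running image edits to finish; the default is 60s.
    proxy_read_timeout  600s;
    proxy_send_timeout  600s;
}
```

Managed hosts often impose a similar limit that cannot be raised, in which case moving the long call to a worker is the only fix.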
Chat-with-an-AI endpoints currently offer streaming, so a generation arrives as a constant flow of data, and also offer a background mode: poll to see when your output is ready. Without streaming, waiting on a long monolithic generation was an issue as far back as GPT-4 in 2023.
Where are you getting 504? Time to look for a new place to run a backend.