Image generation / edit API times out with gpt-image-1

Issue
The GPT Image 1 API times out at around 180 seconds. I am using the Node.js SDK.

What I have tried:

  • I have tried moving over to fetch / Axios to call OpenAI’s API.
  • I have also created a fake server to confirm it’s not a client-side issue. Against the fake server I can extend the timeout to 5 minutes and above, so it seems the issue is that OpenAI’s API does take longer to respond, but the connection can’t live that long.
  • I have also tried streaming, but it also times out.
  • I have also tried the Responses API (responses.create), which allows for background mode, but the image generation quality is not the same as images generate / edit.
  • I have also tried connecting to gpt-image-1 through Azure; the timeout seems even shorter.

Questions

  • The image edit / generation APIs don’t support background mode the way the Responses API does. Is there any plan to implement it?
  • Is there a reason why with the same prompt and image input, the image generation output differs significantly?
  • Are there other ways to further extend the API timeout?

Would appreciate any sort of advice. Thanks a lot!

First up: when using the Responses image_generation tool, it is not merely the prompt the AI writes that is used as input. The chat context of past messages plus images is sent to the image model in an unspecified manner. This can make “input_fidelity” especially expensive; beyond that, you are also paying for gpt-4 vision on the chat model. Avoid.

I got a timeout with edits, hifi, and 3 input images at a specified 60 seconds. It is ridiculously slower than any expectation.

The SDK has its own timeout parameter in milliseconds. It should be defaulting to 10 minutes. No modification = enough for images.

What you especially don’t want is the built-in “retries” parameter, defaulting to 2, spending 3x your expectations.
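Back-of-the-envelope on what those defaults combine to (illustrative arithmetic only; the real SDK also inserts exponential backoff between attempts, so the true worst case is a bit longer):

```python
def worst_case_wall_time(timeout_s: float, max_retries: int) -> float:
    """Worst case: every attempt runs all the way to its timeout.

    Ignores the SDK's exponential backoff between retry attempts.
    """
    return timeout_s * (1 + max_retries)

# SDK defaults: 600 s timeout, 2 retries -> up to 30 minutes of waiting
print(worst_case_wall_time(600, 2) / 60)  # 30.0
```

Setting `max_retries=0` (Python) or `maxRetries: 0` (node) in the client constructor caps you at a single attempt.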

Hosting and worker platforms can shut this network connection down though as unresponsive, usually at 60 seconds. A long-thinking o1-pro or o3-deep-research request is probably the longest-running generation you can run in non-streaming to also diagnose this, logging the time (or write your own API server that keeps a timeout=900 connection open for free, on a known-good host).
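If you don’t want to burn o1-pro tokens just to diagnose this, a slow local HTTP server does the same job (a hypothetical stand-in, stdlib only): point your client or your hosting platform at it, crank the sleep past 180 s, and see who drops the connection first.

```python
import http.server
import threading
import time
import urllib.request

DELAY_S = 2  # crank this toward 180+ to reproduce the failure window

class SlowHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(DELAY_S)  # hold the connection open silently, like a slow image job
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"survived")

    def log_message(self, *args):  # keep the demo output quiet
        pass

# Bind to an ephemeral port and serve in the background
server = http.server.HTTPServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

start = time.monotonic()
body = urllib.request.urlopen(
    f"http://127.0.0.1:{server.server_address[1]}/", timeout=900
).read()
print(body.decode(), f"after {time.monotonic() - start:.1f}s")
server.shutdown()
```

If this survives at 300 s locally but dies on your deployment host, it’s the host’s idle-connection policy, not the SDK.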

Paying for partial images in streaming can give you some traffic to keep-alive.

@_j thanks so much for the response.

I have tried multiple ways to tweak the timeout param in the SDK. I observed that the default is 10 mins, but in no way does the connection actually last more than 3 mins (180 s), whether I try it locally or on other servers. May I know where you deploy your server? I will give it another try. I am OK with the request running longer, as long as it eventually gives me a response without timing out.

Same as what you mentioned, I also want to avoid retries, as they would likely just time out anyway.

I have tried partial images + streaming too; it also times out.

Here’s my code for reference:

import OpenAI from 'openai';
import * as fs from 'fs';
import * as path from 'path';
import * as dotenv from 'dotenv';

dotenv.config();

async function main() {
  const openai = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
    timeout: 900000,
  });

  // Define the input images (same as the attached file)
  const imageFiles = [
    "./scene/baby_face.jpeg",
    "./scene/scene_bedroom.jpeg",
  ];

  console.log(`start time: ${new Date().toISOString()}`);
  const startTime = new Date();

  // Same prompt as the attached file
  const prompt = `
  Put the baby inside the crib in the bedroom. Show her waking up happily.
  Create the illustration in a Disney animation style.
  `;
  
  try {
    console.log('Starting OpenAI SDK streaming image edit...');
    
    // Read both image files as buffers and create File objects
    const imageBuffer1 = fs.readFileSync(imageFiles[0]);
    const imageBuffer2 = fs.readFileSync(imageFiles[1]);
    const imageFile1 = new File([imageBuffer1], 'baby_face.jpeg', { type: 'image/jpeg' });
    const imageFile2 = new File([imageBuffer2], 'scene_bedroom.jpeg', { type: 'image/jpeg' });
    
    // Use OpenAI SDK with streaming for images.edit
    const stream = await openai.images.edit({
      model: "gpt-image-1",
      prompt: prompt,
      image: [imageFile1, imageFile2], // Use both images as per documentation
      n: 1,
      size: "1024x1024",
      quality: "high",
      stream: true, // Enable streaming
      partial_images: 2,
      input_fidelity: "high"
    }, {
      timeout: 900000,
    });

    console.log('OpenAI SDK streaming response started!');

    // Handle the streaming response
    let imageCount = 0;

    for await (const event of stream) {
      console.log('event:', event);
      if (event.type === "image_edit.partial_image") {
        console.log('Received partial image event');
        const idx = event.partial_image_index;
        const imageBase64 = event.b64_json;
        const imageBuffer = Buffer.from(imageBase64, "base64");

        const fileName = `character_openai_sdk_streaming_partial_${Date.now()}_${imageCount}.png`;
        const filePath = path.join(process.cwd(), 'generated_images', fileName);

        // Create directory if it doesn't exist
        const dir = path.dirname(filePath);
        if (!fs.existsSync(dir)) {
          fs.mkdirSync(dir, { recursive: true });
        }
        fs.writeFileSync(filePath, imageBuffer);
        console.log(`Partial image saved successfully to: ${filePath}`);
      } else if (event.type === 'image_edit.completed') {
        console.log('Processing completed image chunk...');

        // Decode base64 data directly from the event
        const imageBuffer = Buffer.from(event.b64_json, 'base64');

        const fileName = `character_openai_sdk_streaming_completed_${Date.now()}_${imageCount}.png`;
        const filePath = path.join(process.cwd(), 'generated_images', fileName);

        // Create directory if it doesn't exist
        const dir = path.dirname(filePath);
        if (!fs.existsSync(dir)) {
          fs.mkdirSync(dir, { recursive: true });
        }

        // Save the decoded image
        try {
          fs.writeFileSync(filePath, imageBuffer);
          console.log(`Image saved successfully to: ${filePath}`);
          imageCount++;
        } catch (saveError) {
          console.error('Error saving image:', saveError);
        }
      }
    }

    console.log(`Streaming completed! Total images saved: ${imageCount}`);
  } catch (error) {
    const endTime = new Date();
    const elapsed = endTime.getTime() - startTime.getTime();
    console.error('Error during OpenAI SDK streaming image edit request:', error);
    console.error(`Elapsed time before error: ${elapsed}ms`);
    throw error;
  }

  console.log(`end time: ${new Date().toISOString()}`);
  console.log(`total time: ${new Date().getTime() - startTime.getTime()}ms`);
}

main();

I would install Node.js (node_modules and openai) on your local PC so you have complete control. I have a few random file scripts there, such as whisper, that have not barfed on an unadvertised timeout, and I haven’t really tested running up to the limit (but JS is generally not my pal).

The package is certainly auto-spaghetti - 171 mentions of timeout. Some constants pre-multiplied by 1000, some hard-coding of library defaults to 5 minutes within…

But essentially true:

class APIClient {
  constructor({
    baseURL,
    maxRetries = 2,
    timeout = 600000, // 10 minutes
    httpAgent,
    fetch: overridenFetch,
  }) {
The other way you could test the library for free would be to upload to the files endpoint and limit the bandwidth (like I’d just tweak a managed switch…). It uses similar goofy request-formation logic to edits, and we can see if it is a problem specific to sending files as form data, where the node.js TypeScript could be dropping a prior timeout specification.

/**
 * Returns a multipart/form-data request if any part of the given request body contains a File / Blob value.
 * Otherwise returns the request as is.
 */
export const maybeMultipartFormRequestOptions = async <T = Record<string, unknown>>(
  opts: RequestOptions<T>,
): Promise<RequestOptions<T | MultipartBody>> => {
  if (!hasUploadableValue(opts.body)) return opts;

“I’ll upload the files first and use file_id; hopefully that will avoid the timeout issue.”

I don’t know that you’d gain much by file_id, which is only supported on Responses as user message input. You’d need really low upload bandwidth.

For the benefit of already having files on OpenAI’s side, you’ve got to additionally wait for a chat model to generate its internal tool call.

At least then you can’t blame the particular form-data request method for breaking timeouts.

I actually just converted it to a Python script to test again; it seems to be getting disconnected at around 180 s as well:

Starting OpenAI Python SDK streaming image edit…

Error during OpenAI Python SDK streaming image edit request: Connection error.
Elapsed time before error: 182634.16ms

import os
import time
import base64
from pathlib import Path
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables
load_dotenv()

def main():
    # Initialize OpenAI client
    client = OpenAI(
        api_key=os.getenv("OPENAI_API_KEY"),
        timeout=900.0  # 15 minutes timeout
    )

    # Define the input images (same as the TypeScript version)
    image_files = [
        "./scene/baby_face.jpeg",
        "./scene/scene_bedroom.jpeg",
    ]

    print(f"start time: {time.strftime('%Y-%m-%d %H:%M:%S')}")
    start_time = time.time()

    # Same prompt as the TypeScript version
    prompt = """
    Put the baby inside the crib in the bedroom. Show her waking up happily.
    Create the illustration in a Disney animation style.
    """
    
    print('prompt:', prompt)
    print('Input images:', image_files)

    try:
        print('Starting OpenAI Python SDK streaming image edit...')
        
        # Use OpenAI SDK with streaming for images.edit
        # Open files as file objects
        with open(image_files[0], 'rb') as image1, open(image_files[1], 'rb') as image2:
            stream = client.images.edit(
                model="gpt-image-1",
                prompt=prompt,
                image=[image1, image2],  # Pass file objects
                n=1,
                size="1024x1024",
                quality="high",
                stream=True,  # Enable streaming
                partial_images=2,
                input_fidelity="high"
            )

        print('OpenAI Python SDK streaming response started!')

        # Handle the streaming response
        image_count = 0

        for event in stream:
            print('event:', event)
            
            if hasattr(event, 'type') and event.type == "image_edit.partial_image":
                print('Received partial image event')
                if hasattr(event, 'partial_image_index') and hasattr(event, 'b64_json'):
                    idx = event.partial_image_index
                    image_base64 = event.b64_json
                    image_buffer = base64.b64decode(image_base64)

                    file_name = f"character_openai_python_streaming_partial_{int(time.time() * 1000)}_{image_count}.png"
                    file_path = Path.cwd() / 'generated_images' / file_name
                    
                    # Create directory if it doesn't exist
                    file_path.parent.mkdir(parents=True, exist_ok=True)
                    
                    with open(file_path, 'wb') as f:
                        f.write(image_buffer)
                    print(f"Partial image saved successfully to: {file_path}")
                    
            elif hasattr(event, 'type') and event.type == 'image_edit.completed':
                print('Processing completed image chunk...')
                
                if hasattr(event, 'b64_json') and event.b64_json:
                    # Decode base64 data directly from chunk
                    image_buffer = base64.b64decode(event.b64_json)
                    
                    file_name = f"character_openai_python_streaming_completed_{int(time.time() * 1000)}_{image_count}.png"
                    file_path = Path.cwd() / 'generated_images' / file_name
                    
                    # Create directory if it doesn't exist
                    file_path.parent.mkdir(parents=True, exist_ok=True)
                    
                    print(f"Character image would be saved to: {file_path}")
                    
                    # Save the decoded image
                    try:
                        with open(file_path, 'wb') as f:
                            f.write(image_buffer)
                        print(f"Image saved successfully to: {file_path}")
                        image_count += 1
                    except Exception as save_error:
                        print(f'Error saving image: {save_error}')
                
                elif hasattr(event, 'url') and event.url:
                    # Handle URL if present (fallback)
                    image_url = event.url
                    file_name = f"character_openai_python_streaming_{int(time.time() * 1000)}_{image_count}.png"
                    file_path = Path.cwd() / 'generated_images' / file_name
                    
                    # Create directory if it doesn't exist
                    file_path.parent.mkdir(parents=True, exist_ok=True)
                    
                    print(f"Character image URL: {image_url}")
                    print(f"Character image would be saved to: {file_path}")
                    
                    # Download the image
                    try:
                        import requests
                        response = requests.get(image_url)
                        response.raise_for_status()
                        with open(file_path, 'wb') as f:
                            f.write(response.content)
                        print(f"Image saved successfully to: {file_path}")
                        image_count += 1
                    except Exception as download_error:
                        print(f'Error downloading image: {download_error}')

        print(f"Streaming completed! Total images saved: {image_count}")
        
    except Exception as error:
        end_time = time.time()
        elapsed = (end_time - start_time) * 1000
        print(f'Error during OpenAI Python SDK streaming image edit request: {error}')
        print(f'Elapsed time before error: {elapsed:.2f}ms')
        raise error

    print(f"end time: {time.strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"total time: {(time.time() - start_time) * 1000:.2f}ms")

if __name__ == "__main__":
    main() 

The API is preposterously slow: 44 seconds with a minimal request and two small JPGs.

stream = client.images.edit(
    model="gpt-image-1",
    prompt=prompt,
    image=open_files,          # pass the list directly
    size="1024x1024",
    quality="low",
    stream=True,               # Enable streaming
    # partial_images=2,
    # input_fidelity="high"
)

Crank it up to 1536x1024, high quality, 3 partials, and input_fidelity… 130 seconds. So using your AI script (which I couldn’t help but tweak line-by-line), my org can get a hair faster, with the only difference being the input images.

[Picture: input of photos doesn’t do much for “Disney style”]

There were no partial events captured, and I added an else to catch any uncaught events; this parameter was stealthed into the docs only, so I think it is another non-working product, like VAD chunking in transcripts. I hadn’t tried stream previously, as the only thing it offers is paying for what could be a free progress GIF.

Non-SDK script for you in another topic, since you have a Python environment, and you can crank the 240 second timeout I provide even higher (time.time() returns epoch seconds in Python, btw):

Since that script isn’t wrapped in a main(), if you run it in a notebook or REPL environment you can continue inspecting the global variables.
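In the same spirit (this is not the linked script, just a minimal stdlib sketch for when you want zero SDK between you and the socket), you can hand-build the multipart body and POST to /v1/images/edits with whatever timeout you like. The `build_multipart` helper and the `RUN_IMAGE_EDIT_DEMO` guard are my own; the endpoint URL, form field names, and response shape come from the images API.

```python
import json
import os
import urllib.request
import uuid

def build_multipart(fields: dict, files: dict) -> tuple[bytes, str]:
    """Assemble a multipart/form-data body by hand (no SDK, no requests).

    fields: {name: str_value}; files: {name: (filename, bytes, mime_type)}.
    Returns (body, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    for name, (filename, data, mime) in files.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n".encode() + data + b"\r\n"
        )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"

# Guarded so it never fires a paid request by accident:
if os.getenv("RUN_IMAGE_EDIT_DEMO"):
    body, content_type = build_multipart(
        {"model": "gpt-image-1", "prompt": "...", "quality": "high"},
        {"image[]": ("baby_face.jpeg",
                     open("./scene/baby_face.jpeg", "rb").read(), "image/jpeg")},
    )
    req = urllib.request.Request(
        "https://api.openai.com/v1/images/edits",
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": content_type,
        },
    )
    # Your timeout, not the SDK's -- crank as high as you like
    with urllib.request.urlopen(req, timeout=900) as resp:
        print(json.loads(resp.read())["data"][0]["b64_json"][:40])
```

If even this drops at 180 s, the socket is being closed below your code, which points squarely at the network path or host rather than any library.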

If that fails too (the network connection drops, but the code still runs to report errors), you’d contact the hosting service provider and tell them you need non-active open network connections to stay up for 1000+ seconds before closing on you. Or run your own VM.