Discrepancy between Chat and Responses API Image Generation tool output

graciey · October 29, 2025, 6:58pm

Hi everyone,

I’m running into a big difference in image generation quality between Chat and the OpenAI Responses API (gpt-5) with ImageGeneration tool (gpt-image-1), even when using identical prompts and images.

Goal:

I’m building a workflow to generate packshot-quality images of clothing items and accessories. The input is usually a photo of a garment (either taken by a user in their closet or sourced from a website where the item is shown on a model). The desired output is a clean, e-commerce–style product photo — like what you’d see on professional retail sites.

Setup:

Model: gpt-5 via the Responses API

Tool config:

{
  "type": "image_generation",
  "model": "gpt-image-1",
  "size": "auto",
  "quality": "high",
  "output_format": "png",
  "background": "transparent",
  "moderation": "low"
}

Same prompt and same input image each time

Input image passed via content[] as

{
  "type": "input_image",
  "image_url": "<base64img>",
  "detail": "high"
}

We’re explicitly using the Responses API with the Image Generation tool to replicate the behavior of Chat as closely as possible.

Issue

When I run the prompt in Chat, the results are beautiful — sharp lighting, realistic textures, clean backgrounds, and accurate product details.

But when I run the exact same prompt and image through the API, the quality drops a lot:

Details about the garment are wrong
Lighting and shadowing are inconsistent
Image looks less professional

I need to process thousands of images, so I can’t rely on Chat manually — I really need API-level consistency that matches Chat’s quality. Interestingly, my colleague, who’s been generating a larger volume of images via Chat (with the same model and prompt), consistently gets better-quality outputs than I do. This makes me think it has something to do with personalization.

Questions:

Has anyone else noticed this difference between Chat and the Responses API results?
Are there hidden differences in how the Chat interface calls the image tool vs how the API would call the image tool (e.g., preprocessing, better/personalized automatic prompt expansion, system context)?

Below is an example (you can see the Chat result looks great, the API result added a zipper and got the collar shape wrong). Thanks for any insights!

Topic		Replies	Views
Difference in image quality between ChatGPT and API Bugs	3	178	August 31, 2025
Experiencing Different Image Quality in DALL-E 3 via ChatGPT vs. Direct API API dalle3	7	9490	May 16, 2024
DALL-E 3 API images being much worse than ChatGPT API chatgpt , dalle3	6	4357	December 17, 2023
Chatgpt image generation vs openai gpt-image-1 quality and text? API gpt-4	2	2708	May 18, 2025
Dalle 3 image generation problem, API VS Chat API chatgpt , dalle3 , dalle	1	248	February 21, 2025

Discrepancy between Chat and Responses API Image Generation tool output

Related topics