Help with `images.edit`: Mask Not Constraining Edit to Specific Area

Hello OpenAI Community, it’s been a long time, but I am in need of help.

I’m working on a Python script for a targeted inpainting task and have run into an issue where my mask is not being respected. I’d be very grateful for any insights the community might have, especially regarding the gpt-image-1 model.

My Goal:

I’m trying to apply a custom texture (a “skin”) to specific parts of an image—in this case, applying a new paint job to the pieces of an F-16 model kit.

Input: [image of the F-16 model kit, not shown]

My Approach:

My workflow is a two-step process using the gpt-image-1 model for both calls:

  1. Generate a Mask: I use client.images.edit with gpt-image-1 to create a black and white mask from my source image, isolating the F-16 parts in white.
  2. Process and Apply Mask: I use the Pillow library to convert the mask into a proper RGBA PNG with an alpha channel. Then, I call client.images.edit again with gpt-image-1, providing the original image, the RGBA mask, and a prompt to apply the new skin.
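To make the second step concrete, here is a minimal sketch of the alpha conversion in isolation. The helper name `mask_to_rgba` and the `edit_white` flag are mine, purely for illustration. It assumes the convention described in the API reference for the `mask` parameter, where fully transparent pixels (alpha 0) mark the region to edit. Under that assumption, passing a white-on-black mask straight to `putalpha` keeps the white pieces opaque, so the flag lets me try the inverted polarity as well.

```python
from PIL import Image, ImageOps


def mask_to_rgba(mask_l: Image.Image, edit_white: bool = True) -> Image.Image:
    """Turn a black/white mask into an RGBA mask for an edit call.

    Assumption: alpha 0 (fully transparent) marks pixels the model may edit.
    edit_white=True  -> white regions become transparent (editable).
    edit_white=False -> same result as putalpha(mask): black regions editable.
    """
    gray = mask_l.convert("L")
    # Threshold to hard 0/255 so no semi-transparent pixels remain.
    binary = gray.point(lambda v: 255 if v >= 128 else 0)
    alpha = ImageOps.invert(binary) if edit_white else binary
    rgba = binary.convert("RGBA")
    rgba.putalpha(alpha)
    return rgba
```

With `edit_white=False` this reproduces what my script currently does; with `edit_white=True` the F-16 pieces would be the transparent region instead.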

The Problem:

The mask generation (Step 1) and the alpha channel conversion (Step 2) seem to be working perfectly. However, in the final edit call, the gpt-image-1 model doesn’t seem to constrain the edit to the masked (white) area. The final image gets altered globally, almost as if the mask input is being ignored.
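One check I can run before the final call is whether the mask actually lines up with the source image. The sketch below assumes the mask should match the source in pixel dimensions and that alpha 0 marks the editable region; `describe_mask` is a hypothetical helper of my own, not part of any library. Because the mask in Step 1 is itself generated by the model, its size may differ from the original JPEG.

```python
from PIL import Image


def describe_mask(source: Image.Image, mask: Image.Image) -> dict:
    """Summarize mask properties that could explain an ignored edit region.

    Assumptions (mine): the mask should match the source in pixel size,
    and alpha 0 marks the pixels the model is allowed to edit.
    """
    report = {
        "same_size": source.size == mask.size,
        "has_alpha": "A" in mask.getbands(),
        "editable_fraction": 0.0,  # share of pixels with alpha == 0
        "partial_fraction": 0.0,   # share of pixels with 0 < alpha < 255
    }
    if report["has_alpha"]:
        hist = mask.getchannel("A").histogram()  # 256 bins, one per alpha value
        total = mask.width * mask.height
        report["editable_fraction"] = hist[0] / total
        report["partial_fraction"] = sum(hist[1:255]) / total
    return report
```

For my files this would be called as `describe_mask(Image.open("imgs/f16_kit.jpeg"), Image.open("imgs/mask_alpha.png"))`. A `same_size` of `False`, or an `editable_fraction` that covers the background rather than the pieces, would point at the mask rather than the prompt.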

The Code:

Here is the exact code I’m running.

#
# SETUP
#
# %pip install pillow openai -U

import base64
import os
from openai import OpenAI
from PIL import Image
from io import BytesIO
# from IPython.display import Image as IPImage, display # For notebooks

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])  # Read the key from the environment instead of hardcoding it

# --- Define Image Paths ---
os.makedirs("imgs", exist_ok=True)
img_path1 = "imgs/f16_kit.jpeg" # My source image
img_path_mask = "imgs/mask.png"
img_path_mask_alpha = "imgs/mask_alpha.png"
img_path_mask_edit = "imgs/final_edit.png"

# --- Step 1: Generate the Mask using gpt-image-1 ---
print("Generating the mask...")
prompt_mask = "Generate a mask of all the F-16 pieces. Make the pieces solid white and the background solid black. Return an image in the same size as the input image."

try:
    with open(img_path1, "rb") as img_input:
        result_mask = client.images.edit(
            model="gpt-image-1", # Using the specified model
            image=img_input,
            prompt=prompt_mask
        )
    
    # Process and save the result
    image_base64 = result_mask.data[0].b64_json
    image_bytes = base64.b64decode(image_base64)
    image = Image.open(BytesIO(image_bytes))
    image.save(img_path_mask, format="PNG")
    print(f"Mask successfully generated and saved to {img_path_mask}")

except Exception as e:
    print(f"An error occurred during mask generation: {e}")
    raise SystemExit(1)  # Stop here: the later steps depend on the mask file

# --- Step 2: Create an Alpha Channel for the Mask ---
print("Converting mask to have an alpha channel...")
try:
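    # Load the generated mask as grayscale (white = F-16 pieces, black = background)
    mask = Image.open(img_path_mask).convert("L")
    mask_rgba = mask.convert("RGBA")
    # Use the grayscale values as alpha: white pieces become opaque (alpha 255),
    # the black background becomes fully transparent (alpha 0)
    mask_rgba.putalpha(mask)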
    mask_rgba.save(img_path_mask_alpha, "PNG")
    print(f"Mask with alpha channel saved to {img_path_mask_alpha}")

except Exception as e:
    print(f"An error occurred during alpha channel creation: {e}")
    raise SystemExit(1)  # Stop here: the final edit needs the RGBA mask

# --- Step 3: Edit the Original Image Using the Alpha Mask with gpt-image-1 ---
print("Attempting to edit the original image with the mask...")
prompt_mask_edit = "Edit the mask generated in White and paint only that in the paint that such as generated from a real 3d Artist this skin should be perfectly mapped on the white spaces ONLY The theme is A cool really cool black assassin F-16 skin"

try:
    with open(img_path1, "rb") as img_input, open(img_path_mask_alpha, "rb") as mask_input:
        result_mask_edit = client.images.edit(
            model="gpt-image-1", # Using the specified model again
            prompt=prompt_mask_edit,
            image=img_input,
            mask=mask_input,
            size="1024x1024"
        )
    
    # Process and save the final result
    image_base64_final = result_mask_edit.data[0].b64_json
    image_bytes_final = base64.b64decode(image_base64_final)
    final_image = Image.open(BytesIO(image_bytes_final))
    final_image.save(img_path_mask_edit, format="PNG")
    print(f"Final image generated and saved to {img_path_mask_edit}")

except Exception as e:
    print(f"An error occurred during the final edit: {e}")

Is my final prompt too complex or confusing for the model, causing it to disregard the mask’s constraints? Should I be simplifying it drastically when a mask is provided?
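To compare prompt variants with a number rather than by eye, I could measure how much the output differs from the original in the region the mask is supposed to protect. This is only a rough sketch: `mean_change_in_protected_region` is a name I made up, it assumes opaque mask pixels (alpha 255) are the protected ones, and it naively resizes the original and the mask to the output size, which distorts things if the aspect ratios differ.

```python
from PIL import Image, ImageChops, ImageStat


def mean_change_in_protected_region(
    original: Image.Image, edited: Image.Image, mask_rgba: Image.Image
) -> float:
    """Mean absolute grayscale difference (0-255) over fully opaque mask pixels.

    Assumption: alpha 255 marks pixels the edit should leave untouched.
    """
    size = edited.size
    orig = original.convert("RGB").resize(size)
    alpha = mask_rgba.getchannel("A").resize(size, Image.NEAREST)
    # Build a 0/255 selector that is 255 only where the mask is fully opaque.
    protected = alpha.point(lambda a: 255 if a == 255 else 0)
    diff = ImageChops.difference(orig, edited.convert("RGB")).convert("L")
    stat = ImageStat.Stat(diff, mask=protected)
    if stat.count[0] == 0:
        return 0.0  # Nothing is protected, so there is nothing to compare.
    return stat.mean[0]
```

A value near zero would mean the protected area survived the edit; a large value for every prompt I try, simple or complex, would suggest the prompt wording is not the main factor.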

Any help or insight into the behavior of gpt-image-1 in this context would be incredibly valuable. Thank you for taking the time to look at this!

Current Output: [image of the edited result, not shown]