How can I use the Create image edit API?

I am trying the Create image edit API [1] with the Python code below, but the returned image_url always points to an image identical to the input “otter.png”. I have tried many times but never get the expected image of “A cartoon baby sea otter wearing a hat”.

    import openai

    response = openai.Image.create_edit(
        image=open("otter.png", "rb"), mask=open("mask.png", "rb"),
        prompt="A cartoon baby sea otter wearing a hat", n=2, size="512x512",
    )
    image_url = response["data"][0]["url"]

I attached otter.png and mask.png. Could you please help me figure out my mistake?

[1] OpenAI API

Sounds like it’s not getting the mask correctly. Here’s some code from GitHub that might help you…


    import base64

    def generate_from_masked_image(self, prompt, image_path):
        # Read the image file and base64-encode its raw bytes
        with open(image_path, "rb") as f:
            image_base64 = base64.b64encode(f.read())

Thank you very much for your reply. It worked when I used a 32-bit mask PNG instead of a 24-bit one.

I created the 32-bit mask PNG in GIMP; the 24-bit PNG that did not work was created with mspaint.exe.

Thanks for coming back to share. Hopefully it helps someone in the future.

Good luck!

Mike, I’m new to GIMP. I got it running, but now I’m going down a rabbit hole on the steps to create a layer mask and finding many different tutorials. Might there be a step-by-step guide to creating a mask tailored for this Image Edits API use case?

24 bit = RGB = colors only
32 bit = RGBA = colors plus alpha channel (transparency level)
lossless = PNG
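
To tell which kind of PNG a given file is before uploading it, the color type can be read straight from the file header. This is a minimal sketch using only the standard library; the function names `png_color_info` and `has_alpha_channel` are made up for illustration. It only inspects the IHDR chunk, so it will not notice palette images that carry transparency in a separate tRNS chunk.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Color type values defined by the PNG specification.
COLOR_TYPES = {
    0: "grayscale",
    2: "RGB",
    3: "palette",
    4: "grayscale+alpha",
    6: "RGBA",
}


def png_color_info(path):
    """Return (bit_depth, color_type_name) read from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        # 8-byte signature + 8-byte chunk header + first 10 bytes of IHDR data
        header = f.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not look like a PNG file")
    # IHDR starts with width, height (4 bytes each), then bit depth and color type.
    _width, _height, bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return bit_depth, COLOR_TYPES.get(color_type, f"unknown ({color_type})")


def has_alpha_channel(path):
    """True if the PNG stores a full alpha channel (color type 4 or 6)."""
    _, name = png_color_info(path)
    return name in ("grayscale+alpha", "RGBA")
```

A "24 bit" file as saved by a paint program would typically report `(8, "RGB")`, and a "32 bit" file `(8, "RGBA")`, since the bit depth in the header is per channel.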

Here’s a GIMP overview: 2.3. Saving Images with Transparency
and more detail: Second Life Forums Archive - Tutorial: Making a transparency mask using GIMP

You’ll want to make a fully transparent area with pixel-perfect edges, rather than fading the transparency with a paintbrush tool. You are drawing on the alpha (A) channel, which for AI edits can only be “yes” or “no”.

Preserving the actual image in the mask file is not important; you can paint it a solid color to make the mask file more compressible.
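
For a simple rectangular region, a mask like this can also be generated in code instead of in GIMP. The following is a minimal sketch, assuming the Pillow library is installed; the function name `make_edit_mask` and the example coordinates are hypothetical. It paints the whole mask a solid opaque color and sets alpha to exactly 0 inside the box, so there is no faded edge.

```python
from PIL import Image, ImageDraw


def make_edit_mask(size, box, path):
    """Save an RGBA PNG that is opaque everywhere except a fully transparent box.

    size: (width, height) of the image being edited.
    box:  (left, top, right, bottom) of the area to regenerate, in pixels.
    """
    # Solid opaque black; the color is arbitrary, only the alpha channel matters here.
    mask = Image.new("RGBA", size, (0, 0, 0, 255))
    # ImageDraw writes pixel values directly (no blending), so alpha is exactly 0 in the box.
    ImageDraw.Draw(mask).rectangle(box, fill=(0, 0, 0, 0))
    mask.save(path, format="PNG")
    return mask
```

For example, `make_edit_mask((512, 512), (128, 0, 383, 160), "mask.png")` would mark a band near the top of a 512x512 image as the area to edit. The mask should have the same dimensions as the image it accompanies.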

Thanks for the GIMP instruction links, they were very useful. Using them I was able to transform an image created by ‘images/generations’ into a transparent PNG. Pretty neat.


I’m guessing the process to create a mask for use with the ‘images/edits’ API is similar.

Thanks again for the help on this. Creating a mask file in GIMP made the difference for the call to v1/images/edits from Python.

[Image: white_cat_with_hat]

    # Define request body ("files" is an assumed name; the opening line was missing)
    files = {
        'image': ('white_cat_rgba.png', open(filename, 'rb')),
        'mask': ('white_cat_rgba_hat.png', open(maskfilename, 'rb')),
        'prompt': (None, 'A white cat wearing a london police hat'),
        'size': (None, '512x512'),
        'n': (None, '1'),  # Number of variations to generate
    }
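
The tuple layout above looks like the multipart form convention of the `requests` library, so a complete call might look roughly like the sketch below. This is an assumption rather than the poster's actual script: the function name `edit_image`, the timeout value, and the way the API key is passed in are all illustrative, and the response is assumed to have the `data[i].url` shape shown at the top of this thread.

```python
import os

API_URL = "https://api.openai.com/v1/images/edits"


def edit_image(image_path, mask_path, prompt, api_key, size="512x512", n=1):
    """POST an image, its RGBA mask and a prompt; return the result URLs."""
    # Imported here so this sketch can be loaded even without requests installed.
    import requests

    with open(image_path, "rb") as image, open(mask_path, "rb") as mask:
        files = {
            "image": (os.path.basename(image_path), image),
            "mask": (os.path.basename(mask_path), mask),
            # (None, value) sends a plain form field rather than a file part.
            "prompt": (None, prompt),
            "size": (None, size),
            "n": (None, str(n)),
        }
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {api_key}"},
            files=files,
            timeout=120,
        )
    # Raise on HTTP errors instead of silently returning an error payload.
    response.raise_for_status()
    # Assumed response shape: {"data": [{"url": ...}, ...]}
    return [item["url"] for item in response.json()["data"]]
```

A call could then look like `edit_image("white_cat_rgba.png", "white_cat_rgba_hat.png", "A white cat wearing a london police hat", os.environ["OPENAI_API_KEY"])`. Opening the files in a `with` block closes them once the request has been sent.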

I wonder whether the AI would recognize that it has left a strange background in the shape of the mask, and fill it in, if the prompt also included language about infilling the background scene from the edges.

So, I changed the prompt to “A white cat wearing a London police officer hat with the background naturally extending from the existing edges of the image” and ran it twice.

[Image: white_cat_with_hat_extended_background]

I wanted to see whether the AI would interpret this as a direction to change the background. Both runs returned variations of the police cap but no change to the background. I’m not sure why the AI would change the background anyway, when the transparent area of the mask is the space above the cat’s head where the cap is drawn.

I took a closer look at ‘Cat in Hat’ and can see what may have caught your eye regarding the strange background. Callout-1 shows the area that I created using GIMP. Callout-2 is something that was positioned in the background of the original image. Callout-3 shows an area where, to my eye, the AI appears to have used Callout-2 and stretched it to fill in behind the hat and the cat’s head.


This is my first time playing around with these APIs, so there’s lots to learn. One observation is that the prompt supplied to ‘images/edits’ is closely interrelated with the image being edited and the image mask.