DALL-E 2 Image Edit Issue

This is my code where I am trying to edit an image with DALL-E 2 using a single image and a prompt. I am adding transparency to the image as described in the API docs, but for some reason the URL in the output returns the same image that I uploaded. I need to edit images without providing any external mask, and since DALL-E 3 has no image editing (inpainting) support right now, I have to stick with 2. Can anyone help me out?

import io
import openai
from PIL import Image

def generate_image_using_image_and_prompt(prompt, image_path):
    try:
        image = Image.open(image_path)
        image_rgba = image.convert('RGBA')

        # Create a new image with uniform semi-transparency
        transparent_image = Image.new('RGBA', image_rgba.size, (255, 255, 255, 80))

        # Composite the original image onto the transparent image
        # transparent_image.paste(image_rgba, (0, 0), transparent_image)
        transparent_image.paste(transparent_image, (0, 0), image_rgba)

        # Save the transparent image
        transparent_image_path = '/home/reckonsys/CHATBOTS/reckonsys-ai/ImageGenerator/transparent_image.png'
        transparent_image.save(transparent_image_path)

        # Convert the transparent image to bytes
        image_bytes = io.BytesIO()
        transparent_image.save(image_bytes, format='PNG')
        image_bytes.seek(0)

        image_file_path = '/home/reckonsys/CHATBOTS/reckonsys-ai/ImageGenerator/blank.png'
        with open(image_file_path, 'wb') as f:
            f.write(image_bytes.getvalue())

        response = openai.Image.create_edit(
            model="dall-e-2",
            prompt=prompt,
            image=image_bytes,
            size="512x512",
            # quality="standard",
            # style="natural",
            n=1,
        )

        image_url = response['data'][0]['url']
        print(image_url)
        return image_url
    except Exception as e:
        print(f"Image edit failed: {e}")
        return None

You don’t want the whole image transparent, just the bits/section that you want the model to replace.

Might be easiest to install the library and go that route if you can. Images can be a bit tricky… Can you post an example of your transparent (mask) and other image? IIRC, you have to send both…

Okay… actually reading your post closer… it looks like you’re just pasting the image onto the transparent one? You need two images: one will have a “transparent hole” or missing section that the API will replace, while the rest of the image is left untouched, if that makes sense…
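To make that concrete, here is a minimal Pillow sketch of the idea, assuming the region to replace is a known rectangle (the path and coordinates are made up for illustration). The key difference from the code above is that the hole’s alpha is exactly 0 rather than a uniform partial transparency:

    from PIL import Image

    image = Image.open("park.png").convert("RGBA")  # hypothetical input

    # Region the model should repaint (left, top, right, bottom); illustrative only
    box = (100, 100, 300, 300)

    # Keep the original pixels everywhere, then punch a fully transparent hole
    alpha = image.getchannel("A")
    alpha.paste(Image.new("L", (box[2] - box[0], box[3] - box[1]), 0), (box[0], box[1]))
    image.putalpha(alpha)

    image.save("park_with_hole.png")  # PNG preserves the alpha channel

The resulting file can be sent as the image argument on its own; the endpoint treats the fully transparent region as the area to edit.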


In the docs it says that the only required field is image (if a mask is not provided, then the image itself should have some transparency). So I tried uploading an image, but to have an area changed dynamically, what approach should I be using? A user might upload a park and just prompt, “I want the swing to be replaced by a see-saw.”

Should the image that the user sends already have some transparency? Is that the case?


Ah, yeah, you can’t pinpoint an area like that automatically. They would need to “highlight” or “box” the area, then you would save that and use it… Sorry, the tech isn’t there yet…

One of the ways I used it in DALL-E 2 was to have a “template” for a circle or shield, so the generated image would be displayed in just that area (circle-shaped or shield-shaped)… But yeah, you can’t dynamically grab it from natural language (yet)…
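For reference, a sketch of how such a circular template could be built with Pillow (sizes and paths are assumed, not the exact code used):

    from PIL import Image, ImageDraw

    size = (512, 512)

    # Opaque canvas with a fully transparent circle in the middle;
    # the edit endpoint will only repaint inside the circle.
    template = Image.new("RGBA", size, (255, 255, 255, 255))
    draw = ImageDraw.Draw(template)
    draw.ellipse((128, 128, 384, 384), fill=(0, 0, 0, 0))
    template.save("circle_template.png")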


So, in the case where we do not want to upload a mask, we just need to highlight (erase) an area of the current image and then pass it to DALL-E.

And in the case where I use both a mask and an image, does the mask have to be a highlighted view of the same image, or can it be any image with an area highlighted, with the two then merged?

Thank you

  • Paste the image in Photoshop as a layer with no background (or just turn off the background layer).
  • Pick the eraser tool.
  • Set it to a “pencil” type brush for 100% erasure with no fade.
  • Start erasing; the colors are replaced with the transparency grid.
  • Save as PNG-32 with transparency (alpha channel).

That’s the type of image you can upload alone to the image edit endpoint.
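As a sanity check before uploading, you can verify with Pillow that the file really carries an alpha channel with some fully transparent pixels (the path here is hypothetical); uniform partial transparency is not enough:

    from PIL import Image

    img = Image.open("erased.png")
    print(img.mode)  # should be 'RGBA'

    lo, hi = img.getchannel("A").getextrema()
    print(lo, hi)    # lo should be 0, i.e. at least some pixels are fully erased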

Or you have someone on your web page paint with a delete brush, but they are actually painting a 1-bit mask image that you convert to 32-bit RGBA (or mux the alpha layer into a 24-bit image).
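A sketch of that conversion with Pillow, assuming the painted mask arrives as a black-and-white image where white means “erase” (both file names are illustrative):

    from PIL import Image, ImageOps

    image = Image.open("photo.png").convert("RGBA")
    painted = Image.open("painted_mask.png").convert("L")  # white = erase

    # Alpha must be 0 where the user painted and 255 elsewhere,
    # so invert the mask and mux it in as the alpha channel.
    image.putalpha(ImageOps.invert(painted))
    image.save("photo_with_hole.png")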

It is conceivable one could use a non-OpenAI vision model with grounding that can place bounding boxes around identified objects, but those are generally not as skilled as GPT-4 vision.
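If you did get box coordinates from such a model, turning them into a mask for the edit endpoint is straightforward. A sketch, assuming hypothetical coordinates and the same pre-1.0 openai library used above:

    from PIL import Image, ImageDraw
    import openai

    size = (512, 512)
    box = (120, 200, 260, 340)  # hypothetical detector output (left, top, right, bottom)

    # Mask: fully opaque everywhere, fully transparent inside the box.
    mask = Image.new("RGBA", size, (0, 0, 0, 255))
    ImageDraw.Draw(mask).rectangle(box, fill=(0, 0, 0, 0))
    mask.save("mask.png")

    response = openai.Image.create_edit(
        image=open("park.png", "rb"),
        mask=open("mask.png", "rb"),
        prompt="a see-saw where the swing was",
        n=1,
        size="512x512",
    )
    print(response["data"][0]["url"])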


I’m also unable to get image-edit to work.

Here’s my .ipynb:

URL='https://gist.github.com/p-i-/6e717099d712d6849dffce3725c65b90'

I generate an image (green ferret), then add an alpha channel, setting the top half to 0 (transparent/edit) and the bottom half to 255 (opaque/fixed). Then I invoke the image-edit API, asking it to make the ferret blue.
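For anyone trying to reproduce this, the alpha split described there looks roughly like the following Pillow sketch (paths are illustrative, not taken from the gist):

    from PIL import Image

    img = Image.open("green_ferret.png").convert("RGBA")
    w, h = img.size

    # Top half transparent (to be edited), bottom half opaque (kept as-is).
    alpha = Image.new("L", (w, h), 255)
    alpha.paste(Image.new("L", (w, h // 2), 0), (0, 0))
    img.putalpha(alpha)
    img.save("ferret_top_transparent.png")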

But what I get back is the original image. EVERY TIME. I have tried MANY variations.

This is NOT a good experience.