Using image URL in images/edits request

Yeah, images can be tricky. Hoping you get it sorted.

@raymonddavey and I are almost positive there’s an issue on OpenAI’s end: either the endpoint is broken, or something vital is missing from the docs. We tinkered and tinkered last night, but nothing worked in the end. Hopefully we hear something from the devs about it before too long.

1 Like

You can specify the response_format as "b64_json" to get the generated image returned as base64 data.

  const response = await openai.createImage({
    prompt: prompt,
    n: 1,
    size: "256x256",
    response_format: "b64_json",
  });

  const base64 = response.data.data[0].b64_json;

This base64 data can then be used to create a buffer, which can be passed into the API request:

  const buffer = Buffer.from(base64, "base64");
  // Set a `name` that ends with .png so that the API knows it's a PNG image
  buffer.name = "image.png";

  const response = await openai.createImageVariation(buffer, 1, "256x256", "b64_json");

  const base64Data = response.data.data[0].b64_json;

Full code to create a new image and a variation of the generated image:

  const response = await openai.createImage({
    prompt: "digital artwork of a robotic lion wearing a crown",
    n: 1,
    size: "256x256",
    response_format: "b64_json",
  });

  const image = response.data.data[0].b64_json;

  // This is the Buffer object that contains the generated image base64
  const buffer = Buffer.from(image, "base64");
  // Set a `name` that ends with .png so that the API knows it's a PNG image
  buffer.name = "image.png";

  const variationResponse = await openai.createImageVariation(buffer, 1, "256x256", "b64_json");

  const variation = variationResponse.data.data[0].b64_json;

Hope this helps :slight_smile:

5 Likes

I wish more people here would reply like you @swiftrees, with working code, API responses, and error messages, instead of opinions and theories.

Well done!

Please stick around and keep posting working solutions in code!

:+1:

4 Likes

Thanks for sharing. I’ve tried using your strategy, but I get an error requesting the PNG be RGBA as opposed to RGB. I can’t quite figure out how to add the alpha channel.

edit: I see now openai.createImageEdit() requires the image to have transparency if a mask is not provided.
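
In case anyone else hits this: one way to add the alpha channel in Node seems to be the sharp library’s ensureAlpha(); a rough sketch, with placeholder file names:

  import sharp from "sharp";

  // Add an (opaque) alpha channel to an RGB PNG so the edits endpoint accepts it;
  // you still need to erase the region you want regenerated
  const rgbaBuffer = await sharp("input.png")
    .ensureAlpha() // no-op if the image already has an alpha channel
    .png()
    .toBuffer();

  await sharp(rgbaBuffer).toFile("input-rgba.png");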

Based on all of these experiments, does anyone know if I can pass in a base64 data URL as the image parameter for edits or variations?
Unfortunately, I’m not able to upload files directly in the fetch request, so I can only use either URLs or Base64 strings.

Any help would be appreciated.
Thank you!

OK, so I’ve managed to get mine up and running by creating a mask exactly the same size as the original image, with transparency in the mask in the area where you want the most modification to take place.

Here is what’s in my payload; everything in it seems correct:

Content-Disposition: form-data; name="prompt"
make the windows of the door in the image bright yellow

Content-Disposition: form-data; name="n"
1

Content-Disposition: form-data; name="size"
256x256

Content-Disposition: form-data; name="image"; filename="fauvism.png"
Content-Type: image/png
(I can only upload one image as a new user, but this image is the same as the mask, just complete, without the transparency.)

Content-Disposition: form-data; name="mask"; filename="mask.png"
Content-Type: image/png
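
For reference, here is roughly how that payload could be assembled with the built-in fetch and FormData in Node 18+ (a sketch; the field values are the ones from my payload above, everything else is assumed context):

  import fs from "node:fs/promises";

  const form = new FormData();
  form.append("prompt", "make the windows of the door in the image bright yellow");
  form.append("n", "1");
  form.append("size", "256x256");
  form.append("image", new Blob([await fs.readFile("fauvism.png")], { type: "image/png" }), "fauvism.png");
  form.append("mask", new Blob([await fs.readFile("mask.png")], { type: "image/png" }), "mask.png");

  const res = await fetch("https://api.openai.com/v1/images/edits", {
    method: "POST",
    headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
    body: form, // fetch sets the multipart boundary automatically
  });
  const json = await res.json();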

This is a great thread, ty community!

Working with images is my next goal for my personal project. Recently I was able to integrate Google image search so that responses can have images embedded in them, and now I am looking to also utilize Google’s Vision API to feed back into the workflow.

I figure the actual coding for handling images is modular enough that I hope I won’t find it too difficult to use any number of APIs alongside future modifications.

1 Like

How do I convert ImageData into b64 or a blob? I am getting the error “Missing image file in request. (Perhaps you specified image in the wrong format?)” while sending ImageData as the image to ImageEdit. Since I am sending FormData(), I think the error is because I am not actually sending a file.

I’d try saving it as a PNG locally and then sending that… or use the OpenAI library, which makes it a lot easier…
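
If you’re in a browser, one way to turn ImageData into a PNG Blob is to draw it onto a canvas first. A rough sketch, assuming imageData and formData are the objects you already have (imageDataToBlob is a hypothetical helper):

  // Paint the ImageData onto an offscreen canvas, then export it as a PNG Blob
  function imageDataToBlob(imageData) {
    const canvas = document.createElement("canvas");
    canvas.width = imageData.width;
    canvas.height = imageData.height;
    canvas.getContext("2d").putImageData(imageData, 0, 0);
    return new Promise((resolve) => canvas.toBlob(resolve, "image/png"));
  }

  const blob = await imageDataToBlob(imageData);
  formData.append("image", blob, "image.png"); // a .png filename helps the API detect the format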

Hi! Could you shed some light on how to send a local image or which OpenAI library to use for sending an image as an input? Thank you!

My previous solution no longer seems to work with the updated Node.js library.

After updating the openai library (v4.16.1), attempting to pass a buffer to the edit / extend endpoints wasn’t working at all; I wasn’t even getting any response or error.

In addition to my previous post, to use base64 data / a buffer, you would now need to write a file from the buffer and then create a read stream from the file.

  import fs from "fs";

  const buffer = Buffer.from(image, "base64");

  // Write the buffer to a file, then create a read stream from it
  const imagePath = `./uploads/image.png`;
  await fs.promises.writeFile(imagePath, buffer);

  const imageStream = fs.createReadStream(imagePath);

  const response = await openai.images.edit({
    image: imageStream,
    prompt,
    mask: maskStream, // a read stream created the same way as imageStream
    n: number,
    size: Size[quality],
    response_format: "b64_json",
    user,
  });
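
Alternatively, the v4 library exports a toFile helper that can wrap a Buffer directly, which might let you skip the temp file entirely; a sketch I haven’t fully tested:

  import OpenAI, { toFile } from "openai";

  const openai = new OpenAI();

  // Wrap the Buffer directly instead of round-tripping through the filesystem
  const imageFile = await toFile(Buffer.from(image, "base64"), "image.png");

  const response = await openai.images.edit({
    image: imageFile,
    prompt,
    response_format: "b64_json",
  });
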
1 Like

I am trying to get DALL·E to spruce up a map I create, using the openai.images.edit function. I am calling OpenAI from a Supabase edge function, so it’s Deno, which is a bit different from Node. After days of fiddling, I finally managed to avoid having to save my file anywhere. I take the image from the canvas like so:

canvasRef.current?.toDataURL('image/png');

I then have this function:

const b64toBlob = (b64Data: string, contentType = '', sliceSize = 512) => {
  // Decode the base64 string, then build up the bytes in slices
  const byteCharacters = atob(b64Data);
  const byteArrays = [];

  for (let offset = 0; offset < byteCharacters.length; offset += sliceSize) {
    const slice = byteCharacters.slice(offset, offset + sliceSize);

    // Convert each character of the slice to its byte value
    const byteNumbers = new Array(slice.length);
    for (let i = 0; i < slice.length; i++) {
      byteNumbers[i] = slice.charCodeAt(i);
    }

    const byteArray = new Uint8Array(byteNumbers);
    byteArrays.push(byteArray);
  }

  // Assemble the slices into a single Blob with the given MIME type
  const blob = new Blob(byteArrays, {type: contentType});
  return blob;
}

And I use it like this:

const base64Image = b64.split(',')[1]; // remove data URL scheme
const blob = b64toBlob(base64Image, 'image/png');
const imageUpload = await toFile(blob, 'image.png')

toFile is exported from the OpenAI package.
In my case I then try it like this:

const result = await openai.images.edit({
    image: imageUpload,
    mask: imageUpload,
    prompt,
  });

I have tried with and without the mask, but I just get my original image back.
Anyway, I just wanted to mention the toFile function as that seems rather useful :slight_smile:

1 Like

I don’t know if it is of much help because this is Python, but:

from datetime import datetime
from openai import OpenAI, BadRequestError

import requests
import time
from PIL import Image as pil

# Opening OpenAI client
client = OpenAI()
# defaults to getting the key using os.environ.get("OPENAI_API_KEY")
# if you saved the key under a different environment variable name, you can do something like:
# client = OpenAI(
#   api_key=os.environ.get("CUSTOM_ENV_NAME"),
# )
# or use it in raw coding like:
# client = OpenAI(
#   api_key="<Your API key here>",
# )
# Just remember that putting your API key directly into the code is dangerous if you are going
# to share your code; many articles, YouTube videos, and tutorials teach you how to define
# an environment variable if needed.

# Setting a loop, so the user can generate images indefinitely
while True:
    prompt = input("\nEnter a prompt: ")  # ask the user for the prompt to use in the request

    try:
        response = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
            n=1,
            size="1024x1024",
            quality="standard",
        )  # function to generate the image; the response contains the url of the generated image
        time.sleep(50)  # Wait for 50 seconds, enough time for the image to be generated
        # defining datetime for creating file names
        now = datetime.now()
        for img in response.data:  # response.data is a list of objects with the urls of each generated image.
            file_name = f'image_{now.strftime("%Y%m%d_%H%M%S")}.png'  # using datetime to create the file names.
            with open(file_name, 'wb') as f:
                f.write(requests.get(img.url).content)  # using requests to download the image from the url and write it to a file
            pil.open(file_name).show()  # then, showing the file when it's all done with Pillow.
    except BadRequestError:
        print("\nAn error has occurred, try again:")

Basically, this program asks for a prompt and then uses it in the API request. It waits for some time just to be sure the AI has made the image, and then uses the datetime library to create a unique name for your file. It uses open (Python’s built-in function for working with files) in write-bytes mode to save the image fetched by the requests library: the line f.write(requests.get(img.url).content) uses that library’s get method to download the image from the url provided in the API response and writes its content to a file on your computer. Finally, it uses the Pillow library to open the image automatically as soon as it is saved.