DeepLearning AI says OpenAI rolled out Dalle-3 API access?

I just received the latest email from “The Batch” newsletter published by DeepLearning AI. It had this text in it:

“Upgrades and more: The company rolled out the upgraded GPT-4 Turbo (which now underpins ChatGPT). It extended API access to its DALL·E 3 image generator, text-to-speech engine, speech recognition, and agent-style capabilities. And it showed off a new concept in chatbots called GPTs.”

Does that mean there’s a route to us devs getting API access to Dalle-3 now? If so please leave a doc link. I’m eagerly awaiting such access!


Yup, it’s now available via API.

I’ve hooked it up here…

I’m getting a bit better quality consistently tonight after changing DALLE2 prompts to DALLE3-friendly prompts. Good stuff, though.

The OpenAI DALLE3 CookBook page has some good details to get you started.


Thanks! This is great news. Can you outline roughly any changes you made to your interface code over what you had working with Dalle-2? Any new or changed parameters? I have a working virtual world app with an in-world chat window to display generated images in the world. I’m wondering how many hoops I’ll have to jump through to get my current code to work with the new API.

Ouch. No image variations yet. Hope they roll that out soon:

"The only API endpoint available for use with DALL·E-3 right now is Generations (/v1/images/generations). We don’t support variations or inpainting yet, though the Edits and Variations endpoints are available for use with DALL·E-2."


There are changes for sure. Edits and Variations aren’t supported for DALLE3 yet. You set the model parameter to whichever one you want. I kept a DALLE2 version of the tools and spun off a new DALLE3 version, as DALLE3 has no small sizes, among other differences. The cookbook is a good rundown and the docs are mostly up to date.

It shouldn’t be too much work but some work…
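
For anyone porting DALLE2 code, here is a minimal sketch of how the per-model differences could be checked before a request goes out. The helper name `build_generation_params` and the `MODEL_LIMITS` table are made up for illustration; the sizes and limits reflect the docs at the time of the DALLE3 API launch and may well change, so treat them as assumptions.

```python
# Sizes and per-request image limits as documented at the DALL-E 3 API launch.
# These are assumptions for illustration; check the current docs before relying on them.
MODEL_LIMITS = {
    "dall-e-2": {"sizes": {"256x256", "512x512", "1024x1024"}, "max_n": 10},
    "dall-e-3": {"sizes": {"1024x1024", "1792x1024", "1024x1792"}, "max_n": 1},
}


def build_generation_params(model, prompt, size="1024x1024", n=1, quality=None):
    """Return keyword arguments for an image generation request (hypothetical helper)."""
    limits = MODEL_LIMITS.get(model)
    if limits is None:
        raise ValueError(f"unknown model: {model}")
    if size not in limits["sizes"]:
        raise ValueError(f"{model} does not support size {size}")
    if not 1 <= n <= limits["max_n"]:
        raise ValueError(f"{model} allows n between 1 and {limits['max_n']}")
    params = {"model": model, "prompt": prompt, "size": size, "n": n}
    # quality was introduced with DALL-E 3, so it is only passed through for that model.
    if quality is not None:
        if model != "dall-e-3":
            raise ValueError("quality is a DALL-E 3 option")
        params["quality"] = quality
    return params
```

The returned dict can be unpacked straight into whatever client call you already use for Generations, which keeps the DALLE2 and DALLE3 paths in one place.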

Thanks. When you said:

“I’m getting a bit better quality consistently tonight after changing DALLE2 prompts to DALLE3-friendly prompts. Good stuff, though.”

Are you referring to the automatic GPT-4 assisted prompt rewriting feature you currently can’t turn off with the Dalle-3 API (as stated in that doc you linked me), or some manual process you are employing? If it’s a manual process, are you simply using intuition to do this or are there docs/links that give tips on writing Dalle-3 friendly prompts?


We had an OpenAI employee on the DALLE team stop by recently…

It’s still hit or miss, but it sounds like they want to give us control.

I’ve been prompting since early GAN, so I get gut feelings on prompts sometimes. Though some of the more extreme ones are giving me problems…





… some of my “styles” always add text no matter what, but I’m slowly getting the model to do it less.

If OpenAI really wants us to have as much control as possible (which makes sense), I’m sure it will improve - maybe even better seed handling. They want the tool to be as useful as it can be as much as we do, I think, but they’re erring on the side of caution when it comes to safety. (Remember Tay…)

ETA: What I’ve noticed so far is that natural language works a lot better than “prompt whispering” with “secret words and phrases” etc. - just tell it what you want in as much detail as possible…


How did you manage that? Was gold bullion involved? :smile:

I’m very fine with that! Remember, I’m relaying the prompts created by non-technical users who just want to type and go.


Yeah, it’s why I’ve broken it down to just dropdowns and click a button. Choose what style you want and the character, and get it.

No comment but GPUs are the new gold bullion! :wink:

Seriously, though, OpenAI employees pop in now and again. I’d rather they stay mostly busy with improving tools and coming out with new stuff, but it is appreciated when they stop by to enlighten us officially.

If you run into any specific problems, come back to let us know, but it’s fairly straightforward.


Just a follow-up on the tweaking thing. Yes, I have mixed feelings on this. For example, the Leonardo and Stable Diffusion APIs have a legion of parameters for style, engine select, on and on for their API calls. I swear there’s a market to create a chat-bot just to help devs use them!

The API “parameter swamp” appeals to me as a dev because I believe most of us programmers love parameter tweaking. But that’s out the window for the average user. Hell, even I got really tired of it after a while, especially since the WYSIWYG or “do what I mean” ratio on these features is really low with nearly all the AI gen services I’ve used, the text-to-video ones in particular. It feels more like alchemy or voodoo than science.

BTW, did that OpenAI staffer talk at all about any plans for text-to-video in the near future?

Sorry if this link is already known. You can check the DALL·E 3 API here.

Note that the parameter format is slightly different from the one used via ChatGPT Plus.

API:
model="dall-e-3",
prompt="a white siamese cat",
size="1024x1024",
quality="standard",
n=1,

ChatGPT Plus:
{
  "size": "1024x1024",
  "n": 1,
  "prompt": "a white siamese cat",
  "referenced_image_ids": ["gen_id"]
}

You can select the quality in the API, but it is unclear whether “referenced_image_ids” can be used there. ChatGPT Plus is the opposite: “referenced_image_ids” is available, but there is no quality setting.

However, the current situation is extremely fluid, so it is best to check the references frequently.
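
To show how the API parameters listed above fit into an actual call, here is a rough sketch using the `images.generate` method of the official `openai` Python package (v1 or later). The wrapper `generate_image` is my own name, not part of the library, and the `revised_prompt` field is read defensively since it only applies to DALL·E 3's automatic prompt rewriting.

```python
def generate_image(client, prompt, size="1024x1024", quality="standard"):
    """Request one DALL-E 3 image and return (url, revised_prompt).

    `client` is expected to be an openai.OpenAI instance; this wrapper is a
    hypothetical helper for illustration, not a library function.
    """
    response = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        size=size,
        quality=quality,
        n=1,  # DALL-E 3 accepts only one image per request
    )
    image = response.data[0]
    # revised_prompt is the rewritten prompt the model actually used, when returned.
    return image.url, getattr(image, "revised_prompt", None)


# Usage (needs `pip install openai` v1+ and the OPENAI_API_KEY environment variable):
#   from openai import OpenAI
#   url, revised = generate_image(OpenAI(), "a white siamese cat")
#   print(url)
#   print(revised)
```

Comparing `revised` against your original prompt is a quick way to see how much the automatic rewriting changed what you asked for.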


I haven’t heard anything. I helped beta test DALLE2.experimental (which eventually became DALLE3 after our feedback this summer)… They’re likely working on stabilizing post-launch and adding to Labs, hopefully making edit and variations available for DALLE3, etc. I got to Alpha test the ChatGPT Plus launch which was kind of wild as we played around with it for a weekend…


@hayashi.4417 @PaulBellow

Oh ratfink! You can only do one image at a time with Dalle-3:

"You can request 1 image at a time with DALL·E 3 (request more by making parallel requests) or up to 10 images at a time using DALL·E 2 with the n parameter."

https://platform.openai.com/docs/guides/images?context=node

Yes, I see the note on making parallel requests, but that means an awkward restructuring of my code to handle this.
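
The restructuring might be smaller than it sounds. Below is a minimal sketch of the parallel-requests approach using a thread pool from the standard library. `generate_many` and its `generate_one` argument are hypothetical names: `generate_one` stands for whatever function in your code already makes a single DALL·E 3 request and returns its result.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_many(generate_one, prompt, count, max_workers=4):
    """Call generate_one(prompt) `count` times in parallel; return results in submit order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(generate_one, prompt) for _ in range(count)]
        # result() re-raises any exception from a worker, so failures are not silently dropped.
        return [future.result() for future in futures]
```

Code that used to pass `n=4` to DALL·E 2 could call `generate_many(my_single_request, prompt, 4)` instead and get back a list of four results. Keep `max_workers` modest, since each parallel request still counts against your account's rate limits.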