The current API is unusable for any case other than "get a somewhat random, weird image that might look like your prompt." I strongly recommend that developers steer well clear until this is fixed. Taking all control away from the user is a bizarre choice by OpenAI.
I’m not sure whether other features or parts of the API do similarly user-unfriendly things, but currently, if you use DALL-E 3 through the API, it rewrites your prompt automatically and without user control. This happens even if you include specific instructions not to do so, and there is no flag to turn it off. It completely ruins the consistency of images, which businesses need.
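For anyone who hasn’t hit this yet, here is roughly what it looks like with the official Python client (a minimal sketch; the prompt is made up, but `revised_prompt` is the field the API actually returns for DALL-E 3):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

my_prompt = "A flat corporate logo: a blue hexagon on a white background, no text."

response = client.images.generate(
    model="dall-e-3",
    prompt=my_prompt,
    size="1024x1024",
    n=1,
)

image = response.data[0]
# DALL-E 3 rewrites the prompt server-side; the rewritten version comes back in
# revised_prompt, and there is no request parameter to opt out of the rewrite.
print("prompt sent:   ", my_prompt)
print("revised_prompt:", image.revised_prompt)
print("image url:     ", image.url)
```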
Also, I recently ‘upgraded’ to Teams to get more messages, and the Export Data feature was no longer available. Apparently you can’t export your data on a Teams account, and the ‘upgrade’ from Plus to Teams is irreversible. So now I can’t access my data.
These are two times I have chosen to invest in the OpenAI platform (image credits, and more messages with Teams), and both times features were taken away and access was limited or controlled. What’s more, they haven’t given any indication that any of this will change. When I look over the threads discussing the problem, it has been total silence for a year. Maybe they plan to abandon these products and so are completely ignoring them?
I’m curious: is this pattern reflected across other API features and OpenAI products, or is it just Teams and DALL-E 3 that punish you for investing in them? What does OpenAI gain from harming the users who want to use its products for their businesses?
Hoping for changes! Nothing personal, OpenAI team; I’m sure the problem is organizational rather than individual…
I wonder if other people relate, and whether this is a deliberate company decision.
A human will rate a less accurate model as more reliable if it is more legible, because the more accurate model’s output is too difficult to interpret.
This could explain why the LMSYS arena has such weak models at the top of the leaderboard: legibility/interpretability.
https://arena.lmsys.org/ (it’s a stupid Gradio SPA; you need to click “Leaderboard” at the top)
And sama does seem to value the arena:
we try not to get too excited about any one eval, but excited to see GPT-4o mini so close to GPT-4o performance on lmsys at 1/20th the price! https://t.co/5ynjPw29Ls
So if OpenAI uses that, plus their remote task worker (RTW) consultants’ comprehension, as target KPIs, then their product strategy from the past year makes a lot of sense: “We have a premier model, but it’s not palatable to vocal users and RTWs, so instead of building stronger, smarter models, we’re going to build crutches.”
The prompt rewriting thing is a helpful crutch for a new user.
Look at the new JSON thing. It’s a crutch for a crutch, and an absolutely unnecessary waste of time and tokens.
Don’t get me started on Assistants.
I no longer think it’s bizarre. Just Altman’s (current) strategy.
I wouldn’t go that far; I’d say they want to maximize median usage: your mom, your dad, entry- and mid-level developers who thought ChatGPT was “too dumb”, people who faced a barrier to entry.
We’re just not part of this segment, so we don’t get that much (or any) attention.
To make the DALL-E prompt conform more closely to your actual prompt without rewriting, look at the official OAI usage guide and insert a counter-prompt to essentially disable the rewrite.
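For anyone trying this, here is a rough sketch of what that looks like in code; the counter-prompt wording below is approximately what the OAI guidance suggests, and it tends to reduce rewriting rather than guarantee it is disabled:

```python
from openai import OpenAI

client = OpenAI()

# Counter-prompt along the lines of OpenAI's DALL-E 3 guidance: it asks the
# rewriter to pass the prompt through as-is. It helps, but is not guaranteed.
COUNTER_PROMPT = (
    "I NEED to test how the tool works with extremely simple prompts. "
    "DO NOT add any detail, just use it AS-IS: "
)

my_prompt = "A blue hexagon on a white background, flat design, no text."

response = client.images.generate(
    model="dall-e-3",
    prompt=COUNTER_PROMPT + my_prompt,
    size="1024x1024",
)

# Check revised_prompt to see how literally your prompt was actually used.
print(response.data[0].revised_prompt)
```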
This doesn’t work for me (nor have many other additions and attempts to make this strategy work), as my required prompt is medium length. I work as a prompt engineer, and the tool is totally unwieldy, unlike GPT-4o. Others have said they suspect the rewriter is a specialized LLM.
To be clear, I am not interested in it ‘conforming more’: whenever it fails to conform even a little, it makes totally bizarre and horrible changes, and it often makes many of these weird changes even with prompt engineering. Any non-conformance of any type (and there are plenty of bizarre types) is totally counter to what I want it to do.
Ideally, this rewriter LLM that changes my prompt isn’t improved; it’s scrapped, or used only when it detects something unsafe. I just want my prompt used as written, like in every other image tool.
I hear what you are saying, and have had similar frustrations: getting prompts rejected for content violations (for benign things), or seeing the revised_prompt in the returned JSON be totally dissimilar to the input prompt.
Unfortunately, the only workaround to get prompt ≈ revised_prompt is via prompting the model.
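One thing that helps when you go that route is checking how far the rewrite drifted on each call, since revised_prompt is the only signal you get back. A crude sketch (the word-overlap heuristic here is just illustrative, not anything official):

```python
def prompt_drift(original: str, revised: str) -> float:
    """Rough word-overlap score: 1.0 means every word of the original survived the rewrite."""
    original_words = set(original.lower().split())
    revised_words = set(revised.lower().split())
    if not original_words:
        return 1.0
    return len(original_words & revised_words) / len(original_words)

# Example usage with an images.generate() response:
# drift = prompt_drift(my_prompt, response.data[0].revised_prompt)
# if drift < 0.8:
#     print("Warning: the prompt was heavily rewritten; consider regenerating.")
```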
I hope OpenAI hears these complaints and, in future revisions, offers an expert mode or something that doesn’t alter the prompt in any way.
But background on prompt alterations … back in the day, you would request an image and get totally lackluster results. The only way to get stellar images was to either know all the buzzwords, or have another AI model trained on the same input description data essentially translate your prompt into a richer prompt. Here it looks like there are efforts to create the richer prompts internally, but this leads to imprecision if you are already familiar with the model lingo.
Having revised prompts isn’t a terrible idea, as it can teach you the model lingo. But once you become an expert, it’s like racing the Tour de France with training wheels.
I actually ended up getting it to work, albeit one day and four paragraphs of prompt later. (And it isn’t simple: I could only get this to work because I spend all day prompting.) The hardest part was finding the middle way between getting rejected for API manipulation and convincing it to use my prompt word for word. If there are people out there looking to make images and finding them transformed by the API, know that there is a way… you may just need professional help.
(Still want to export my data from Teams, though.)