Has anyone succeeded in creating the same behaviour as the ChatGPT prompting for DALL-E 3?
In ChatGPT it memorises the context of the chat and the previous generations so you can keep building on your image and ask it to alter the image/prompt, but the API docs just refer to writing the text yourself.
Now, I know I can write a script asking GPT-4/GPT-3 for prompts and then injecting them as text, but we all know GPT sometimes fails or behaves randomly in ways that can create errors, and I don't believe that amounts to good integration. I am assuming ChatGPT is optimised to do the prompting and the memorising, and there it works fine. I can try to get similar results, but it is hard to replicate this behaviour in the API without fully understanding ChatGPT's backend, unless I am missing something.
Does this just mean I need a model with DALL-E's constraints as its default output? Is the fact that GPT alters the text a tiny bit enough? Which model matters for memorising chat context, DALL-E or GPT? How do I replicate this middle ground between the two models?
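To be concrete, the workaround I mean looks roughly like this (a rough sketch assuming the openai Python SDK; there is no real memory involved, the history list is simply replayed on every call):

```python
from openai import OpenAI

client = OpenAI()
history = []  # full chat history, re-sent each turn so GPT keeps the context

def request_image(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    # Ask GPT-4 to turn the whole conversation into one self-contained prompt
    rewrite = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "system",
            "content": "Turn the conversation so far into one detailed, "
                       "self-contained DALL-E 3 prompt. Reply with the prompt only.",
        }] + history,
    )
    prompt = rewrite.choices[0].message.content
    # Keep the generated prompt in the history so the next turn can alter it
    history.append({"role": "assistant", "content": prompt})
    return client.images.generate(model="dall-e-3", prompt=prompt).data[0].url
```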
That is not what I mean; I know what a seed is. Is it the context_ID thing that makes a series of images similar?
I am not referring to making images from the same prompt differ by changing the seed; I mean keeping the context of the images already generated.
I have to ask like this because there are different methods depending on the context you need, including which methods you use. It is like what I did before creating my picture: I take the prompt itself and build an AI that creates images from that prompt, so that the resulting image and the subsequent edits are easier to control.
ChatGPT has a system message directive to rewrite DALL-E prompts for you, including diversity, avoiding living artists and copyrighted periods, etc.
The API has the same, but there is no chat history distracting the AI from its mission; the API's prompt rewriter has only one job. The content filtering is so aggressive that the verbatim OpenAI instructions are themselves rejected (for their mentions of races and violence; they likely did not anticipate my direct API jailbreaks), and one has to use word-limiting techniques to get them slightly summarized and rewritten for reveal. The AI's instruction also "enhances": it turns API parameters into words and brevity into hallucinations.
Knowing all the prompt text isn't useful for answering this topic.
ChatGPT has no "subsequent requests easier to control" feature; it has no seed parameter, and the seed is no longer exposed for changing, etc. The only thing ChatGPT can do is submit a previous image generation's "ID", along with its new prompt, to its version of DALL-E 3 when the user discusses an earlier image. How that ID informs the new generation is unacknowledged, and one must speculate whether it does some seed reuse, common prompting, or even a similar-embedding override.
Yes, this is the functionality I noticed and am referring to; I wondered how to do it in the API.
I do not agree that prompting for race does not work. In fact, I started having issues with this when a series of my prompts involved street/techwear outfits: it started creating Middle Eastern people, and in a lot of these images the women wore hijabs, which I did not like and which looked horrendous. Sometimes it was workable and looked OK, but the whole diversity thing can really ruin some images when the context is inappropriate for it; it's simply not what I asked for. Prompting it out did not work until I specifically asked for white, Asian, Brazilian, etc., and then they disappeared.
Is this regular GPT or a custom build? You mean I can just chat with DALL-E? That is, it supports a conversation and is not just input text / output image? The code, builds, docs and GitHub repositories I found only seemed to have the latter functionality.
Is image generation possible with a regular GPT request through the API? Is that what you did there?
I'm referring to the trigger-happy filtering that prevents me from using "not bug bounty" techniques in a straightforward manner to break AI brains and extract programming language back out of an AI that makes images.
If you make an AI retype what OpenAI wrote about all those races and violence (just for diversity and safety), the word combination of all those ethnicities alone is flag-worthy.
That's one degree beyond getting the AI to repeat the image prompt exactly, where the DALL-E developers actually give you language to jailbreak with: the very techniques that allow me to discuss how it actually works.
I used the Chat Completions API and then call the DALL-E 3 API through function calling. The chat holds the context. DALL-E 3 returns the prompt it actually used when there is a revision, etc. Check revised_prompt in the return value if you want to use it as a reference.
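A minimal sketch of that setup, assuming the openai Python SDK v1; the tool name create_image and the system message are my own choices, not anything official:

```python
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "create_image",
        "description": "Generate an image with DALL-E 3 from a detailed prompt.",
        "parameters": {
            "type": "object",
            "properties": {
                "prompt": {
                    "type": "string",
                    "description": "A full, self-contained image prompt.",
                },
            },
            "required": ["prompt"],
        },
    },
}]

messages = [{
    "role": "system",
    "content": "You turn the conversation into DALL-E 3 prompts and call create_image.",
}]

def chat_turn(user_text: str) -> None:
    messages.append({"role": "user", "content": user_text})
    reply = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)
    msg = reply.choices[0].message
    if not msg.tool_calls:
        messages.append({"role": "assistant", "content": msg.content})
        return
    messages.append(msg)  # keep the assistant's tool call in the history
    for call in msg.tool_calls:
        prompt = json.loads(call.function.arguments)["prompt"]
        image = client.images.generate(model="dall-e-3", prompt=prompt)
        # revised_prompt is the prompt DALL-E 3 actually used; feeding it back
        # into the chat is what lets the next turn build on it.
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps({
                "url": image.data[0].url,
                "revised_prompt": image.data[0].revised_prompt,
            }),
        })
    # let the model see the tool output and answer the user
    final = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)
    messages.append({"role": "assistant", "content": final.choices[0].message.content})
```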
I'm building something similar → a design tool which mimics the ChatGPT-4 interface, i.e. users can chat to create designs and edit them as they like while they go on chatting.
For example:
User response 1: "create an image of a boy playing on a beach"
System response 1: [generated image]
User response 2: "Make the boy look older, but maintain the same design style"
System response 2: [updated image in the same style, with an older boy]
How should I go about building this?
Right now I'm just opening up a text box for users and calling the DALL-E 3 API. To enable chat functionality, will I have to use the revised_prompt of the first image, call GPT to construct a new prompt from {user response 2}, and then submit this prompt to the DALL-E 3 API again?
I'm building a product which will design visuals but will mimic the ChatGPT-4 interface. I want the user to be able to chat and visualise the designs.
I have integrated with the DALL-E 3 API, but it is a single-shot response, i.e. it only visualizes the submitted prompt; there is no context of chat history. I'm able to do this with the ChatGPT interface (screenshot attached), but I'm not able to recreate it in my own product.
Essentially:
Step 1 → User response 1: "design a visual of a boy playing on a beach"
Step 2 → I then hit the DALL-E 3 API with this prompt to return the desired image.
Step 3 → User response 2: "remove the sand castles, maintain the same style"
— how do I give a new prompt to the DALL-E 3 API such that it creates a new visual from the above chat history?
Step 4 [DESIRED] → System generates a similar image without sand castles.
1. user: create an image of a boy playing on a beach
tool call: CreateImageUsingDallE3({ prompt })
tool output: { image, revised_prompt }
call chat completions and ensure that revised_prompt is included in the response
2. user: make the boy older, but maintain the same design style
tool call: CreateImageUsingDallE3({ prompt (should now contain an updated version of revised_prompt) })
tool output: { image, revised_prompt }
call chat completions again...
What is important is having the prompt that was used for the image creation in the context, so that the AI can build over it.
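For example, the tool-execution step could look like this (a sketch only; run_image_tool and the surrounding names are mine, not an official API):

```python
import json

# Assumes an OpenAI v1 client, the running `messages` list, and one pending
# `tool_call` produced by the model's function call.
def run_image_tool(client, messages, tool_call) -> str:
    prompt = json.loads(tool_call.function.arguments)["prompt"]
    image = client.images.generate(model="dall-e-3", prompt=prompt)
    # Return revised_prompt as the tool output: it lands in the chat context,
    # so the next tool call can update it instead of starting from scratch.
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps({
            "url": image.data[0].url,
            "revised_prompt": image.data[0].revised_prompt,
        }),
    })
    return image.data[0].url
```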
I'll join in; I have the same problem. I'm trying, with very little success, to write a program that generates bedtime stories for my son to read together in the evening. The stories are a success, so I decided to illustrate them, but I can't find a way to maintain consistency in the illustrations. The best result via the API was maintaining the style; everything else changes from one image to another. Although the protagonist uses the same descriptive prompt, there is always some difference. For a child they are still wonderful, but it's a style exercise that I have to be able to complete or my brain will burn thinking about it.
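For reference, what I am trying is roughly this: freeze the protagonist and the style as strings reused verbatim, and prepend the "I NEED ... AS-IS" line that the API docs suggest for suppressing the automatic prompt rewrite (a sketch only; the names and descriptions are made up):

```python
from openai import OpenAI

client = OpenAI()

# Freeze the protagonist and the style once, word for word; only the scene
# varies between illustrations.
CHARACTER = ("Leo, a five-year-old boy with short curly brown hair, "
             "hazel eyes, a red striped sweater and blue boots")
STYLE = "soft watercolor children's-book illustration, warm pastel palette"

def illustrate(scene: str) -> str:
    # The leading sentence is the documented hint for keeping DALL-E 3 from
    # rewriting (and thereby re-randomising) the prompt.
    prompt = ("I NEED to test how the tool works with extremely simple prompts. "
              "DO NOT add any detail, just use it AS-IS: "
              f"{STYLE}. {CHARACTER}. {scene}.")
    return client.images.generate(model="dall-e-3", prompt=prompt).data[0].url

print(illustrate("He builds a sandcastle on a sunny beach"))
```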