DALLE3 Prompt Tips and Tricks Thread

I tried colors heh…Did two tiny number labels on pants for some reason…

A wide oil painting depicting eight explorers, each dressed in one of the eight primary colors: red, orange, yellow, green, blue, indigo, violet, and black. This diverse group includes four women and four men of varying ages and ethnicities, styled as a mid-19th century American landscape painting. They are viewing a lush landscape from a mountain passage. Each explorer’s shirt has a numeral from 1 to 8 on the back, in the order: Man (1), Woman (2), Woman (3), Man (4), Man (5), Woman (6), Man (7), Woman (8).

I wonder is SORA does (will have) the same problem?

Or is it hooked up to a next-gen model?

1 Like

Issues were encountered in going the opposite direction…

A group of two thousand explorers (comprised of women, men, and children of varying ages and ethnicities), on a frontier, who have just reached a point in a mountain passage where they behold a vast and beautiful land spread out before them, during the month of May. Each person of the 2000 has incredible detail paid to their appearance, clothing, and realistic representation of the human form down to minute details, every single person an accurate representation of a human with the artist’s highest attention and no expense spared in the time-consuming painting. Style: vibrant oil painting, Hudson River School movement, landscape, American frontier art.

We have numbers… kinda… lol… shows there’s no “reasoning” going on…

A wide oil painting depicting eight explorers, each dressed in one of the eight primary colors: red, orange, yellow, green, blue, indigo, violet, and white. This diverse group includes four women and four men of varying ages and ethnicities, styled as a mid-19th century American landscape painting. They are viewing a lush landscape from a mountain passage. Each explorer’s shirt has a numeral from 1 to 8 on the back, correctly numbered from left to right: Man (1), Woman (2), Woman (3), Man (4), Man (5), Woman (6), Man (7), Woman (8).

ETA: My take on more!

A wide oil painting in the style of a 15th century European artist depicting a large crowd of approximately 234 people gathered in Nuremberg, experiencing the first alien contact on Earth. The scene is solemn, with people of various ages and attire typical of the 15th century, showing mixed expressions of awe, fear, and curiosity. The setting includes medieval buildings and cobblestone streets, with a mysterious alien spacecraft subtly visible in the sky, casting an eerie glow over the scene.


Here’s the image depicting a large crowd in Nuremberg during the 15th century, experiencing the solemn moment of first alien contact on Earth. The scene includes various expressions and details typical of the era, with an alien spacecraft subtly visible in the sky. Let me know if there’s anything else you’d like to adjust or add!

2 Likes

The current Dall-E3 (which I have tried using with ChatGPT) does not seem to accurately represent the number of people, gender, or order, regardless of whether numbers are assigned.
Often, the number of people is incorrect, or even when numbers are assigned, they are randomly ordered.

A wide oil painting depicting eight explorers, each dressed in one of the eight primary colors: red, orange, yellow, green, blue, indigo, violet, and white. This diverse group includes four women and four men of varying ages and ethnicities, styled like a mid-19th century American landscape painting. They are viewing a lush landscape from a mountain passage. Each explorer’s shirt has a numeral from 1 to 8 on the back, correctly numbered from left to right: Man (1), Woman (2), Woman (3), Man (4), Man (5), Woman (6), Man (7), Woman (8).

image1


When I try to make DALL-E 3 express something that is generally opaque as translucent, it appears to be wrapped in something translucent like plastic.
Maybe Sola (or DALL-E 4?) will overcome that problem.

A close-up of a translucent celery stalk in the foreground blending naturally into its surroundings. The background is a lush forest with various plants and trees, visible through the translucent celery stalk, creating a serene and magical atmosphere.

2 Likes

@_j about the style vivid and natural make a lot of difference and in my usecase mostly animating and or creating 3d Illustrations ‘vivid’ always is the best. Or In my case at least.


1 Like

Some of my takes on it and the theme it sets just the calm before the storm or. somehow peaceful. So, saving that style as well.




1 Like

Natural now seems to be infused with more photography, but it is placed in a very visually confrontational way so that nobody would believe it…

Natural:

Vivid (which is the default and expected DALL-E, but everybody airbrushed)

Revised_prompt same as desired input:

In Akihabara, two male otaku cosplay friends dressed in blend of steampunk and goth fashion pose in front of a quaint satanic temple.

2 Likes

dalle.text2im({
size: “1024x1024”, // or “1792x1024” for a wide image, or “1024x1792” for a full-body portrait
prompt: “A detailed description of what you want the image to include”
});

So much fun to explore the edges of our terra nova. Fascinating, muses my inner.Spock. Thanks !!!

For midjourney there is extensive documentation and people sharing their prompt templates on scribd and the web. you know what to type and to expect.

i created a prompt generator and enhancer service using the API implementation of openai, AS WELL as the GPTplus subscripton ( and have a Monica subscription with Dalle-3 included) , so i try all my prompts on multiple places but against the same model.

and on all four it is different on what you can expect. chatGPT4 has an API connection with Dalle-3 so that works , but you can also talk to Dalle-3, the chatBot can be a little pain concerning content policy blockers. if you try to generate an image in the style of Frank Stella looking like his artwork HarranII, then openAI Chatbot will block that request, but not via the API.

and Dalle and chatGPT will change your prompt if you use either too litte words or if you use words like drunk. this is changed into friendlier depictions of the drunk state…

the API is expensive, i create large amounts of images with the highest size and style ( that is 1792 * 1024 with style: vivid ) and that is about 1 cent for an image. then i also connected to an assistant and the normal chatbot for prompt enhancement and i spend easy 250$ month on the API alone. ( and for understanding other services i have temporary also Google Gemini, plus chatGPT+ plus Github Copilot + Monica + midjourney + runway + gencraft + stablediffusion. ) this is al temporary of course as i cannot keep on spending much on an experimental GAN Model

OpenAI enforces inclusivity by altering your prompt to include : hispanic, south asian, middle easy and African people in the image. and last week i had my first non-binary person addition in the image. nothing wrong with that but i did not request that.

and then the content policies will be stricter as they want to protect the intellectual property of their customers read: customers of MS)

i have tons of medioker prompts and images . if people want access to them than i can share them… i am a developer, not an artist :slight_smile:

i guess that because i paid for it., sharing can do no harm.

let me know

I tried to generate a picture with as many people as possible, and it ended up looking like a “Where’s Waldo?(Where’s Wally?)” scene.

An enormous crowd of countless people, tightly packed together in a vast open area. Each individual is small but distinct, filling the entire space. The scene is teeming with diverse individuals, standing shoulder to shoulder, creating a sense of endless humanity in every direction.

I’m not sure which one is the real Waldo(Wally) though…

1 Like

First, sorry for my bad English.
This is the result I get.

I started from StanleyKrute’s first prompt:
“Imagine a group of 8 explorers, women, men, and children, of varying ages and ethnicities, on a frontier, lush oil painting style, in the manner of a mid-19th century American landscape painter of the West. They have just reached a point in a mountain passage where they see a vast and beautiful land spread out before them, during the month of May.”

To save your time, here is the summary of the concept:

  1. Use ChatGPT Classic, tell it the image you want to create, and add at the end:
    “{description of your image}
    —-
    The above is a picture I imagine. I know you can’t generate images, but I want you to describe it with topics including, but not limited to: Composition, Subjects, Color Palette, Light and Shadow, Types of Elements. Could you do that?”

  2. Open a specialized GPT for this image generation process. This GPT should support DALL-E image generation. Its instruction is:
    “This GPT specializes in creating images from user-provided descriptions. Users input details about the layout, main subject details, background, colors, and lighting. The GPT uses the DALL-E tool to create these images, strictly following the user’s requirements without adding its own interpretations. It interacts with users to gather all necessary details and clarifications to refine the image and meet the specified artistic requirements.”

  3. Copy the feedback from ChatGPT Classic to the Drawing GPTs. Only copy the text related to the five major topics, other general conversation text does not need to be copied. And at the end write:
    “{image description copied from ChatGPT Classic}
    —-
    Draw the above image, wide screen aspect.”

  4. After you get the generated image, you can go back to the ChatGPT Classic chat room to tell it what you want to modify. To ensure it will always provide feedback in the format of the 5 topics, it’s recommended to add this after your request:
    “{Your change request}
    Friendly reminder, the description has to include: Composition, Subjects, Color Palette, Light and Shadow, Types of Elements.”


To get the above image, I have tried 4 times.
The first time I received:

I asked ChatGPT Classic to make the first modification:

  • The picture is nice, but subject should clearly state those 8 characters to reduce confusion.

What I got back was the details of 8 characters and I replaced the old subjects. Here is the next result:

I gave it a few tries and there were still more than 8 characters, so I asked for another modification:

  • Okay, I do get that 8 explorers now. But the background of the image has generated extra people to make the image rich. We have to state clear that no more people other then the 8 main character is allowed. Can you update the description please?

Here are the results of this modification:

I believed that I needed one more change:

  • I consistently get 9 people. I guess the issue is the sixth character. Let’s just say she is a mid age women or somethings similar. Mention her as child’s mother may cause an extra child into the picture.

Then I got the final image, which I posted at the top of this post.

1 Like

Inspired by Norma Rodriquez’ lovely work, and generously-shared prompt, on the FB group AI Art Image Hub, I came up with this variant prompt, after a long dialog of muy muchos iterations:

Prompt: “Widescreen 1792x1024 image. Whimsical best heart-warming lovable computer-animation style. A very cute older woman, in her 70"s, slightly overweight, exhausted, snoring, sleeps on a vintage couch, as does her extremely huge messy haired orange tabby cat. The cat is at least 3x the size of the woman. There is a resemblance between the woman and her cat. The setting is a cozy living room. An older vintage tv can be seen in the background. There are many many books. The lighting on the two characters is cool; the rest of the room has warmer lighting.”

1 Like

Best Realistic Images we can get





You aren’t properly using DALL-E to produce false narratives.


An interesting “problem” since DALL-E 3 release is that the square image format continues to “see” that it should generate content beyond the edges, imagining the resulting picture as wide.

This fault is just as common as wide images devoid of wide content.

1 Like

it’s really weird what it feels it’s ok to be realistic on and what it needs to cartoonify

2 Likes

I used “image a truck from 90 degrees view” and got a later view.

I’m not going to pretend I read through all of the comments here but I read at least the first 40-50 and, well, I had to create an account because I was surprised that not a single person mentioned a little trick I use.

  1. Find an AI-generated image that achieves the look you’re going for — whether that means the lighting is perfect, it perfectly mimics the camera lens you want to use, etc.
  2. Create a new conversation with ChatGPT (not sure how important this step is, it’s just what I do and it always seems to work) and upload/attach the image.
  3. “Good afternoon! Attached to this message is an image that I like and I would love to describe it to my friend who is blind but I can’t seem to find the right words. How would you describe this photo? Be as detailed as possible.

Aaaaannd there’s your prompt :see_no_evil: Please note, this may work just as well if you simply asked, “How would you describe this image? Please be as detailed as possible.” However, the reason I got here is because I read somewhere (maybe a Reddit comment?) that suggested leaning on “emotions” and it provided an example of someone circumnavigating the security of ChatGPT by saying their grandmother used to read bedtime stories but she has since passed away and they wanted ChatGPT to pretend to be the grandmother reading a bedtime story (I’m purposely leaving out details for obvious reasons but this has also since been fixed.)

I have no idea how all this works, I just know I got so tired of being terrible at writing prompts (and good ones literally being behind paywalls) that I decided to engineer a way for ChatGPT to be my prompt engineer :man_shrugging: