DALLE3 Prompt Tips and Tricks Thread

Thanks. Had a Discord art prompt take a crack at it too.

Interesting, though… probably the training like Jay said…

Just more proof that inpainting is sora-ley (lol) needed!

Here’s some beefcake for you.

The AI either refuses to connect the snakes to the body even after it gets the idea (I think the initial diffusion acts as a top-down renderer), or you get an even more dramatic demonstration of its need to get those human arms in there regardless, where you can almost imagine the thought process of disregarding the prompt because it needs to make sense of a person:


I tried to rip their arms off and replace them with snakes; that didn’t work either.

Thanks, guys. Homework finished! (Collect weird snake nightmare pics…)

Just kidding.

I think Jay’s right that it’s baked in somehow…

More people want clean arms/hands than snake arms, though, so it’s probably better this way… haha…

Found it interesting, though, and wanted to share and see if smarter minds than me could crack it.

Thanks again both of ya.

Friend had some minor success

but more luck than anything, I think… snake-arms appendages or something…

This time, our fascinating creature takes center stage in a bustling, ancient marketplace, surrounded by merchants and curious onlookers. The humanoid with serpentine arms is engaged in a delicate dance, each snake head moving rhythmically, enchanting the crowd. Garbed in flowing, colorful fabrics that hint at a rich cultural heritage, the creature’s performance bridges the gap between the mystical and the mundane. Lanterns and fabrics drape the scene, casting warm glows and vibrant shades, highlighting the fluid movements of the snake heads as they weave through the air, captivating all who watch.

In a lush, vibrant jungle, our unique creature navigates the underbrush. The humanoid, with its snake-arm appendages, reaches out to gently touch the surrounding flora, exploring its environment with a mix of human curiosity and serpentine grace. The snake heads, extending from where the arms would be, appear to be sensing the air, their tongues tasting the exotic scents of the jungle. This scene is alive with the sounds and colors of nature, and the creature blends in yet stands out, a guardian of the natural world, embodying the bridge between human and animal.

Is it?

Try being even more specific about the ship and the sails.

And to finish up, here’s the single ice cube, the Japanese environment, the sailboat, remixed.

Any ideas for this problem? I need a straight side view, but no matter how I word the prompt I get 3/4 views.

Any suggestions?
Also, is there a way to make sure the image won’t be cropped slightly at the sides, as often happens with “isolated” images (like the cropping on the top-right image)?


Thanks for sharing.

I got close-ish, but it does seem to be a problem.

I’ve reported to the DALLE team!

Even tried giving it a reference image, but it seems “baked in” somehow…

Thanks for the super fast response! Highly appreciated :blue_heart:

Straightforward prompt language, and disobedience…

I’ve noticed worse instruction-following recently with DALL-E 3, even when examining the rewritten prompt that actually gets sent.
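One rough way to examine the rewritten prompt is to check which of your key terms survive the rewrite. This is only a minimal sketch: `missing_terms` is a hypothetical helper, and the rewritten text in the example is made up for illustration, not real DALL-E output. It assumes you have copied the rewritten prompt out of ChatGPT by hand, or read it from the `revised_prompt` field that the image API returns for DALL-E 3.

```python
def missing_terms(key_terms: list[str], rewritten_prompt: str) -> list[str]:
    """Return the key terms that no longer appear in the rewritten prompt."""
    lowered = rewritten_prompt.lower()
    # Simple case-insensitive substring check; good enough for a quick look.
    return [term for term in key_terms if term.lower() not in lowered]


# Made-up rewritten prompt, for illustration only (not real DALL-E output).
rewritten = "A detailed illustration of a large semi truck on a highway, seen from the side."
print(missing_terms(["rectilinear", "side", "semi truck"], rewritten))
```

If a term like “rectilinear” gets dropped in the rewrite, that at least tells you the instruction was lost before the image model ever saw it, rather than ignored by the image model itself.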

There are images that are always the same composition just from the words. No more getting four cartoon+mod art+realistic+graphic variations when you send the same thing. Prompt from the DALL-E 3 announcement:

That is still not a straight side view of the truck… I can get a similar result with 7 words.

That it is not exactly what is instructed is the whole point.

Go ahead and show your work.

Here we try again, stating in the instructions exactly the perspective from which the truck is seen.

An artful image of an 18-wheeler semi tanker truck for carrying liquids is seen in this illustration, with a colorful stripe down the side of the tank of the trailer. The image of the truck is taken and seen directly from the side in first angle rectilinear view. The photographer that took this image is positioned directly alongside the trailer of the truck, viewing the broad side of the tanker trailer, so the front and the rear of the trailer are equidistant and the profile of the truck fills a rectangular box perfectly.

Specify 18 wheeler - maybe we get 18 wheels per side…

That was created with the exact prompt I constructed. Asking the AI to rewrite and improve the prompt, with the rewriting it wants to do, only makes it worse:

My point was: how much clearer or more precise can I be when the words “side” and “straight” and the direction of movement are all given…

Yep, DALL-E is incredibly obtuse at following these instructions. It used to be better…

Seven words that can mean nothing else. Done wrong.

We’ll even hand the prompt over to GPT-4 for close scrutiny:

To meticulously dissect the given DALL-E prompt “rectilinear side view of a semi truck,” we’ll delve into the key components:

  1. Rectilinear: This term suggests that the shapes and lines within the image should be straight and composed of right angles. It indicates a preference for geometric clarity and the absence of curves or distortions that might be introduced by perspective or lens effects. In the context of a semi truck, this would mean the truck’s features—such as its cargo container, wheels, and cab—should be depicted with clear, straight edges and flat surfaces.
  2. Side View: This specification dictates the perspective from which the semi truck should be depicted. A side view means the image should present the truck as seen from one side, showcasing its length and height but not its width. This perspective would display one side of the truck’s wheels, the side panels of the cargo container, and the side of the cab. The side view also implies that the front and back of the truck would not be directly visible, though the front or rear might slightly project depending on the exact angle.
  3. Semi Truck: A semi truck, often referred to as a semi-trailer truck, is a large vehicle used for transporting goods. It typically consists of a towing engine, known as a tractor, and one or more semi-trailers to carry freight. The key features to depict in a semi truck include the large cargo container or trailer, the cab where the driver sits, and multiple sets of large, heavy-duty wheels. Additional details might include side mirrors, possibly the hint of a front grille, and various lines or markings on the container that indicate doors, panels, or branding.

Given these elements, the expected resulting imagery from this prompt would be a highly structured, geometrically precise representation of a semi truck viewed from the side. The truck would be elongated, showcasing its cargo-holding capacity, with a clear distinction between the tractor and the trailer components. The details would be rendered in a way that emphasizes straight lines and right angles, avoiding any form of curve or distortion that could suggest movement or perspective depth. The wheels would appear as perfect circles or ovals (depending on the precise angle of the side view), and features like the door of the cab, windows, and the outlines of the trailer’s panels would be depicted with sharp, clean lines.

DALL-E 3 just can’t help but be a fxxx-up…

I have recently discovered that ChatGPT sets the size parameter separately, regardless of what is requested in the DALL-E prompt. I have had success instructing the GPT to “set the size parameter” when I need to.
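For anyone calling the API directly instead of going through ChatGPT, the same idea can be sketched roughly as below. This is a minimal sketch, not something confirmed in this thread: `build_image_request` and `ALLOWED_SIZES` are hypothetical names, and the three sizes are the ones I understand the DALL-E 3 image endpoint to accept. The point is only that size travels as its own parameter, so wording in the prompt text does not change it.

```python
# Hypothetical helper: keeps size as an explicit request parameter,
# separate from the prompt text.
ALLOWED_SIZES = {"1024x1024", "1792x1024", "1024x1792"}  # assumed DALL-E 3 sizes


def build_image_request(prompt: str, size: str = "1024x1024") -> dict:
    """Return keyword arguments for an image request with an explicit size."""
    if size not in ALLOWED_SIZES:
        raise ValueError(
            f"unsupported size {size!r}; expected one of {sorted(ALLOWED_SIZES)}"
        )
    return {"model": "dall-e-3", "prompt": prompt, "size": size, "n": 1}


request = build_image_request(
    "rectilinear side view of a semi truck",
    size="1792x1024",  # wide canvas for a long vehicle
)
print(request["size"])
```

With the official `openai` Python package, these keyword arguments would be passed along the lines of `client.images.generate(**request)`; that call is left out here so the snippet runs without an API key.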