Dalle3 prompt to generate pencil sketches keeps including pencils in image

Hello, any tips on how to get Dalle3 not to include pencils in the output image? I have a simple prompt like so:

Create a black and white hand-drawn image with a No. 2 pencil effect, featuring a futuristic car.

Thanks in advance for any help.

1 Like

This is part of the problem, I think. We know what it means, but sometimes the model takes it literally instead of the style of sketch…

Try using other words. It still pops up occasionally for me as my prompts are auto-generated, but you can work around the limitation.

Related:

We’ve got a wealth of dalle3 threads here, including an AMA from the DALLE team (with answers…)

1 Like

Sometimes “natural-language” can just get in the way and it’s better to respect the model’s limitations and speak to it in its own language as best you can.

Using the Dall-E tool, please generate an image in accordance with this JSON-formatted specification. 

```json
{
  "subject": "futuristic car",
  "style": [
    "hand-drawn",
    "no. 2 pencil",
    "greyscale"
  ]
}
```

Result,

7 Likes

I was able to get around the problem by doing this: Hand-drawn graphite illustrations. Thanks everyone for the tips

1 Like

Is the style param in AzureOpenAI Dalle3 api?

Lol, it’s not a thing at all. It’s just something I used to ensure ChatGPT understood the difference between subject and style.

This is the prompt ChatGPT wound up sending to the Dall-E 3 model,

A futuristic car depicted in a hand-drawn style using a no. 2 pencil, presented in greyscale. The image should capture the sleek and advanced design of the vehicle, highlighting its innovative features and the smooth, aerodynamic shapes that define its silhouette.
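For anyone generating these messages from a script, here is a minimal Python sketch of the same subject/style split. The function name `build_spec_message` is made up for illustration, and `subject` and `style` are just JSON keys in the chat message, not API parameters. It only builds the text you would paste or send to ChatGPT.

```python
import json


def build_spec_message(subject, styles):
    """Wrap a subject/style spec in JSON inside a plain chat instruction.

    The keys "subject" and "style" are a convention for the chat model,
    not parameters of any image API.
    """
    spec = {"subject": subject, "style": list(styles)}
    return (
        "Using the Dall-E tool, please generate an image in accordance "
        "with this JSON-formatted specification.\n\n"
        + json.dumps(spec, indent=2)
    )


message = build_spec_message(
    "futuristic car", ["hand-drawn", "no. 2 pencil", "greyscale"]
)
print(message)
```

Keeping the style terms in their own list makes it easy to swap one out (for example, replacing "no. 2 pencil" with "graphite") without touching the subject.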

2 Likes

Looks like “hand-drawn style” works better than “hand-drawn image”.

And yeah, you can force prompts too, but you’re still under the Terms of Service. I occasionally still get blasted by moderation for innocent things.

You hit on something I have written about here before,

Using specific and accurate art terms tends to improve performance.

3 Likes

DALLE2exp was a lot better at fine control. Take, for example, the difference between:

“pencil drawing”
“pencil sketch”
“rough sketch”
“napkin doodle”

DALLE3 tends to lean toward the “best” unless you guide it.
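If you want to compare how those terms behave, one rough approach is to generate one prompt per term and run them side by side. This is only a sketch: `style_variants` is a hypothetical helper, and the "in the style of" wording is my assumption, not something the model requires.

```python
SKETCH_TERMS = ["pencil drawing", "pencil sketch", "rough sketch", "napkin doodle"]


def style_variants(subject, terms=SKETCH_TERMS):
    """Return one prompt per style term so the results can be compared."""
    prompts = []
    for term in terms:
        text = f"{subject}, in the style of a {term}."
        # Capitalize only the first character; leave the rest untouched.
        prompts.append(text[0].upper() + text[1:])
    return prompts


variants = style_variants("a futuristic car")
for prompt in variants:
    print(prompt)
```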

One of the problems I’ve seen with a lot of people new to the tech is that they’ll just pop in “picture of a house” and expect the model to read their mind on the rest. DALLE3 is better at this as it rewrites the prompt and fleshes it out, BUT it can lead to unforeseen/unwanted results.

So, I try to recommend being as detailed in the prompt as you can be. The prompt length limit is quite large in the API. I don’t remember the exact number of characters offhand, but it was quite high.

1 Like

Sounds like you’ve solved this already, but I might add that, in my experience, the AI will try to convey any concept you put in the prompt. With “pencil sketch”, it acts as if you want the image itself to show that it’s a pencil drawing, and the most direct way to do that is to show a pencil in the image.

In general, the model will normally attempt to include everything that’s mentioned, so if you mention a pencil it will try to include a pencil. But if you say “pencil-styled sketch” it knows you mean stylistically and not a literal pencil.
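For auto-generated prompts, a crude check along these lines can catch tool words that are not phrased as a style before the prompt goes out. This is a rough heuristic under my own assumptions: the word lists and the name `literal_tool_mentions` are made up for illustration, and it says nothing about how the model actually parses a prompt.

```python
import re

# Hypothetical word lists, chosen only for illustration.
TOOL_WORDS = ("pencil", "pen", "brush", "crayon")
STYLE_CUES = ("style", "styled")


def literal_tool_mentions(prompt):
    """Return tool words that are not immediately followed by a style cue."""
    words = re.findall(r"[a-z]+", prompt.lower())
    flagged = []
    for i, word in enumerate(words):
        if word in TOOL_WORDS:
            next_word = words[i + 1] if i + 1 < len(words) else ""
            if next_word not in STYLE_CUES:
                flagged.append(word)
    return flagged


original = (
    "Create a black and white hand-drawn image with a No. 2 pencil effect, "
    "featuring a futuristic car."
)
print(literal_tool_mentions(original))
print(literal_tool_mentions("A pencil-styled sketch of a futuristic car."))
print(literal_tool_mentions("Hand-drawn graphite illustration of a futuristic car."))
```

The first prompt gets flagged for "pencil"; the other two pass, which matches what people reported working in this thread.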

3 Likes

Guys you’ve been a tremendous help, thanks again.

I have another question; how do you prevent Dalle3 from distorting faces? It seems like it’s fine 2 out of 5 times using the same prompt.

1 Like

If you concentrate on one or two people it usually does better (or if you include details), but it’s still rough at times. A lot better than DALLE2. Some of those generations still give me nightmares! :wink:

Seriously, though, you can play with the prompting, but you’ll still get misses…

1 Like

I’m just making a wild guess, but one term I’ve seen used in prompts is “photorealistic”, though it may not work for your particular issue. I’ve noticed faces in crowds often get quite distorted, and I agree with PaulBellow that the fewer people there are in an image, the better their faces look.

2 Likes

And if you don’t supply numbers (two people, one person, one cake, etc) it will sometimes try to fill the entire image… Again, it comes down to being specific in your original prompt and hoping edits/inserts don’t mess it up too much.
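If your prompts are built in code, spelling out the counts can be as simple as the sketch below. `counted_prompt` and the number-word table are hypothetical, and the caller supplies the already-pluralized phrase, so nothing here guesses at plurals.

```python
NUMBER_WORDS = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five"}


def counted_prompt(items, setting):
    """Build a prompt that states an explicit count for every subject.

    items: list of (count, phrase) pairs, e.g. (2, "people").
    setting: short phrase describing where the subjects are.
    """
    counted = [f"{NUMBER_WORDS.get(n, str(n))} {phrase}" for n, phrase in items]
    text = " and ".join(counted) + f" {setting}."
    # Capitalize only the first character; leave the rest untouched.
    return text[0].upper() + text[1:]


prompt = counted_prompt([(2, "people"), (1, "cake")], "at a kitchen table")
print(prompt)
```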

I still occasionally get a South Asian Half-Orc or similar lol

1 Like

Yeah, one way to reduce the size of a person’s face in an image is to specify other things that you want in the image, like objects in their surroundings. Since the model has to fit those other things into the scene, it will necessarily shrink the face to smaller than it might otherwise have been. So yeah, providing lots of details is key to getting good images.

2 Likes

So are you saying that I have to use the API and make a JSON request to get this to work?

I’m not saying that at all. I mentioned I’m using the API. Replace no. 2 pencil with the below to see if that helps.

1 Like

@michaelruddock It works just fine in the ChatGPT interface, you might have to tweak it around a bit though

3 Likes

@ideplo OK thank you, understood.
@trenton.dambrowitz thanks very much. :slight_smile:

2 Likes

It is also down to turns of phrasing:

2 Likes