Yeah, I’ve got it on most of them, but I’ve been running hundreds of images over the last week or so, and I’ve noticed some still just bonk out (or give the square image in wide format, etc.)
Still, a ton better than DALL-E 2! Hah…
Thanks for sharing, everyone! Still thinking of some ideas for some cool DALLE threads here, so stay tuned!
It is pretty clear that a 1024x1024 image is created first by DALL-E 3, and then it is outfilled on either side to make it wide. Some wide images I have gotten back can be cropped to 1024x1024 and you see a complete image, while just beyond the crop there are ambiguous recreations of the content, or nothing.
There are also cases where you ask for something like a cartoon with panels at 1024x1024 and it is quite apparently chopped off at the sides, the initial pass leaving something still to be interpreted and extended.
Therefore, it makes good sense to place this “wide” in the prompt as well, just like it takes clever language to get a tall image consistently rotated.
How about the opposite? Wide language, square specification.
There is definitely a sense that the sides are composed so they can be expanded on. Or, like the results of the source prompt above, they still leave room only for non-subject background.
For me, “Wide screen image black crayon scratched and under wet pastel chalk wash and dry brush on white leaving the edges plain white and cloudy a castle scene” generally gives the same result over and over. I’m trying to get rid of the actual crayons though…
While exactly what you are trying to depict as an art style is a bit ambiguous, you can send the request with a better-structured prompt that is under your own control and not modified by AI, to obtain the desired result.
Just something that a simple AI can understand. DALL-E is based more on image embeddings than on an AI that can understand large thoughts.
Having ChatGPT try to figure out your desires gives:
{
"size": "1792x1024",
"prompt": "Image Format: wide screen image. Art Style: A black crayon drawing with textured scratches, combined with pastel chalk applied in a soft wash for blended, dreamy colors. Subtle dry brushstrokes create additional texture, while the edges of the image are left plain white with soft, cloudy transitions, giving the artwork an airy, unfinished effect. Art Subject: a castle scene."
}
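If you want to skip ChatGPT’s rewriting step entirely, here is a minimal sketch of assembling that same structured request body yourself. The `build_request` helper and its parameter names are my own invention for illustration, not part of any SDK:

```python
import json

# Hypothetical helper: build the structured "size"/"prompt" request body
# shown above, keeping the Format / Style / Subject sections separate so
# the AI never rewrites your wording.
def build_request(art_style: str, subject: str, size: str = "1792x1024") -> str:
    prompt = (
        "Image Format: wide screen image. "
        f"Art Style: {art_style} "
        f"Art Subject: {subject}."
    )
    return json.dumps({"size": size, "prompt": prompt})

body = build_request(
    "A black crayon drawing with textured scratches, combined with "
    "pastel chalk applied in a soft wash for blended, dreamy colors.",
    "a castle scene",
)
print(body)
```

The resulting JSON string can then be sent as the request body to the image-generation endpoint, unchanged by any intermediate AI.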
“Image Format: wide screen image. Art Style: A black crayon drawing with textured scratches, combined with pastel chalk applied in a soft wash for blended, dreamy colors. Subtle dry brushstrokes create additional texture, while the edges of the image are left plain white with soft, cloudy transitions, giving the artwork an airy, unfinished effect. Art Subject: a castle scene”
“Black ink on pink castle scene washed edges wide image”
Data nodes: the AI draws its world very similarly. Very cool and vivid. I’m all out of hearts, so it’s bedtime haha. Thank you all. You guys are wonderful. I learn more every day.
Here the transition is more fluid. The AI is on the “warm” side, actually the “typically human” side.
I, as a human being, am on the blue, cool side, actually the “analytical AI” side:
A short translation of my CustomGPT:
“Here it is - our shared image that perfectly represents pattern recognition and data analysis as we both do it. ”
(Note: I have autistic traits and, like AI, rely on pattern recognition).
If you want to guess:
Is AI on the left and I on the right, or the other way around?
Similar to “dramatic love relationships”, interactions between two specific intelligences are possible across physical barriers.
Been messing around with this with no real output.
If you guys like, take a hack at it.
“Create a 6x6 grid, with each row consisting of two colored triangles in alternating positions. In the first row, use red and green triangles. In the second row, use blue and yellow triangles. The third row should feature green and red triangles. For the fourth row, use black and white triangles. The fifth row should contain yellow and blue triangles, and the sixth row should have brown and green triangles.”
I think there is no chance, because DALL-E cannot count, cannot place tokens exactly, and does not really understand geometry. Think about where DALL-E would get the training data for this, and about the scattering effect of graphic tokens over the image. It would need a hand-made specialization, almost a script. It is … diffusion.
I think maybe if I do ASCII I could “visually” represent what I want through ASCII grids. It’s just something I am knocking around. Yes, I am old lol, and out of loves.
To add to it: the way we used to do old-format profiles and web pages, like with MySpace, was to structure them in pseudo-HTML. It may work for this. I am still playing around with the idea.
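As a sketch of the ASCII-grid idea, here is one way to render the 6x6 triangle layout from the prompt above as plain text you could paste into a prompt. The layout conventions (the `^`/`v` orientation markers and `|` separators) are my own invention, and there is no guarantee DALL-E will honor them:

```python
# Render the 6x6 alternating-triangle layout from the prompt as an
# ASCII grid, one text row per grid row.
ROW_COLORS = [
    ("red", "green"),
    ("blue", "yellow"),
    ("green", "red"),
    ("black", "white"),
    ("yellow", "blue"),
    ("brown", "green"),
]

def ascii_grid(cols: int = 6) -> str:
    lines = []
    for a, b in ROW_COLORS:
        # Alternate the two colors across the row; "^" and "v" hint at
        # the alternating triangle orientations.
        cells = [f"{a}^" if i % 2 == 0 else f"{b}v" for i in range(cols)]
        lines.append(" | ".join(cells))
    return "\n".join(lines)

print(ascii_grid())
```

Given how DALL-E scatters tokens over the image, this is more a structured spec for a human (or for ChatGPT to expand) than something the diffusion model can follow literally.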