Wrestling with funny hats and marching bands approaching

I had a long chat with ChatGPT-4o this a.m., trying to realize this prompt:

“Widescreen image, 1792x1024. Style is loose watercolor, filling the entire canvas from edge to edge, akin to watercolor paintings of the great 19th century book illustrators. A marching band is passing by in a small rural town in Siskiyou County, CA. The date is June 1st, 1894. The time is 7:30 a.m. Crowds line the sidewalks. The band is marching directly towards the camera, so that we can see their faces as they march. The point of view is that of an 8-year-old child sitting on his father’s shoulders, looking at the band as they come down the street towards him. All members of the band are normal, except that each has a very special and unique hat. These hats are three-dimensional, sculpted, or modeled versions of the ancient Greek alphabet letterforms. Those letterforms are: Α, α, Β, β, Γ, γ, Δ, δ, Ε, ε, Ζ, ζ, Η, η, Θ, θ, Ι, ι, Κ, κ, Λ ,λ, Μ, μ, Ν, ν, Ξ, ξ, Ο, ο, Π, π, Ρ, ρ, Σ, σ, ς, Τ, τ, Υ, υ, Φ, φ, Χ, χ, Ψ, ψ, Ω, ω. The band members are dressed in colorful uniforms, and the scene is lively and whimsical.”

The prompt was beyond that tool’s current capabilities. Specifically: it never provided the desired hats, and in 19 out of 20 images the band was marching AWAY from the camera, rather than towards it. In about 1/3 of the images, the result was square, inside a widescreen frame.

However: I still had fun. And the images are quite beautiful. The evocation of the small town at that time is quite excellent. Here are two of my favorite results.

Any suggestions for amending the prompt to get the result I’m seeking?


– stan

A few things.

  1. Go through this prompt and cut out everything that isn’t absolutely needed.
  2. Put the most important bits up front.
  3. As an exercise in learning and debugging, focus on individual elements separately—ensure you can generate something in the vein of what you want for each element before trying to combine them.
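If it helps, the three steps above can be sketched as a small Python helper. This is only a rough illustration, not a tested recipe: the names `Element`, `isolated_prompts`, and `combined_prompt` are made up for this example, the element wording is my own trimmed paraphrase of the original prompt, and the priority order is a guess at what matters most. It only builds prompt strings; it does not call any image model.

```python
from dataclasses import dataclass


@dataclass
class Element:
    text: str      # one trimmed idea from the prompt (hypothetical wording)
    priority: int  # lower number = more important (assumed ordering)


# Trimmed elements, paraphrased from the original prompt.
ELEMENTS = [
    Element("Loose watercolor, widescreen, filling the canvas edge to edge", 4),
    Element("A marching band marches directly towards the viewer, faces visible", 1),
    Element("Small rural town in Siskiyou County, CA, in 1894", 3),
    Element("Each band member wears a hat sculpted in the shape of a Greek letter", 2),
]


def isolated_prompts(elements):
    """Return one short prompt per element, so each idea can be tested alone."""
    return [e.text + "." for e in elements]


def combined_prompt(elements, keep=None):
    """Join elements with the most important first.

    If `keep` is given, only the `keep` most important elements are used,
    which makes it easy to add ideas back one at a time.
    """
    ordered = sorted(elements, key=lambda e: e.priority)
    if keep is not None:
        ordered = ordered[:keep]
    return ". ".join(e.text for e in ordered) + "."


if __name__ == "__main__":
    for p in isolated_prompts(ELEMENTS):
        print("TEST ALONE:", p)
    print("COMBINED (top 2):", combined_prompt(ELEMENTS, keep=2))
    print("COMBINED (all):", combined_prompt(ELEMENTS))
```

The idea is to try each isolated prompt first, then raise `keep` one step at a time and note where the results start to drift from what you want.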

Even then, there will just be some things the model will not be able to do because it gets too hung up on certain words, phrases, or ideas.

I tried a bunch and I couldn’t get a Greek-letter-shaped hat on a marching band member at all.

Maybe try just getting a Greek-letter hat on a random person.