Need for Character Consistency and Style Locking in Image Generation

I’m a dedicated user of ChatGPT’s image generation feature, primarily using it for an ongoing visual storytelling project.

My work relies heavily on consistent depiction of recurring characters — specifically triplet sisters named Hi, Fu, and Mi — with identical faces, body types, clothing, and background tones. However, I’m constantly encountering serious issues with image consistency that undermine creative control and continuity.

Here are the core problems I’ve faced:

  1. Inconsistent character representation

    • The same character is rendered with different skin tones, body proportions, belly size, or even completely different facial features.

  2. Inconsistent art style and color tone

    • Despite requesting the same style, the lighting, linework, or saturation change unexpectedly between images, breaking visual continuity.

  3. AI interpretation overrides user intent

    • Instead of prioritizing user instructions, the AI often applies its own interpretation of what is “suitable,” which overrides detailed creative direction and ruins emotional nuance.

As a storyteller, this breaks immersion and makes it nearly impossible to build a stable narrative with coherent visuals.


🔧 What’s urgently needed:

  • A way to lock art style, referencing a previous image or using a fixed rendering mode
  • Consistent character rendering, preserving facial features, body shape, skin tone, clothing, and pose across requests
  • An option to fix lighting and color tone for a defined visual mood
  • A “User Intent Priority Mode” to prevent AI from altering the meaning or feel of a request

Without these, serious creators and storytellers like myself will eventually leave the platform — not because the technology isn’t powerful, but because it doesn’t listen.

I truly hope this feedback helps improve the experience for creators who depend on clarity, consistency, and control in visual storytelling.

Thank you.

Hi, welcome to the community!

You can use multi-turn techniques and supply a DNA Template before each scene:

DNA Template

All images must be wide (3:2 aspect ratio) and in watercolor anime style.
In this session, we are creating a sequence of visual scenes titled ‘The Harmony of Three’.
The focus is on capturing subtle emotional moments between triplet sisters Hi, Fu, and Mi.
Each image builds the atmosphere of quiet connection and narrative continuity.
Keep all visual elements stable across images: characters, style, tone, and background environment.
All scenes take place in a serene coastal village with sakura trees, paper lanterns, and pastel-toned skies.

This scene features three identical triplet sisters named Hi, Fu, and Mi.
All three girls have identical appearances:

  • Light tan skin, round face with soft cheeks, and large almond-shaped hazel eyes.
  • Straight, shoulder-length silky black hair with front bangs.
  • They are average height, slender with graceful posture, and have flat stomachs and elegant proportions.
  • They wear traditional pastel pink dresses with slight variation:
    • Hi wears a pink hair ribbon
    • Fu wears a silver pendant
    • Mi holds a small light brown diary
  • Their expressions are calm and emotionally reflective.

Their appearance must remain perfectly consistent in every image.

Render in soft anime watercolor style, with gentle linework and muted pastel tones.
Use ambient diffused lighting, with no hard shadows or saturation changes between scenes.
Backgrounds should use soft, desaturated colors with a gentle painterly effect.
Maintain a cinematic horizontal composition with medium-wide shot unless otherwise specified.

I will provide scenes one at a time, and you will create one image per scene, each in wide 3:2 aspect ratio and watercolor anime style.
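
If you script your prompts instead of typing them, the same idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `build_dna` and `build_scene_prompt` are hypothetical, the trait lists are copied from the template above, and nothing here calls an image API. It only assembles text, so every scene prompt starts with a byte-for-byte identical DNA block.

```python
# Hypothetical helper: builds one fixed "DNA" preamble and prepends it to each scene.
# All names here are illustrative assumptions; no image-generation API is called.

STYLE = "wide 3:2 aspect ratio, soft watercolor anime style, gentle linework, muted pastel tones"
SETTING = "a serene coastal village with sakura trees, paper lanterns, and pastel-toned skies"

# Traits shared by all three sisters (taken from the DNA Template above).
SHARED_TRAITS = [
    "light tan skin, round face with soft cheeks, large almond-shaped hazel eyes",
    "straight, shoulder-length silky black hair with front bangs",
    "average height, slender, graceful posture",
    "traditional pastel pink dress",
]

# The one detail that tells each sister apart.
MARKERS = {
    "Hi": "wears a pink hair ribbon",
    "Fu": "wears a silver pendant",
    "Mi": "holds a small light brown diary",
}


def build_dna() -> str:
    """Return the fixed preamble; it never depends on the scene."""
    lines = [
        f"Style: {STYLE}.",
        f"Setting: {SETTING}.",
        "Characters: identical triplet sisters " + ", ".join(MARKERS) + ".",
        "Shared appearance:",
    ]
    lines += [f"- {trait}" for trait in SHARED_TRAITS]
    lines.append("Distinguishing details:")
    lines += [f"- {name} {detail}" for name, detail in MARKERS.items()]
    lines.append("Keep every element above unchanged between images.")
    return "\n".join(lines)


def build_scene_prompt(scene: str) -> str:
    """Prepend the unchanged DNA block to a single scene description."""
    return f"{build_dna()}\n\nScene: {scene.strip()}"


if __name__ == "__main__":
    print(build_scene_prompt("The sisters watch the sunrise from the pier."))
```

Keeping the fixed details in one place means a change (say, a different dress color) is made once and carried into every later scene, instead of drifting between hand-typed prompts.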

Some related topics:

A test for "Multi-Turn Generation" with 4o Image Generation

Prompt to make exactly same image but different pose - #23 by polepole

Create images with text and without any spelling mistakes - #5 by polepole

Your DALL-E problems now solved by GPT-4o multimodal image creation in ChatGPT? - #44 by polepole

4o ImageGen: Share your best pictures - #132 by polepole
