Does anyone know a reliable way to control the perspective of isometric images in GPT-4o and Image 1?

I sometimes use ChatGPT to generate isometric images of objects like this:

So far, I haven’t found a reliable way to control which side (left or right) of the image should appear closer to the camera/viewer.

Here is one way I have tried to control GPT-4o’s camera, but it usually re-renders the object from the same perspective as in the first prompt:

Left side closer to viewer

# left-side closer to viewer, 3/4 perspective

**Technical Specifications:**
- Camera: 3/4 view, 25mm lens
- Camera angle: 30° yaw, 20° pitch (left side closer to viewer)
- Depth of field: f/8 for full sharpness

Right side closer to viewer

# right-side closer to viewer, 3/4 perspective

**Technical Specifications:**
- Camera: 3/4 view, 25mm lens
- Camera angle: -30° yaw, 20° pitch (right side closer to viewer)
- Depth of field: f/8 for full sharpness

It looks like both the 30° and -30° yaw values are ignored.
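
One thing that might be worth trying: yaw sign conventions vary between tools, so a bare number may not tell the model which side is meant. Below is a minimal sketch that turns the angles into plain words instead. It assumes the convention used in the specs above (positive yaw = left side closer, negative yaw = right side closer); `describe_camera` is a hypothetical helper for illustration, not part of any API.

```python
def describe_camera(yaw_deg: float, pitch_deg: float) -> str:
    """Turn yaw/pitch numbers into a plain-language camera description.

    Assumption (taken from the specs above): positive yaw means the
    object's left side is closer to the viewer, negative yaw the right side.
    """
    if yaw_deg == 0:
        return (
            "Camera straight in front of the object, "
            f"raised {pitch_deg:g}° above eye level."
        )
    side = "left" if yaw_deg > 0 else "right"
    return (
        f"Camera at the front-{side} corner of the object, "
        f"rotated {abs(yaw_deg):g}° off the front face and "
        f"raised {pitch_deg:g}° above eye level; "
        f"the {side} side is closer to the viewer."
    )


# The two camera setups from the specs above, spelled out in words.
print(describe_camera(30, 20))
print(describe_camera(-30, 20))
```

The resulting sentence could replace the "Camera angle" bullet. Whether the model follows it more reliably than the raw numbers is something I have not verified.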

If anybody knows a reliable way to instruct GPT-4o (Image 1) to control which side of an isometric image should appear closer to the camera/viewer, I would appreciate it if they could share it.

Thanks in advance.

I tried the prompts below twice for each side, with different background colors.
Interestingly, when I tried them in the same chat, the view did not change, but when I tried them in a different chat, it did. Of course, the lines of the image also changed.

Prompt for RIGHT face

Generate a clean wire-frame render of a rounded-square block with a smiley-face emboss.
• Projection: true isometric
• Viewpoint: observer stands at the front-right corner
– The RIGHT face is the broad, closest face to camera.
– Only a sliver of the front face is visible; the left face is entirely hidden.
• Lighting: key-light from viewer’s position, casting a soft shadow toward the far left.
• Background: pure red


Prompt for LEFT face

Generate a clean wire-frame render of a rounded-square block with a smiley-face emboss.
• Projection: true isometric
• Viewpoint: observer stands at the front-left corner
– The LEFT face is the broad, closest face to camera.
– Only a sliver of the front face is visible; the right face is entirely hidden.
• Lighting: key-light from viewer’s position, casting a soft shadow toward the far right.
• Background: pure green
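
The two prompts differ only in a handful of mirrored words (corner, closest face, hidden face, shadow direction), so they can be generated from one template to keep everything else identical. A small sketch; `face_prompt` and `OPPOSITE` are made-up names for illustration only.

```python
OPPOSITE = {"left": "right", "right": "left"}


def face_prompt(side: str, background: str) -> str:
    """Build the wire-frame prompt for the given side ("left" or "right")."""
    other = OPPOSITE[side]  # raises KeyError for anything but "left"/"right"
    return "\n".join([
        "Generate a clean wire-frame render of a rounded-square block "
        "with a smiley-face emboss.",
        "• Projection: true isometric",
        f"• Viewpoint: observer stands at the front-{side} corner",
        f"– The {side.upper()} face is the broad, closest face to camera.",
        f"– Only a sliver of the front face is visible; "
        f"the {other} face is entirely hidden.",
        f"• Lighting: key-light from viewer’s position, "
        f"casting a soft shadow toward the far {other}.",
        f"• Background: pure {background}",
    ])


print(face_prompt("right", "red"))
print()
print(face_prompt("left", "green"))
```

Since the view only changed for me in a different chat, I would send each generated prompt in a fresh conversation.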

Hey @aggressiveGarlic. I like that nickname. :smiley:

Did you ever try to:

1st: Reverse an image like this?
That is: give ChatGPT an isometric image like yours in a new chat and ask:
What may have been the prompt for this, including the angle, camera settings, etc.?

Absolutely. Here are 3 GPT-4o image prompts that reliably control the isometric perspective by varying the camera angle and orientation. Each prompt targets a different isometric viewpoint:


:white_check_mark: Bonus Notes:

  • These prompts avoid ambiguous words like “from above” alone; instead, they specify azimuth (direction) and elevation (angle).
  • You can tweak with phrases like “make left wall more dominant” or “rotate 15° to the right” in follow-up prompts if needed. And so on.
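
To make the azimuth/elevation idea a bit more concrete, here is a rough sketch (my own illustration, not something the model uses internally) that computes which faces of a simple box face the camera and roughly what share of the view each one takes. It assumes an orthographic camera, azimuth measured from the front face toward the viewer's right (negative = toward the left), and elevation measured up from the horizontal. `visible_faces` and `NORMALS` are hypothetical names.

```python
import math

# Outward face normals of a box: +x = viewer's right, +y = toward a viewer
# standing in front of the object, +z = up. (Assumed convention.)
NORMALS = {
    "front": (0, 1, 0), "back": (0, -1, 0),
    "right": (1, 0, 0), "left": (-1, 0, 0),
    "top": (0, 0, 1),
}


def visible_faces(azimuth_deg: float, elevation_deg: float) -> dict:
    """Return each visible cube face with its share of the projected view."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    # Unit vector pointing from the object toward the camera.
    to_camera = (
        math.cos(el) * math.sin(az),
        math.cos(el) * math.cos(az),
        math.sin(el),
    )
    # A face is visible when its normal points toward the camera (dot > 0);
    # for a cube under orthographic projection the dot product is
    # proportional to the face's projected area.
    facing = {
        name: sum(n * c for n, c in zip(normal, to_camera))
        for name, normal in NORMALS.items()
    }
    visible = {name: dot for name, dot in facing.items() if dot > 1e-9}
    total = sum(visible.values())
    return {name: round(dot / total, 3) for name, dot in visible.items()}


true_iso_elevation = math.degrees(math.atan(1 / math.sqrt(2)))  # about 35.26°
print(visible_faces(45, true_iso_elevation))   # front-right corner
print(visible_faces(-45, true_iso_elevation))  # front-left corner
print(visible_faces(-45, 45))                  # steeper "45° top-down" view
```

At the true isometric elevation (about 35.26°) with a ±45° azimuth, the three visible faces each get about a third of the view. Flipping the sign of the azimuth is what swaps the right face for the left one.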

I hope this all makes sense so far.

2nd: Try to include the perspective and the camera angles?

1st try: Front-Left Isometric (Classic)

Prompt

An isometric view of a cozy coffee shop, viewed from a 45° top-down angle, with the front and left walls equally visible. The scene should look like a miniature 3D diorama, with tables, chairs, and customers inside. The roof is removed for interior visibility.

2nd try: Rear-Right Isometric (Inverse / less common)

Prompt

Isometric cutaway of a tech repair workshop, seen from a 45° top-down rear-right angle. The camera is positioned above and slightly behind the building, showing the right and back walls equally. Remove the ceiling and front wall for interior visibility. Include shelves, tools, and workbenches.

3rd try: Shallow Isometric (Lower elevation = more dramatic)

A futuristic isometric lab rendered at a low top-down angle of 30°, facing the front-right corner. The viewpoint should feel closer to eye level, showing more of the vertical height of objects and less of the floor. Include holograms, glowing equipment, and scientists working inside.

I hope that helps.

I’d suggest you could even ask o3 for these (I created mine by trying them in 4o). It may also be able to help you automatically, either by deriving the perspective from a given picture or by adding the angle descriptions for you. Then you should probably be able to use manually phrased technical language, or even plain everyday language, and have it turned into a prompt that makes sense and makes the picture come out to your liking.

P.S.: Please keep in mind that I did NOT iterate on these prompts to make the images come out correctly as consistently as possible. This is just to give you a start.

After all, pics 1 and 2 already look like basically the same perspective to me, I guess?

And maybe using the same subject each time would have been a better idea. :smiley:

In prompt 1, isn’t it doing the opposite!?

Instead of a cafe, can you try with a “monolith that has a smiley engraved on it”?

Is that what you need?

I only tried once; I guess the eyes may get better.