Weird safety system catch when generating image

Overview: I’m using GPT to generate a visual summary of what a player sees in a scene in a game.

The output is as follows: Rikon stands at the entrance of the Hall of Clockwork Wonders, a complex of towering spires and intricate clockwork mechanisms. The hall is a marvel of steampunk technology, with a variety of exhibits, from automatons to flying machines to steam-powered robots. He can see an automaton standing at 2m tall, a steam-powered robot standing at 1.5m tall, a flying machine at 2m tall, a clockwork table at 5m tall, and a steam-powered generator at 4m tall. Further in the hall, he can make out a clockwork clock at 5m tall and a magical artifact and magical mirror at 1m tall. To the side, Rikon can see an automaton workshop at 4m tall and a steam-powered elevator at 3m tall. He takes a deep breath as he prepares to enter the warehouse and retrieve the artifacts.

Problem: For some reason, even though it was GPT that generated the output, the image model rejects it, saying that the text may not be allowed.

Workaround: I do have a workaround that may be okay, though it’s more expensive. I created a loop that catches this error, and when it occurs, the description is fed into a new prompt that asks GPT to rephrase it. That seems to work after a few tries, most of the time anyway. But it’s still weird. What exactly is being caught?!
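
For anyone who wants to try the same thing, here is a minimal sketch of that retry loop. It is not tied to any real client library: `PromptRejected`, `generate_image`, and `rephrase` are hypothetical stand-ins that you would wire up to your own image call, your own GPT rephrase prompt, and whatever error your client actually raises on a rejection.

```python
from typing import Callable, Tuple


class PromptRejected(Exception):
    """Hypothetical error standing in for the image API's safety rejection."""


def generate_with_rephrase(
    description: str,
    generate_image: Callable[[str], str],
    rephrase: Callable[[str], str],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Try to generate an image, rephrasing the prompt after each rejection.

    Returns (result, attempts_used). Re-raises the last rejection if every
    attempt fails, so the caller can decide what to do next.
    """
    prompt = description
    for attempt in range(1, max_attempts + 1):
        try:
            return generate_image(prompt), attempt
        except PromptRejected:
            if attempt == max_attempts:
                raise
            # Each rephrase costs one extra completion.
            prompt = rephrase(prompt)
    raise ValueError("max_attempts must be at least 1")
```

Capping the attempts keeps the cost bounded, since every retry adds one completion on top of the image request.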

Hello there,

My understanding of the Image Model API is that it has a 400-character limit for the prompt, and your text is over 700 characters long.

So indeed, asking GPT to summarize it in fewer than 400 characters could help you generate a prompt that fits the image model, although it gets expensive (2 completions + 1 image). I have asked about this in another thread, since I would also like to illustrate stories generated by GPT-3 that are too long to serve as image prompts…
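
If the extra completion is too expensive, a cheaper (and cruder) option is to trim the description locally before sending it. The sketch below keeps whole leading sentences that fit within a limit. `trim_to_limit` is a hypothetical helper, and the limit is left as a required parameter because the exact figure should be checked against the API documentation rather than taken from my memory.

```python
import re


def trim_to_limit(text: str, limit: int) -> str:
    """Keep whole leading sentences while the result stays within `limit` characters."""
    if len(text) <= limit:
        return text
    # Split after sentence-ending punctuation followed by whitespace,
    # so decimals such as "1.5m" are not treated as sentence ends.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept = ""
    for sentence in sentences:
        candidate = sentence if not kept else kept + " " + sentence
        if len(candidate) > limit:
            break
        kept = candidate
    # Fall back to a hard cut if even the first sentence is too long.
    return kept if kept else text[:limit]
```

This drops the later sentences entirely rather than condensing them, so it only makes sense when the opening of the description carries the scene.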

The image generation model allows for up to 1,000 characters.
