I am calling the DALL-E 3 API through this code:
And this is my prompt:
A female social carer, face seen, stands in a marketplace, hands working dough. Stalls around her display global ingredients. Behind, a cobblestone path leads to a luminous bridge with volunteers holding instruments, silhouetted against the sunrise. To her right, a trail enters a forest; to her left, an open square resonates with music. An ancient library looms in the background. The scene blends vibrant colors with golden dawn light, capturing a world of cultural unity in a realistic style.
And this is the result:
Even allowing for the fact that the prompt was revised:
revised_prompt=‘An Hispanic female social carer is seen standing in the bustling marketplace, her hands skillfully working a piece of dough. Stalls around her display an array of global ingredients, a testament to culinary diversity. In the background, a charming cobblestone path leads to a glowing bridge where silhouetted figures, seen as volunteers, hold myriad of instruments against the canvas of a sunrise. To her right, a trail enticingly beckons towards a lush forest while on her left, an open cobblestone town square echoes with resonant music. Further in the backdrop, an ancient, grand library towers, adding a historical touch. The scene splendidly weaves vibrant hues with the golden light of dawn, encapsulating the essence of cultural unity in the realism art style.’
The quality is far below what I expected. For example, if I feed the same revised prompt to Microsoft Image Creator (which I guess does some prompt preprocessing of its own but is still based on DALL-E 3), I get:
a much better result with a unified style and better aesthetics.
Is this expected? I’m wondering if there’s a bug or a misconfiguration that causes the model to fall back to DALL-E 2. I’ve tried updating the openai Python library just in case, but it didn’t help.
Here’s another example.
revised_prompt=“A Japanese male designer, his face visible, stands in a designer’s atrium, brush poised over a parchment adorned with sketches of fantastical creatures. The light trickles in from a crystal dome above, casting prismatic hues that make the scene vibrant and lively. In the background, there are houses shaped like action figures, independent game arcades and a cobblestone path leading to a quaint tea shop. The hills beyond are swathed in multicolored foliage, ading to the imaginative landscape. The style of the image imparts a sense of dynamism and vigor to the scene.”
This happens consistently, with every prompt I try.
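A minimal sketch for ruling out the DALL-E 2 fallback mentioned above, assuming the openai Python library v1.x and its `client.images.generate` call. The helper names `build_image_request` and `generate` are hypothetical and only for illustration; the `model`, `quality`, `style`, and `size` values are the ones I understand the Images API to accept for DALL-E 3. As far as I know, the library defaults to `dall-e-2` when `model` is omitted, so pinning it explicitly removes that possibility, and `quality="hd"` is worth comparing against the default `standard`.

```python
def build_image_request(prompt: str, quality: str = "hd",
                        style: str = "vivid", size: str = "1024x1024") -> dict:
    """Build keyword arguments for client.images.generate with the model pinned.

    Hypothetical helper: it only validates and assembles parameters,
    so it can be checked without making a network call.
    """
    allowed_sizes = {"1024x1024", "1792x1024", "1024x1792"}  # sizes accepted by DALL-E 3
    if size not in allowed_sizes:
        raise ValueError(f"size must be one of {sorted(allowed_sizes)}")
    if quality not in {"standard", "hd"}:
        raise ValueError("quality must be 'standard' or 'hd'")
    if style not in {"vivid", "natural"}:
        raise ValueError("style must be 'vivid' or 'natural'")
    return {
        "model": "dall-e-3",  # pinned explicitly instead of relying on the default
        "prompt": prompt,
        "n": 1,               # DALL-E 3 generates one image per request
        "size": size,
        "quality": quality,
        "style": style,
    }


def generate(prompt: str) -> None:
    """Send the request and print what the API reports back."""
    from openai import OpenAI  # imported lazily; assumes openai>=1.0 is installed

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.images.generate(**build_image_request(prompt))
    image = response.data[0]
    print("revised_prompt:", image.revised_prompt)
    print("url:", image.url)


# Example (requires a valid API key and network access):
# generate("A female social carer, face seen, stands in a marketplace...")
```

If the response carries a `revised_prompt`, that already suggests DALL-E 3 handled the request, since prompt revision is a DALL-E 3 feature. In that case the gap is more likely down to the `quality` and `style` settings or to preprocessing on Microsoft's side than to a model fallback.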