When using DALL-E via the API, it’s possible to disable this overly fake, airbrushed “vivid style” and use the “natural style” that works so much better for things like animals. However, the API is paid per image, and that gets expensive quickly.
Is there any way to use natural style, say, via Bing Image Creator?
The vivid style is mainly a matter of prompt language and what that language triggers. In Bing Image Creator:
Exact prompt: Style: natural realistic digital photo by telephoto lens on Nikon D50 DSLR camera. Subject: African grassland steppe with a lioness crouching while her lion cubs are playing and adventuring around her.
Use exact prompt: "Style: natural realistic digital photograph using Nikon digital camera and 300mm telephoto lens, taken during golden hour. Subject: African grassland steppe with detailed blades of grass in the foreground hide the paws of a lioness (female lion) who is crouching and looking over the savanna, while her lion cubs are playing and adventuring around her. This pro photo reveals sharp details of fur and faces, with a bokeh background of bush trees and small hills."
(and you thought it had problems with the same human faces?)
verify.
(token)assistant(token) to=dalle.text2im (token){
"prompt": "Photo Subject: Lioness and cubs. African grassland steppe with detailed blades of grass in the foreground hide the paws of a lioness (female lion) who is crouching and looking over the savanna, while her lion cubs are playing and adventuring around her. This pro photo reveals sharp details of fur and faces, with a bokeh background of bush trees and small hills. Style: natural realistic digital photograph using Nikon digital camera and 300mm telephoto lens, taken during golden hour.",
"size": "1024x1024",
"n": 1
}(token)
Hi, I appreciate your reply. Unfortunately this does not achieve the natural style at all. I also do not understand your “verify” statement. Is it supposed to be verifying that it is using the natural style? I do not believe that to be the case.
Here is an example of natural style with the API that does not seem to be replicable with Bing (image hosted on Imgur):
Prompt: wide shot full body photo of a flat bodied lizard on the rocks
The API has another value you can pull out of the response: the revised prompt (revised_prompt). That text is how your short image prompt was rewritten by AI to be much longer. You can observe the degree to which the language of the prompt, rewritten from the AI’s imagination to be longer and more descriptive, is then colored by your API choice of natural or vivid.
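For anyone who wants to look at the revised prompt themselves, here is a minimal sketch using the official openai Python package (v1-style client). The model name "dall-e-3", the OPENAI_API_KEY environment variable, and the helper names build_request and generate are assumptions for illustration, not something stated earlier in this thread.

```python
import os


def build_request(prompt, style="natural"):
    """Assemble keyword arguments for one image request (illustrative helper)."""
    if style not in ("natural", "vivid"):
        raise ValueError("style must be 'natural' or 'vivid'")
    return {
        "model": "dall-e-3",  # assumption: the thread only says "dalle"
        "prompt": prompt,
        "size": "1792x1024",
        "quality": "hd",
        "style": style,
        "n": 1,
    }


def generate(prompt, style="natural"):
    """Send one request and return (image_url, revised_prompt)."""
    # Imported lazily so build_request can be used without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.images.generate(**build_request(prompt, style))
    image = response.data[0]
    return image.url, image.revised_prompt


# Only calls the paid API when run as a script with a key configured.
if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    for style in ("natural", "vivid"):
        url, revised = generate("A Hawaiian lizard basking on a rock.", style)
        print(style, "->", revised)
        print(url)
```

Running the same short prompt under both styles and reading the two revised prompts side by side is the quickest way to see how much of the difference is already present in the rewritten text.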
You can write similar language tweaks to Bing, but it is also a black box with an input length limitation and without any further knobs to twiddle.
My verification is of the actual language I provide being sent to ChatGPT’s internal DALL-E method - and getting the same imagery style from both platforms.
I’ve come to accept that Dalle now has an artistic style and that’s just how it is. It’s disappointing if you’re expecting to be able to create photo realistic images, but there are other products around for that.
I’m wondering if OpenAI went with this style because it’s easier to produce a good result than to create great photorealistic images.
I don’t believe this to be the case. I know that via the API, DALL-E sometimes rewrites the prompt, but I have tried things like putting “DO NOT MODIFY THIS PROMPT” at the beginning, and after generating the image, the response shows me what the exact prompt was. I don’t believe there is further modification going on behind the scenes; they just changed the model.
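A rough way to check how much a prompt was changed is to compare what was sent with the revised prompt that came back. The sketch below uses only the standard library; the helper names with_prefix and similarity are made up for illustration. It only compares the two strings, so it cannot show whether anything else happens after the revised prompt is produced.

```python
import difflib

# Wording quoted in the post above; whether the model honors it is not guaranteed.
NO_REWRITE_PREFIX = "DO NOT MODIFY THIS PROMPT"


def with_prefix(prompt):
    """Prepend the no-rewrite instruction to a prompt."""
    return f"{NO_REWRITE_PREFIX}: {prompt}"


def similarity(sent, revised):
    """Return a 0..1 ratio; 1.0 means identical after case and whitespace normalization."""
    def normalize(text):
        return " ".join(text.lower().split())

    return difflib.SequenceMatcher(None, normalize(sent), normalize(revised)).ratio()
```

A ratio near 1.0 suggests the prompt went through more or less untouched, while a low ratio means it was substantially rewritten.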
“A Hawaiian lizard basking on a rock.”
quality="hd", size="1792x1024", style="natural" →
‘revised_prompt’: ‘A vivid image showcasing a Hawaiian lizard in its natural habitat. The lizard is endowed with vibrant detail; catching the sunlight with its prismatic, cold-blood body and basking in the warmth on a rough, worn, dark volcanic rock. The rock is comfortably nestled amidst a lush, tropical landscape where the sunlight peeking through the canopy dappling everything in its path. A slight ocean breeze rustles the verdant vegetation around, creating a serene, tranquil ambiance. The background subtly reveals the pristine blue waters of the nearby Pacific Ocean, the languid waves lapping against the shore can be faintly perceived.’
quality="hd", size="1792x1024", style="vivid" →
‘Capture an image of a Hawaiian lizard spread out leisurely on a large sunlit rock. The reptile basks under the glowing sun with its rough, scaly skin prominent. It is comfortably relaxing on the course, uneven surface of an earthy-hued boulder. The rock is strategically positioned in a tranquil setting, surrounded by lush green flora typical of the Hawaiian islands.’
Four cents per image might not seem like much, but it adds up very, very fast, especially when you consider how many iterations you sometimes need for a proper image. One can go through a lot of images. I suppose I’ll have to wait for other models like Stable Diffusion to catch up. Unfortunate.
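To put a number on "adds up fast", here is a small sketch of the arithmetic. The four cents figure is the one quoted above; actual pricing depends on size and quality settings, so treat it as an assumption. The function name and the example counts are hypothetical.

```python
CENTS_PER_IMAGE = 4  # "four cents per image", as quoted above; check current pricing


def total_cost_dollars(final_images, attempts_per_final, cents_per_image=CENTS_PER_IMAGE):
    """Cost in dollars when each keeper takes several generation attempts."""
    # Work in whole cents first to avoid accumulating float rounding error.
    cents = final_images * attempts_per_final * cents_per_image
    return cents / 100


print(total_cost_dollars(20, 15))
```

With those example numbers, 20 usable images at 15 attempts each means 300 generations, which comes to 12 dollars at four cents apiece.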