gpt-4-turbo slightly worse than gpt-4-turbo-preview when generating text based on images?

I have a prompt for generating documentation content starting from some initial text, images and instructions.

In my tests, the new ‘gpt-4-turbo’ performed slightly worse overall than ‘gpt-4-turbo-preview’. At least that was my impression. There were also instances where it performed better, but overall I was a bit disappointed with it (maybe because I expected it to be even better than the preview model).

For example, it created a list with only one bullet where the preview model rightly generated a paragraph. In another case, it didn’t create the expected XML markup (based on an example), where the preview model did.
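To make a comparison like this less dependent on impression, the two failure modes above could be checked automatically on each model's output. The following is only a minimal sketch using the Python standard library: the function names and the `note` tag are hypothetical, the bullet detection assumes simple `-`, `*` or `•` list markers on consecutive lines, and the XML check does not handle nested elements with the same tag.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: bullets are lines starting with -, * or • followed by a space.
BULLET = re.compile(r"^\s*[-*•]\s+\S")


def single_bullet_lists(text):
    """Return bullet lines that form a list of exactly one item."""
    found = []
    run = []
    # The trailing empty string closes a list that ends the text.
    for line in text.splitlines() + [""]:
        if BULLET.match(line):
            run.append(line)
        else:
            if len(run) == 1:
                found.append(run[0].strip())
            run = []
    return found


def has_xml_element(text, tag):
    """True if the text contains a well-formed <tag>...</tag> element."""
    name = re.escape(tag)
    match = re.search(rf"<{name}\b[^>]*>.*?</{name}>", text, re.DOTALL)
    if not match:
        return False
    try:
        ET.fromstring(match.group(0))
    except ET.ParseError:
        return False
    return True


# Hypothetical model output, for illustration only.
sample = 'Intro\n- only item\n<note type="tip">Hi</note>'
print(single_bullet_lists(sample))
print(has_xml_element(sample, "note"))
```

Running the same prompts through both models and counting how often each check fails would give a rough, repeatable number to set against the overall impression.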

In my experience, it does not follow my directions (on formatting and citation for text generation) nearly as well as the -preview model. I reverted immediately.