gpt-4o-mini-tts voice inconsistency between requests

I use TTS models to generate multi-speaker dialogs. Each line of dialog is generated in a separate request, using the same “instructions” every time for a given speaker. However, the voices are really inconsistent despite the identical instructions, and the resulting audio sounds as if there are many different speakers.
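For anyone trying to reproduce this, here is a rough sketch of the kind of setup I mean, not my exact code. The speaker names, instruction texts, and voice choices are made up for illustration, and `build_requests` is a hypothetical helper. It only builds the per-line request payloads (`model`, `voice`, `input`, `instructions`), so it runs without an API key.

```python
from dataclasses import dataclass

MODEL = "gpt-4o-mini-tts"


@dataclass(frozen=True)
class Speaker:
    # Settings that stay fixed for every line spoken by this speaker.
    voice: str
    instructions: str


# Hypothetical speakers; the names, voices, and instruction texts are placeholders.
SPEAKERS = {
    "Alice": Speaker(voice="alloy", instructions="Calm, warm tone, medium pace."),
    "Bob": Speaker(voice="onyx", instructions="Low, dry tone, slightly slow."),
}


def build_requests(dialog):
    """Return one request payload per dialog line, reusing each speaker's settings."""
    return [
        {
            "model": MODEL,
            "voice": SPEAKERS[name].voice,
            "instructions": SPEAKERS[name].instructions,
            "input": text,
        }
        for name, text in dialog
    ]


dialog = [("Alice", "Hi Bob."), ("Bob", "Hello."), ("Alice", "Ready to start?")]
payloads = build_requests(dialog)
```

Each payload would then go out as its own request, for example via `client.audio.speech.create(**payload)` if you use the OpenAI Python SDK. Even with identical `voice` and `instructions` across a speaker's lines, the rendered audio differs noticeably from one request to the next.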

This inconsistency makes the API unusable for my specific use case, so I'm switching back to TTS-1.

Have you tried improving the prompt?