I noticed that the TTS endpoint already appears in the API documentation (OpenAI Platform), but when trying to use it I received the following response: The model tts-1 does not exist or you do not have access to it.
Does anyone know how to get access to these models?
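For reference, this is roughly the call that triggers it for me (a stdlib-only sketch against the documented `/v1/audio/speech` endpoint; the voice name is just one of the examples from the docs):

```python
import json
import urllib.request
import urllib.error

def speech_payload(text: str, model: str = "tts-1", voice: str = "alloy") -> dict:
    # Request body per the documented /v1/audio/speech endpoint
    return {"model": model, "voice": voice, "input": text}

def tts(text: str, api_key: str) -> bytes:
    req = urllib.request.Request(
        "https://api.openai.com/v1/audio/speech",
        data=json.dumps(speech_payload(text)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()  # raw audio bytes on success
    except urllib.error.HTTPError as e:
        # Accounts without access see the "model does not exist" error body here
        raise RuntimeError(e.read().decode()) from e
```

The payload matches the docs as far as I can tell, so it really does look like an account-access issue rather than a request problem.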
I’m hitting the same wall with the TTS models. Followed the docs to the letter but still getting the ‘model does not exist or access not allowed’ message. If you find a way to get this working or hear back from OpenAI on this, I’d love to get a heads up!
So excited about this but I feel like latency will still be an issue for a use case where you’re trying to have real-time conversations. Would be great if we could have the option of telling the chat completions endpoint to return audio instead of text. Judging by how fast this is all moving I’m sure that’s a few weeks away.
Right, but it looks like we pass the input text to be spoken to this new endpoint (which for this use case would be the output of an LLM). So: user finishes speaking, pass that input to chat completion, take the result of the chat completion, and pass it to the text-to-speech endpoint. That’s the latency I’m worried about. Anyway, definitely getting closer.
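To make the two-hop latency concrete, here’s a minimal sketch of that pipeline (stdlib only, assuming the documented `/v1/chat/completions` and `/v1/audio/speech` request shapes; the HTTP helper is injectable so you can stub it out):

```python
import json
import urllib.request

API = "https://api.openai.com/v1"

def _post(path: str, payload: dict, api_key: str) -> bytes:
    req = urllib.request.Request(
        f"{API}/{path}",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def speak_reply(user_text: str, api_key: str, post=_post) -> bytes:
    """Two sequential round trips: chat completion first, then TTS on its output."""
    chat = json.loads(post("chat/completions", {
        "model": "gpt-4",
        "messages": [{"role": "user", "content": user_text}],
    }, api_key))
    reply = chat["choices"][0]["message"]["content"]
    # Second hop: the LLM's full text reply becomes the TTS input,
    # so total latency is the sum of both round trips.
    return post("audio/speech", {
        "model": "tts-1", "voice": "alloy", "input": reply,
    }, api_key)
```

Nothing overlaps here: the TTS request can’t start until the whole chat completion has come back, which is exactly why an audio-out option on chat completions would help.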
They did not say anything about supported languages. Although I’m a bit disappointed, I understand if it is English-only for a start, but maybe you should not pretend that the rest of the world does not exist.
Yes, I am having this issue: I am feeding tts-1 the stream from gpt-4 and it doesn’t work well. It only works well if you pass it an entire message and then stream the audio back, but, as you mentioned, the latency of doing that is the issue here.
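One workaround worth trying (just a sketch, not something OpenAI documents): buffer the gpt-4 token stream into complete sentences and send each sentence to tts-1 as soon as it finishes, so audio playback can start before the full message is done. A minimal sentence splitter:

```python
import re

def sentence_chunks(stream):
    """Yield complete sentences from an incremental token stream.

    `stream` is any iterable of text fragments (e.g. streamed LLM deltas).
    Each yielded chunk can be handed to the TTS endpoint immediately.
    """
    buf = ""
    for token in stream:
        buf += token
        # Flush whenever the buffer contains a sentence terminator
        # followed by whitespace.
        while (m := re.search(r"[.!?]\s", buf)):
            yield buf[: m.end()].strip()
            buf = buf[m.end():]
    if buf.strip():
        yield buf.strip()  # whatever is left at end of stream
```

You still pay one TTS round trip per sentence, but the first audio arrives after the first sentence instead of after the whole reply.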