Hi, I am trying to use the model `gpt-4o-audio-preview` to translate an English-language talk on meditation into several languages.
I called the API from Colab but got a response saying I have no access to the model. Is this model only available to 'beta' users, or to subscribers of ChatGPT Plus or other plans?
For straight translations with spoken output, that model is far too unreliable and expensive. It is an AI tuned to respond to spoken audio in the voice of a chat partner. Check out TTS on the audio endpoint instead: you can have an AI do an instructed text translation first, at well under 5% of the cost of using GPT-4o voice, and then evaluate how much of an American-foreigner accent each of the voice models imparts.
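A minimal sketch of that two-step pipeline, assuming the standard `/v1/chat/completions` and `/v1/audio/speech` REST endpoints, an `OPENAI_API_KEY` environment variable, and illustrative model/voice names (`gpt-4o-mini`, `tts-1`, `alloy`) — swap in whatever your account offers:

```python
import json
import os
import urllib.request

API_BASE = "https://api.openai.com/v1"
API_KEY = os.environ.get("OPENAI_API_KEY", "")  # assumed env var

def build_translation_request(text, target_language):
    """Payload for /v1/chat/completions: a cheap text model does the translation."""
    return {
        "model": "gpt-4o-mini",  # illustrative; any capable text model works
        "messages": [
            {"role": "system",
             "content": f"Translate the user's text into {target_language}. "
                        "Output only the translation."},
            {"role": "user", "content": text},
        ],
    }

def build_tts_request(translated_text):
    """Payload for /v1/audio/speech: a dedicated TTS model reads the translation."""
    return {
        "model": "tts-1",   # TTS model, far cheaper than gpt-4o audio output
        "voice": "alloy",   # try each voice to judge the accent it imparts
        "input": translated_text,
    }

def post(path, payload):
    """POST a JSON payload with the bearer key; return the raw response bytes."""
    req = urllib.request.Request(
        f"{API_BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

if API_KEY:  # only touch the network when a key is configured
    reply = json.loads(post("/chat/completions",
                            build_translation_request("Breathe in slowly.", "Spanish")))
    spanish = reply["choices"][0]["message"]["content"]
    with open("talk_es.mp3", "wb") as f:  # /v1/audio/speech returns audio bytes
        f.write(post("/audio/speech", build_tts_request(spanish)))
```

Keeping the translation step as plain text also lets you review or edit the translation before paying for any audio.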
There is no specific restriction on use; access is granted much like normal gpt-4o. You can check and expand your account's model rate limits and confirm the desired model is listed there. Generate another API key in a project, and you'll see the model listed as one you can enable or restrict for that project's API key.
One other thing to check: the denial may come from an old OpenAI SDK. If you're not sure whether an outdated library version is blocking the unknown model name, make direct JSON HTTPS calls instead.
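As a quick sanity check that bypasses any SDK entirely, you can hit the `/v1/models` endpoint with a raw HTTPS request and look for the model ID in the list your key can see — a sketch assuming an `OPENAI_API_KEY` environment variable:

```python
import json
import os
import urllib.request

API_KEY = os.environ.get("OPENAI_API_KEY", "")  # assumed env var

def model_ids(models_json):
    """Pull the model ID strings out of a /v1/models response body."""
    return [m["id"] for m in models_json.get("data", [])]

if API_KEY:  # only touch the network when a key is configured
    req = urllib.request.Request(
        "https://api.openai.com/v1/models",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    with urllib.request.urlopen(req) as resp:
        ids = model_ids(json.load(resp))
    print("gpt-4o-audio-preview" in ids)  # True if your key can see the model
```

If the model appears here but the SDK still rejects it, the library version is the problem, not your account.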
Thank you for your kind help and reply. I did not know that I had to purchase credits to use the API; it worked once I added credits. It is indeed expensive: four minutes of audio cost $3. I will follow your advice and check out the audio models. Thank you.