What does the temperature parameter in the Whisper Create transcription API call do?

The documentation for Create transcription describes a temperature parameter as follows: “The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.”

In the case of ChatGPT this makes sense. You ask it a question, and the higher the temperature, the more likely you are to get a different response each time.

But in the case of transcriptions, ideally what you’d get back would be deterministic.

Maybe an example could be provided of an audio clip being transcribed multiple times with a higher temperature?
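
To make the request concrete, here is a rough sketch of the kind of experiment I have in mind: transcribe the same clip several times at each temperature and count how many distinct transcripts come back. It assumes a client object shaped like the one in the official `openai` Python package (v1 or later), where `client.audio.transcriptions.create(...)` returns an object with a `.text` attribute. The function name `transcribe_at_temperatures` and its defaults are made up for illustration. I have not run this against real audio, so no results are shown.

```python
from collections import Counter


def transcribe_at_temperatures(client, audio_path, temperatures=(0.0, 0.4, 0.8), runs=3):
    """Transcribe one clip `runs` times per temperature and tally distinct outputs.

    `client` is assumed to expose `audio.transcriptions.create(...)` the way
    the `openai` package's `OpenAI()` client does; this helper is hypothetical.
    """
    results = {}
    for temperature in temperatures:
        texts = []
        for _ in range(runs):
            # Reopen the file for every request so each upload starts at byte 0.
            with open(audio_path, "rb") as audio_file:
                response = client.audio.transcriptions.create(
                    model="whisper-1",
                    file=audio_file,
                    temperature=temperature,
                )
            texts.append(response.text)
        # Counter maps each distinct transcript to how many times it appeared.
        results[temperature] = Counter(texts)
    return results
```

With the real package this would be called roughly as `transcribe_at_temperatures(OpenAI(), "clip.mp3")`, where `OpenAI` comes from `from openai import OpenAI`, the API key is read from the `OPENAI_API_KEY` environment variable, and `clip.mp3` is a placeholder file name. If temperature matters for transcription, the higher-temperature entries should presumably contain more than one distinct transcript.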
