Language code 'te' for Telugu is not recognised: getting "Language 'te' is not supported" error

Description:
When I try to convert Telugu speech to text using the OpenAI Python SDK, I get the following error:

Error code: 400 - {'error': {'message': "Language 'te' is not supported.", 'type': 'invalid_request_error', 'param': 'language', 'code': 'unsupported_language'}}

Steps to reproduce:

  1. Record audio in Telugu.
  2. Using the OpenAI Python SDK, call audio.transcriptions.create with language="te".
  3. The call fails with the error message: Language 'te' is not supported.

Sample Code:

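from openai import OpenAI

filepath = "telugu_speech.mp3"
client = OpenAI(api_key="API_KEY")
# Open the recording in binary mode and request a plain-text transcription
with open(filepath, "rb") as f:
    client.audio.transcriptions.create(model="whisper-1", file=f, response_format="text", language="te")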

Output:

Error code: 400 - {'error': {'message': "Language 'te' is not supported.", 'type': 'invalid_request_error', 'param': 'language', 'code': 'unsupported_language'}}

Reference links:
According to the audio.transcriptions.create function documentation, 'te' is a valid language code for Telugu. The docstring says:

          language: The language of the input audio. Supplying the input language in
              [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) format will
              improve accuracy and latency.

It may be a valid ISO-639-1 code, but it is one of 125 ISO codes that Whisper does not accept as a language specification.

It may be that there is simply nothing to "trigger" that would improve inference on additional languages, and that the training set contains very little data for them.
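One possible workaround, sketched here under assumptions, is to retry the request without the language parameter when the API rejects the code, so that the model has to detect the language by itself. The helper name transcribe_with_fallback is hypothetical, and the sketch assumes the SDK exception exposes the API's error code (the 'unsupported_language' value seen in the error above) as a .code attribute. Whether the resulting Telugu transcription is usable is not something this thread establishes.

```python
def transcribe_with_fallback(client, path, language=None, model="whisper-1"):
    """Transcribe `path`; if the language hint is rejected, retry without it.

    `client` is expected to look like an openai.OpenAI instance
    (hypothetical helper, not part of the SDK).
    """

    def _call(**extra):
        # Reopen the file for every attempt so each request reads from the start
        with open(path, "rb") as f:
            return client.audio.transcriptions.create(
                model=model, file=f, response_format="text", **extra
            )

    if language is None:
        return _call()

    try:
        return _call(language=language)
    except Exception as exc:
        # Assumption: the SDK error carries the API error code as `.code`.
        # Anything other than an unsupported-language rejection is re-raised.
        if getattr(exc, "code", None) != "unsupported_language":
            raise
        return _call()


# Usage (requires the openai package and a real API key):
#   from openai import OpenAI
#   client = OpenAI(api_key="API_KEY")
#   text = transcribe_with_fallback(client, "telugu_speech.mp3", language="te")
```

Checking the error code rather than catching every failure keeps unrelated problems (bad API key, unreadable file, rate limits) visible instead of silently retrying them.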
