Language code for Telugu 'te' is not recognised. Getting "Language 'te' is not supported" error

lankesh.87 · February 23, 2024, 12:02pm

Description:
When I try to convert Telugu speech to text using the OpenAI Python SDK, I get the following error:

Error code: 400 - {'error': {'message': "Language 'te' is not supported.", 'type': 'invalid_request_error', 'param': 'language', 'code': 'unsupported_language'}}

Steps to reproduce:

Record audio in Telugu
Using the OpenAI Python SDK, call audio.transcriptions.create with language="te"
The call fails with the following error message: Language 'te' is not supported

Sample Code:

from openai import OpenAI

filepath = "telugu_speech.mp3"
client = OpenAI(api_key="API_KEY")
with open(filepath, "rb") as f:
    # Passing language="te" is what triggers the 400 error
    client.audio.transcriptions.create(model="whisper-1", file=f, response_format="text", language="te")

Output:

Error code: 400 - {'error': {'message': "Language 'te' is not supported.", 'type': 'invalid_request_error', 'param': 'language', 'code': 'unsupported_language'}}

Reference links:
According to the documentation for the audio.transcriptions.create function, te is a valid language code for Telugu. See this excerpt:

          language: The language of the input audio. Supplying the input language in
              [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) format will
              improve accuracy and latency.

_j · February 23, 2024, 12:50pm

It may be a valid ISO code, but it is one of 125 ISO codes that Whisper does not accept as a language specification.

Whisper (transcribe) API verbose_json results, format of language property? Documentation

I upped that call count to 183 for all of ISO space. To get 57 valid abbreviations and languages. ['af', 'ar', 'hy', 'az', 'be', 'bs', 'bg', 'ca', 'zh', 'hr', 'cs', 'da', 'nl', 'en', 'et', 'fi', 'fr', 'gl', 'de', 'el', 'he', 'hi', 'hu', 'is', 'id', 'it', 'ja', 'kn', 'kk', 'ko', 'lv', 'lt', 'mk', 'ms', 'mi', 'mr', 'ne', 'no', 'fa', 'pl', 'pt', 'ro', 'ru', 'sr', 'sk', 'sl', 'es', 'sw', 'sv', 'tl', 'ta', 'th', 'tr', 'uk', 'ur', 'vi', 'cy'] ['afrikaans', 'arabic', 'armenian', 'azerbaijani', 'bela…

It may be that there is simply nothing to "trigger" that would improve inference on the additional languages, and that the training set contains very little material in them.
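One way to avoid the 400 error is to pass the language parameter only when the code is in the list of 57 quoted above, and to omit it otherwise so the API falls back to its own language detection. The following is a minimal sketch of that idea, not an official mechanism: language_kwargs and transcribe are hypothetical helper names, the code set is a snapshot copied from the quoted list and may change, and omitting the hint says nothing about how good the resulting Telugu transcript will be.

```python
# Snapshot of the 57 codes quoted above; an assumption, not an official table.
SUPPORTED_LANGUAGE_CODES = frozenset({
    "af", "ar", "hy", "az", "be", "bs", "bg", "ca", "zh", "hr", "cs", "da",
    "nl", "en", "et", "fi", "fr", "gl", "de", "el", "he", "hi", "hu", "is",
    "id", "it", "ja", "kn", "kk", "ko", "lv", "lt", "mk", "ms", "mi", "mr",
    "ne", "no", "fa", "pl", "pt", "ro", "ru", "sr", "sk", "sl", "es", "sw",
    "sv", "tl", "ta", "th", "tr", "uk", "ur", "vi", "cy",
})


def language_kwargs(code):
    """Return {'language': code} only if the code is in the snapshot above.

    Hypothetical helper: for any other code (such as 'te') it returns an
    empty dict, so the request is sent without a language hint.
    """
    normalized = (code or "").strip().lower()
    if normalized in SUPPORTED_LANGUAGE_CODES:
        return {"language": normalized}
    return {}


def transcribe(filepath, code=None, api_key="API_KEY"):
    """Transcribe a file, passing `language` only when it is in the snapshot."""
    # Imported here so language_kwargs can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key=api_key)
    with open(filepath, "rb") as f:
        return client.audio.transcriptions.create(
            model="whisper-1",
            file=f,
            response_format="text",
            **language_kwargs(code),
        )
```

With this, transcribe("telugu_speech.mp3", code="te") sends the request without a language argument instead of failing, while a code such as "ta" is passed through unchanged.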
