How can I get word-level segments from the API?

Hi everyone. I would like to know if word-level segmentation is already available in the Whisper API (even if it's in experimental mode). If not, does anyone know whether it will be launched, and if so, when?

There is no direct method to get back more than a string.

I don't see why it would be needed on their end; you can split the returned text into words yourself:

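import os
import openai

# Legacy (pre-1.0) openai SDK interface.
openai.api_key = os.getenv("OPENAI_API_KEY")

# Use a context manager so the audio file handle is closed afterwards.
with open("audio.mp3", "rb") as audio_file:
    transcript = openai.Audio.transcribe("whisper-1", audio_file)

# Split on whitespace; punctuation stays attached to each word.
word_list = transcript["text"].split()
print(word_list)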

['This', 'is', 'a', 'radio', 'show', 'where', 'people', 'call', 'us', 'and', 'ask', 'us', 'questions', 'about', 'cars,', 'right?', 'And', 'what', 'were', 'we', 'just', 'talking', 'about', 'before', 'the', 'mics', 'came', 'on?', 'We', 'were', 'both', 'talking', 'about', "what's", 'wrong', 'with', 'our', 'respective', 'vehicles.', 'This', 'has', 'happened', 'in', 'the', 'mind', 'that', 'charges', 'their', 'systems', "aren't", 'working.', "It's", 'pretty', 'sad.', 'Well,', 'the', 'real', 'question', 'is,', 'who', 'do', 'we', 'call?', 'Who', 'do', 'we', 'call?', 'I', 'call', 'you', 'when', 'I', 'have', 'a', 'problem.']
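
Splitting the text gives words but no timing. If what you need is rough per-word timing, one possible workaround is sketched below. It assumes the transcription was requested in a format that returns timed segments (for example response_format="verbose_json"), and that each segment is a dict with "start", "end" and "text" keys. The helper name approximate_word_times and the sample data are made up for illustration. It only spreads each segment's duration over its words in proportion to their length, so treat the result as an estimate and not as real word-level alignment.

```python
from typing import Dict, List


def approximate_word_times(segments: List[Dict]) -> List[Dict]:
    """Spread each segment's duration over its words, weighted by word length.

    Hypothetical helper: assumes each segment has "start", "end" (seconds)
    and "text" keys. The timings it produces are estimates only.
    """
    words = []
    for seg in segments:
        tokens = seg["text"].split()
        total_chars = sum(len(tok) for tok in tokens)
        if total_chars == 0:
            continue  # nothing to time in an empty segment
        duration = seg["end"] - seg["start"]
        cursor = seg["start"]
        for tok in tokens:
            # Longer words get a proportionally larger slice of the segment.
            share = duration * len(tok) / total_chars
            words.append({"word": tok, "start": cursor, "end": cursor + share})
            cursor += share
    return words


# Hand-made sample in the assumed segment shape (not real API output).
sample_segments = [
    {"start": 0.0, "end": 4.0, "text": " This is a radio show"},
]
for entry in approximate_word_times(sample_segments):
    print(entry)
```

Weighting by character count is a crude stand-in for speech duration; pauses and speaking rate are ignored, so the error grows with segment length.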