I am using the AzureOpenAI client and cannot get word-level timestamp granularity from it, but the same request through the OpenAI client returns the word data properly.
Here is a short code snippet:
from openai import OpenAI, AzureOpenAI
openai_client = OpenAI()
azure_client = AzureOpenAI(
    api_key=api_key,
    api_version="2024-02-15-preview",
    azure_endpoint=azure_endpoint,
)
audio_file = open('test.m4a', "rb")
azure_transcript = azure_client.audio.transcriptions.create(
    file=audio_file,
    model="whisper-1",
    response_format="verbose_json",
    timestamp_granularities=["word"],
)
openai_transcript = openai_client.audio.transcriptions.create(
    file=audio_file,
    model="whisper-1",
    response_format="verbose_json",
    timestamp_granularities=["word"],
)
print(openai_transcript.words)  # ==> gives correct results
print(azure_transcript.words)  # ==> returns None; azure_transcript.segments has data, but it is segment-level, not word-level
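
To make the difference easier to see, here is a minimal sketch of a check that reports which timestamp level a response actually carries. The helper name `timestamp_level` is hypothetical (it is not part of the openai library), and the two `SimpleNamespace` objects are made-up stand-ins for the responses described above, not real API output.

```python
from types import SimpleNamespace


def timestamp_level(transcript):
    """Return 'word', 'segment', or 'none' depending on which timestamps are present."""
    # getattr with a default also covers responses where the attribute is missing entirely.
    if getattr(transcript, "words", None):
        return "word"
    if getattr(transcript, "segments", None):
        return "segment"
    return "none"


# Stand-ins shaped like the two results in the report (illustrative values only).
openai_like = SimpleNamespace(
    words=[{"word": "hello", "start": 0.0, "end": 0.4}],
    segments=None,
)
azure_like = SimpleNamespace(
    words=None,
    segments=[{"text": "hello world", "start": 0.0, "end": 1.2}],
)

print(timestamp_level(openai_like))
print(timestamp_level(azure_like))
```

Calling `timestamp_level(azure_transcript)` and `timestamp_level(openai_transcript)` on the real responses should show the same mismatch I am describing, assuming the response objects expose `words` and `segments` as attributes the way they do in my snippet.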