Hi there!
I know this is a slightly odd question, but I'm using gpt-4o-audio-preview as follows:
- Single audio file input
- Streaming text output
Here is the relevant part of my code:
response = await aclient.chat.completions.create(
    model="gpt-4o-audio-preview",
    modalities=["text", "audio"],
    audio={"voice": "alloy", "format": "pcm16"},
    messages=[
        {"role": "system", "content": [{"type": "text", "text": "REDACTED PROMPT"}]},
        {"role": "user", "content": [
            {"type": "input_audio", "input_audio": {
                "data": encoded_audio,
                "format": "wav"
            }}
        ]}
    ],
    stream=True
)
When I examine the chunks returned by the API, I see that it returns both transcript (text) chunks and audio chunks.
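To make that concrete, here is a minimal sketch of how the two kinds of chunk could be told apart. It assumes each streamed delta, once converted to a plain dict, carries an optional "audio" object with a "transcript" string and/or base64-encoded "data"; that shape, and the helper names split_audio_delta and collect_transcript, are assumptions for illustration rather than confirmed SDK structures.

```python
import base64


def split_audio_delta(delta: dict) -> tuple[str, bytes]:
    """Return (transcript_text, raw_audio_bytes) for one streamed delta.

    Assumed shape (not confirmed against the SDK):
        {"audio": {"transcript": "...", "data": "<base64 audio>"}}
    Either key, or the whole "audio" object, may be missing.
    """
    audio = delta.get("audio") or {}
    transcript = audio.get("transcript") or ""
    data = audio.get("data")
    pcm = base64.b64decode(data) if data else b""
    return transcript, pcm


def collect_transcript(deltas: list[dict]) -> tuple[str, int]:
    """Join the transcript pieces and count the audio bytes that arrived."""
    parts = []
    audio_bytes = 0
    for delta in deltas:
        text, pcm = split_audio_delta(delta)
        parts.append(text)
        audio_bytes += len(pcm)
    return "".join(parts), audio_bytes
```

Running collect_transcript over the deltas of one response would give the full transcript plus the number of audio bytes that were streamed alongside it, which is the part I would like to avoid.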
Since I have no use for the audio chunks, is there a way to stop the API from returning them, so that I don't pay for them? I'm only interested in the text response.
Thank you!