The audio ID itself cannot be replayed or recovered from. The assistant's audio output is stored server-side solely so a conversation can be continued, and it expires.
The likely reason for this ID system for chat-history audio is that OpenAI doesn't want developers placing their own audio in API requests as the voice or messages the assistant supposedly responded with, which could be used to steer output via in-context learning. The expiration also breaks long-term chat continuations.
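For multi-turn use, the ID is sent back in place of the audio itself inside the assistant message. A minimal sketch, where `"audio_abc123"` is a placeholder, not a real ID:

```python
# Hypothetical assistant turn referencing server-side audio by ID.
# Once the ID expires, a request containing it will fail.
assistant_turn = {
    "role": "assistant",
    "audio": {"id": "audio_abc123"},  # placeholder value
}

chat_history = [
    {"role": "user", "content": "Tell me a joke."},
    assistant_turn,
    {"role": "user", "content": "Another one."},
]
```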
You'll have to save the original response message and its generated audio part yourself if you want a chat UI to replay what was previously spoken.
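One way to do that is to decode the base64 audio to disk and keep an app-local pointer to it alongside the turn. A sketch; `store_turn`, `wav_path`, and the `audio_file` field are illustrative app-side names, not part of the API:

```python
import base64

def store_turn(history, reply_text, transcript, audio_b64, wav_path):
    """Append an assistant turn to local history and persist its audio.

    The 'audio_file' field is kept locally for replay in your UI;
    strip it before sending the history back to the API.
    """
    if audio_b64:
        with open(wav_path, "wb") as f:
            f.write(base64.b64decode(audio_b64))
    history.append({
        "role": "assistant",
        "content": reply_text or transcript or "",
        "audio_file": wav_path if audio_b64 else None,
    })
    return history
```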
For that response-data collection, as an example, I simply swapped in an audio extractor in place of the streaming tool/function/object collection in a plain Python httpx request (not the OpenAI SDK):
if response.status_code != 200:
    print(f"HTTP error {response.status_code}: {response.text}")
    # retry/reprompt
    continue
else:
    print("API request: success")
response_content = b''
for chunk in response.iter_bytes(chunk_size=8192):
    if chunk:
        response_content += chunk
response_data = json.loads(response_content.decode('utf-8'))
if 'choices' in response_data and response_data['choices']:
    print("-- choices list received --")
    choice = response_data['choices'][0]['message']
    reply = choice.get('content', "")
    audio_data = choice.get('audio', {})
    audio_base64 = audio_data.get('data', "")
    transcript = audio_data.get('transcript', "")
    print(reply if reply is not None else '', transcript if transcript is not None else '')
    print("\n", response_data.get('usage', {}))
    if audio_base64:
        save_and_play_audio(audio_base64, VOICE)
        # use the ID if you really want, I don't
    chat.append({"role": "assistant", "content": reply or transcript or ""})
    user_input = input("\nPrompt: ")
    user_message = {"role": "user", "content": user_input}
    chat.append(user_message)
else:
    print("No valid response received.")
...
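For completeness, a minimal sketch of the `save_and_play_audio` helper called above, assuming you requested `"format": "wav"` audio in the API call and have a playback library such as simpleaudio installed (the filename scheme here is arbitrary):

```python
import base64
import time

def save_and_play_audio(audio_base64, voice, playback=True):
    # Decode the base64 WAV payload from message.audio.data and save it.
    wav_bytes = base64.b64decode(audio_base64)
    filename = f"reply_{voice}_{int(time.time())}.wav"
    with open(filename, "wb") as f:
        f.write(wav_bytes)
    if playback:
        # simpleaudio is one option; substitute your preferred player
        import simpleaudio
        simpleaudio.WaveObject.from_wave_file(filename).play().wait_done()
    return filename
```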