Hi,
I was very impressed by the demos of the GPT-4 Omni model and had a query regarding its capabilities.
I am currently exploring the use of the Speech-to-Text transcribe functionality as input for GPT-3.5-turbo completions. At present, this requires two separate API requests:
- Transcription
- Prompt Completion
I was wondering if it is possible, or if there are plans to implement, a way to combine these steps into a single API request, similar to the examples provided in the Vision API documentation: https://platform.openai.com/docs/guides/vision
Thank you for your time and assistance.
Best regards,
Roderik Steenbergen