What’s the difference between o1-pro and o1 with high reasoning_effort?
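For context, the setting I mean, as a minimal sketch (assuming the documented low/medium/high values):

```python
from openai import OpenAI

client = OpenAI()

# reasoning_effort accepts "low" | "medium" | "high" on o1-series models;
# the question is how "high" compares to whatever o1-pro does differently.
response = client.chat.completions.create(
    model="o1",
    reasoning_effort="high",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(response.choices[0].message.content)
```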
What is the future of the Assistants API?
Specifically, I’m looking for:
- Better support / custom libraries / sandboxing for Code Interpreter
- Support for passing a thread between assistants - it was disappointing that Swarm wasn’t built on the Assistants API
- More agentic capabilities - goals, planning, memory, task handoff between assistants. I want it to go away and do something (transparently) and come back with the “complete” output.
- More transparency into the thinking tokens in o1 - a summary would be fine
- The ability for o1 to use function calling in the thinking tokens
Have you tried generating your own datasets for DPO using o1?
Great work, team!
Do you plan to keep lowering the cost of the Realtime API? A 60% price drop is simply amazing, but the costs are still too high for some of our products.
Quick question:
- Any plans to reduce gpt-4o-mini pricing to be competitive with Gemini Flash 1.5 / 2.0 (assuming 2.0 will be just as competitive in pricing)?
Thanks!
Are you expecting responses by video or text?
-
Will OpenAI also support the “prefill” for greater output control that Claude already has? It seems quite easy to implement, or is there a reason OpenAI does not support this?
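For anyone unfamiliar, here is roughly what prefill looks like on Anthropic’s Messages API, as a sketch: a trailing assistant message is continued rather than answered fresh.

```python
import anthropic

client = anthropic.Anthropic()

# The final assistant message acts as a "prefill": the model continues from it,
# so the completion is forced to start as a JSON array here.
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[
        {"role": "user", "content": "List three primary colors as a JSON array."},
        {"role": "assistant", "content": "["},  # prefill; output continues after "["
    ],
)
print("[" + message.content[0].text)  # re-attach the prefill to the continuation
```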
-
For direct preference optimization (DPO), is it better to sample the preferred and non-preferred responses from the model being fine-tuned, or does it not matter where they come from? It seems more logical to sample them from the same model, but it can be hard to get two responses in clearly different styles.
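To illustrate the same-model approach, a sketch (the model name and temperatures are arbitrary choices on my part, not a recommendation):

```python
from openai import OpenAI

client = OpenAI()

def sample_pair(prompt: str, model: str = "gpt-4o-mini") -> tuple[str, str]:
    """Draw two candidates from the same model for later preference labeling."""
    candidates = []
    for temperature in (0.3, 1.2):  # low vs. high temperature to push the styles apart
        response = client.chat.completions.create(
            model=model,
            temperature=temperature,
            messages=[{"role": "user", "content": prompt}],
        )
        candidates.append(response.choices[0].message.content)
    return candidates[0], candidates[1]
```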
Will function calling in the OpenAI API and GPT Builder Actions start working with the multipart/form-data content type, to support sending files to a developer’s API?
In the Playground, the beta feature for designing a JSON schema is really useful. Could it be exposed to us as an endpoint? Maybe we could then hook it up to a custom GPT as a function call so we can “chat with” our responseFormat.
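Something like this hypothetical tool definition is what I’m imagining (nothing here exists today; design_json_schema is a made-up name):

```python
# Hypothetical: expose the Playground's schema designer to a custom GPT as a
# function call, passed via tools=[schema_designer_tool].
schema_designer_tool = {
    "type": "function",
    "function": {
        "name": "design_json_schema",  # made-up endpoint/function name
        "description": "Draft or revise a JSON schema for response_format from a natural-language spec.",
        "parameters": {
            "type": "object",
            "properties": {
                "instructions": {"type": "string", "description": "What the schema should capture."},
                "current_schema": {"type": "string", "description": "Existing schema to revise, if any."},
            },
            "required": ["instructions"],
        },
    },
}
```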
They also shipped a 4o-mini Realtime API, which is like 10x cheaper. AMAZING
I tried the Realtime API a few days ago in Portuguese and unfortunately the results were pretty bad. I assume it uses Whisper small under the hood, which makes it useless in Portuguese. Should we expect better performance with today’s release?
Great mini dev day!
Would love more clarity on using system messages vs. developer messages in the API. Has the system message gone away, or are developer messages an addition to it?
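For reference, the shape I’m asking about, as a sketch (my assumption is that a system message sent to an o1-series model is simply treated as a developer message, which is exactly the part I’d like confirmed):

```python
from openai import OpenAI

client = OpenAI()

# "developer" is the new top-level instruction role for o1-series models;
# assumption: a "system" message here would be mapped to it behind the scenes.
response = client.chat.completions.create(
    model="o1",
    messages=[
        {"role": "developer", "content": "You are a terse SQL tutor."},
        {"role": "user", "content": "Explain a LEFT JOIN in one sentence."},
    ],
)
print(response.choices[0].message.content)
```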
-
Do we expect o1 and the Realtime API to eventually converge? Or, if we want an audio model that’s smart, will we need to call o1 as a tool from the Realtime API?
-
Related: is it on the roadmap for the Realtime API to take video input, or for the o1 API to take audio or video input (similar to Gemini 2.0)?
Would it be possible to automate the DPO (Direct Preference Optimization) training data collection process? I’m considering two approaches:
- Using a fine-tuned model as a preference judge
- Using an OpenAI model as a general preference judge
The workflow would be:
- Input: List of prompts
- Output: Multiple model responses per prompt
- Judge model: Assigns preferences between responses
- Result: Automated DPO training dataset
Additionally, can DPO handle multi-turn conversations where we provide a complete dialogue (user-assistant-user-assistant) with only the final assistant response varying?
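Here is a sketch of the pipeline I have in mind. The JSONL shape follows the published DPO fine-tuning format as I understand it; the judge prompt and helper names are my own:

```python
import json

from openai import OpenAI

client = OpenAI()

def judge(prompt: str, a: str, b: str) -> str:
    """Ask a judge model which response is preferred; returns 'A' or 'B'."""
    verdict = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Prompt:\n{prompt}\n\nResponse A:\n{a}\n\nResponse B:\n{b}\n\n"
                       "Which response is better? Answer with exactly 'A' or 'B'.",
        }],
    )
    return verdict.choices[0].message.content.strip()

def to_dpo_record(prompt: str, a: str, b: str) -> dict:
    """Build one DPO training line: shared input plus preferred/non-preferred outputs."""
    preferred, rejected = (a, b) if judge(prompt, a, b) == "A" else (b, a)
    return {
        "input": {"messages": [{"role": "user", "content": prompt}]},
        "preferred_output": [{"role": "assistant", "content": preferred}],
        "non_preferred_output": [{"role": "assistant", "content": rejected}],
    }

# One record end to end, sampling at two temperatures for variety:
prompt = "Explain DPO in one paragraph."
samples = [
    client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=t,
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content
    for t in (0.4, 1.1)
]
print(json.dumps(to_dpo_record(prompt, samples[0], samples[1])))
```

For the multi-turn case, I would expect the full dialogue to go in input.messages, with only the final assistant turn split across preferred_output / non_preferred_output, but that is the part I would like confirmed.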
When should we expect price reductions for Whisper, and an update to Whisper v3 (which you guys launched more than a year ago) in the API? Please send Whisper some love.
Any plans to make prompt caching better? Specifically: let us specify what content to cache, give it a name, and reference that cache in subsequent API calls for a set duration. I know another company is doing this.
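For illustration, the explicit style I’m describing looks roughly like this elsewhere (a sketch based on my understanding of Gemini’s context caching via the google-generativeai SDK; the details may be off):

```python
import datetime

import google.generativeai as genai
from google.generativeai import caching

big_context = "..."  # imagine many thousands of tokens of shared reference material

# Cache the shared prefix once, give it a name, and set how long it lives.
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    display_name="product-docs-v1",      # the "variable name" for the cache
    contents=[big_context],
    ttl=datetime.timedelta(minutes=30),  # referenceable for this duration
)

# Subsequent calls reuse the cached prefix instead of resending it.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("Summarize section 2 of the docs.")
print(response.text)
```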
Great updates to the Realtime API.
- Any timeline for when the issue with the realtime model dropping out of voice mode when reconstructing conversation history will be resolved?
- Also, the audio randomly getting cut off at the end from time to time.
These two are probably the most critical issues of the API right now.
Well, we got o1 with vision via the API (with reasoning_effort params). I’m happy :') ty guys
We don’t have plans for a Sora API yet, but we’d love to hear more! What will you build if we ship one?