AMA on the 17th of December with OpenAI's API Team: Post Your Questions Here

When should we expect price reductions for Whisper, and an API update for Whisper v3 (which you guys launched more than a year ago)? Please send Whisper some love.

Any plans to improve prompt caching? Specifically: let us specify which content to cache, give it a variable name, and reference that cache in subsequent API calls for a set duration. I know another company is doing this :slight_smile:
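Right now, as I understand it, caching is automatic and prefix-based; here's a rough sketch of how hits surface today (assuming the openai Node SDK; `LONG_SHARED_INSTRUCTIONS` is just a placeholder for a long, stable prompt):

```ts
import OpenAI from "openai";

const client = new OpenAI();

// Placeholder for a long, stable system prompt reused across calls.
const LONG_SHARED_INSTRUCTIONS = "...";

// Caching today is automatic and prefix-based: a repeated prompt prefix
// may be cached, but there's no way to name or pin a cache entry yourself.
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: LONG_SHARED_INSTRUCTIONS }, // same prefix every call
    { role: "user", content: "Summarize today's tickets." },
  ],
});

// Cache hits only surface after the fact, in the usage details.
console.log(response.usage?.prompt_tokens_details?.cached_tokens);
```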

Great updates on the Realtime API.

  • Any timeline for resolving the issue where the realtime model drops out of voice mode when reconstructing conversation history?
  • Also, the audio randomly getting cut off at the end from time to time.

These two are probably the most critical issues with the API right now.
2 Likes

Well, we got o1 with vision via the API (with reasoning guide params). I’m happy :') ty guys

Hey Lou, sorry about this unfortunate behavior. This can happen when the model's outputs are sufficiently improbable that the only valid token under your schema is the newline token. It's not really a bug, in the sense that an engineering change can't fix it. When this happens, I'd suggest using a simpler schema or changing your prompting so it's clearer to the model what it should respond with. Let me know if this helps!
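For instance, a minimal sketch of what a flatter schema plus an explicit instruction might look like with strict structured outputs (the schema and prompt here are illustrative, not your actual ones):

```ts
import OpenAI from "openai";

const client = new OpenAI();

// A flat, single-field schema leaves the model fewer "dead ends" where
// only a newline token is valid under the constrained grammar.
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    {
      role: "system",
      content: "Answer in one short sentence. Always fill the `answer` field.",
    },
    { role: "user", content: "What does HTTP 429 mean?" },
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "simple_answer",
      strict: true,
      schema: {
        type: "object",
        properties: { answer: { type: "string" } },
        required: ["answer"],
        additionalProperties: false,
      },
    },
  },
});

console.log(response.choices[0].message.content); // {"answer": "..."}
```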
6 Likes

We don’t have plans for a Sora API yet, but we’d love to hear more! What will you build if we ship one? :slight_smile:

8 Likes

Since this is an AMA for the API team, I wonder: do you guys have plans to release documentation examples of frontend microphone usage in different languages and frameworks, published as working examples in a GitHub repository?

I feel like that would help many of us build more reliable features and integrate faster.
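To illustrate the kind of example I mean, here's a minimal framework-free browser sketch using plain Web APIs (the chunk handler is just a placeholder for wherever the audio goes next):

```ts
// Minimal browser-side microphone capture (plain Web APIs, no framework).
// Each audio chunk could then be forwarded to a backend or a realtime socket.
async function captureMicrophone(
  onChunk: (chunk: Blob) => void
): Promise<() => void> {
  // Ask the user for microphone access.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

  // MediaRecorder emits compressed chunks (e.g. audio/webm) at the given interval.
  const recorder = new MediaRecorder(stream);
  recorder.ondataavailable = (event) => {
    if (event.data.size > 0) onChunk(event.data);
  };
  recorder.start(250); // one chunk every 250 ms

  // Return a stop function that also releases the microphone.
  return () => {
    recorder.stop();
    stream.getTracks().forEach((track) => track.stop());
  };
}

// Usage: const stop = await captureMicrophone((chunk) => console.log(chunk.size));
```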

1 Like

We’re seeing a lot of developers building with the Assistants API, and we’re continuing to invest in the tools developers need to build agents. Next year, we plan to bring o1 to the Assistants API and will have more to share soon!

10 Likes

Ooh. I would love to automate building relaxing, tranquil looping background videos to go alongside my custom-made music.

I’d also like to generate N videos for a prompt and be able to approve them for future stock use. Maybe even incorporate a vision model to rank them before they’re sent to me for approval.

3 Likes

Can’t wait for o1 with vision!
Really think ANY business can profit from this, since all of them deal with badly formatted, scanned-in, or hand-annotated PDFs.

Looking forward to replacing a pipeline of 20 LLM calls with a single call.

That leads me to my question:
With the release of better models (GPT-3 > GPT-4 > GPT-4o > o1), do you see a drop in overall tokens used via the API, as people replace longer prompts or multiple inference calls with simpler prompts and fewer calls?

1 Like

What is OpenAI’s view on the Model Context Protocol?

It would be great if everyone were on the same page about how we’ll build external connections to OpenAI APIs going forward.

Ideally you guys would come out and support it.

3 Likes

More generally, what are we as developers not doing as much as you think we should? What do you wish we did differently, or more or less of? We take constructive criticism too :upside_down_face:

2 Likes

For Sora? An o1-generated story or multiple scenes, plus a chain of requests to a Sora API.

1 Like

Nothing to share yet on Whisper v3 in the API. But for both audio understanding and TTS, do check out the new GPT-4o mini audio preview model. It's got state-of-the-art speech understanding, and you can prompt the model directly to control how it hears and speaks! For example, give it a prompt like: "Say the following in a somber tone, and make sure to pause your speech appropriately: "
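A minimal sketch of calling it with the Node SDK (the voice, output format, and spoken sentence here are just illustrative choices):

```ts
import OpenAI from "openai";
import { writeFileSync } from "node:fs";

const client = new OpenAI();

const response = await client.chat.completions.create({
  model: "gpt-4o-mini-audio-preview",
  modalities: ["text", "audio"], // ask for spoken output alongside text
  audio: { voice: "alloy", format: "wav" },
  messages: [
    {
      role: "user",
      content:
        'Say the following in a somber tone, and make sure to pause your speech appropriately: "The library closes at dusk."',
    },
  ],
});

// The spoken audio comes back base64-encoded on the message.
const audio = response.choices[0].message.audio;
if (audio) writeFileSync("speech.wav", Buffer.from(audio.data, "base64"));
```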

6 Likes

If you wish to discuss a response in detail, please create a new thread with a link to the reply in the body (using the :link: icon), to avoid filling up the AMA thread.

2 Likes

One more for the Assistants API: it would be really great if the Realtime API could interact with assistants. That would enable really cool, tailored interactive scenarios for users.

2 Likes

Where is the code that Sean DuBois showed for the WebRTC client?

1 Like

It’s something we care about! Giving the model more context and examples is a great way to get smarter responses. Nothing to announce just yet, but stay tuned in 2025!

4 Likes