Introducing Predicted Outputs

Dramatically decrease latency for gpt-4o and gpt-4o-mini by providing a reference string.

Speed up:

  • Updating a blog post in a document
  • Iterating on previous model responses
  • Rewriting code in an existing file, like with Exponent in this video, which saw a ~3X speed-up

Get started with our docs.

18 Likes