Introducing the Responses API

The responses API doesn’t have threads, a single list of messages and internal calls that you can add to and run.

Instead, for server-side chat history state, it stores the request ID and its completion contents with every API call you make.

There is also an API parameter for “instructions” if you want the first system message to be separately changeable instead of being a part of the past messages sent.

To use this, (in the manner that many novices would expect API calls to somehow “remember” exactly who they are), you take the most recent response’s ID and send it back as the parameter for previous_response_id. When you do that, the past chat and responses you specify is reused, and you only need the newest user role question as a message in input.

There is still no cost management: the chat length will grow to the model maximum, where it will return an error, or you can set truncation:auto to start deleting old messages at the model’s maximum input.

2 Likes