Previously, when using the Chat Completions API to support multi-turn conversations, we would maintain the conversation state in our app and replay the ‘assistant’ messages.
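Concretely, the pattern we use today looks roughly like this (a simplified sketch; the model name and the in-memory history stand in for our real setup):

```python
# Simplified sketch of our current multi-turn pattern with the
# Chat Completions API: keep every user/assistant turn in our own
# store and resend the full list with each request.
from openai import OpenAI

client = OpenAI()
history = []  # persisted by the app between turns

def ask(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    response = client.chat.completions.create(model="gpt-4o", messages=history)
    answer = response.choices[0].message.content
    # Replay this assistant message on the next turn.
    history.append({"role": "assistant", "content": answer})
    return answer
```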
With o1-preview, given that we don’t get access to the reasoning tokens, what is the best practice for maintaining conversation state? Do we replay just the assistant messages without the reasoning tokens? Would this reduce the accuracy of o1-preview, or lead to unnecessary tokens being generated in the second turn of the conversation, as o1-preview might have to ‘rethink’ the problem again?
You just have to accept that you can only pass back the final AI response. Unlike ChatGPT today, the stateless Chat Completions model will never see past reasoning context until you are also able to view it, or until you can chain past completion IDs together as context, neither of which seems likely.
The o1 models introduce reasoning tokens. The models use these reasoning tokens to “think”, breaking down their understanding of the prompt and considering multiple approaches to generating a response. After generating reasoning tokens, the model produces an answer as visible completion tokens, and discards the reasoning tokens from its context.
Here is an example of a multi-step conversation between a user and an assistant. Input and output tokens from each step are carried over, while reasoning tokens are discarded.
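A minimal sketch of that flow, assuming the openai Python SDK and o1-preview (the `completion_tokens_details` usage field is based on recent SDK versions and should be treated as an assumption):

```python
# Sketch of a two-step conversation with a reasoning model. Only the
# visible assistant text is carried forward in `messages`; the reasoning
# tokens are billed (see usage) but never re-enter the context.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": "How many primes are there below 50?"}]

first = client.chat.completions.create(model="o1-preview", messages=messages)
details = first.usage.completion_tokens_details  # assumed present on recent SDKs
if details is not None:
    print("reasoning tokens (billed, then discarded):", details.reasoning_tokens)

# Step 2: carry over only the visible answer, then ask a follow-up.
messages.append({"role": "assistant", "content": first.choices[0].message.content})
messages.append({"role": "user", "content": "List them."})
second = client.chat.completions.create(model="o1-preview", messages=messages)
print(second.choices[0].message.content)
```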
I don’t see a difference in how multi-turn conversations would work between o1-preview and gpt-4o. For example, I save all the user and assistant messages in a database and pass them all along in new requests. I wouldn’t really want to store or pass any reasoning tokens anyway, even if we had access to them, but regardless, they are not needed for multi-turn conversations.
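A rough sketch of what I mean, with an illustrative SQLite schema (the table layout and model name are placeholders, not a recommendation):

```python
# Persist every user/assistant turn, then replay the whole list on the
# next request. The exact same code path works for gpt-4o and o1-preview;
# no reasoning tokens are stored because none are returned.
import sqlite3
from openai import OpenAI

client = OpenAI()
db = sqlite3.connect("chat.db")
db.execute("CREATE TABLE IF NOT EXISTS messages (conversation_id TEXT, role TEXT, content TEXT)")

def send(conversation_id: str, user_text: str, model: str = "o1-preview") -> str:
    db.execute("INSERT INTO messages VALUES (?, ?, ?)", (conversation_id, "user", user_text))
    rows = db.execute(
        "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY rowid",
        (conversation_id,),
    ).fetchall()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": role, "content": content} for role, content in rows],
    )
    answer = response.choices[0].message.content
    db.execute("INSERT INTO messages VALUES (?, ?, ?)", (conversation_id, "assistant", answer))
    db.commit()
    return answer
```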