How does multi-turn conversation work in the o1-preview Chat Completion API?

sps · September 21, 2024, 5:59am

Per the docs, reasoning tokens are discarded:

How reasoning works

The o1 models introduce reasoning tokens. The models use these reasoning tokens to “think”, breaking down their understanding of the prompt and considering multiple approaches to generating a response. After generating reasoning tokens, the model produces an answer as visible completion tokens, and discards the reasoning tokens from its context.

Here is an example of a multi-step conversation between a user and an assistant. Input and output tokens from each step are carried over, while reasoning tokens are discarded.

The image visually explains how the input, reasoning, and output processes work across multiple turns within a set context window of 128k tokens, highlighting that the reasoning is discarded with only input and output retained from the previous turn. (Captioned by AI)1700×1234 54.1 KB

Topic		Replies	Views
Efficient stateful completion chatbot API	10	5346	July 9, 2024
Chat completion or completion endpoint for multi turns? API chatgpt	1	2671	January 23, 2024
Getting ChatGPT to Remember Previous Chat Messages Prompting	37	70170	January 29, 2024
Multi-turn conversation using the API (Like the web version of chatgpt) API	5	4934	December 17, 2023
Build your own AI assistant in 10 lines of code - Python Documentation gpt-4 , gpt-35-turbo , chat-completion , python , tutorial	13	67006	December 12, 2023

How does multi-turn conversation work in the o1-preview Chat Completion API?

How reasoning works

Related topics