Clarifying Thread Persistence

Hi. Question about Threads. I’ve got Assistant creation up and running via the API, and I’m curious if there is any persistence in a Thread. I can create one, get and store its ID for User’s continued use at a later date.

Will that Thread maintain a list of messages exchanged between Assistant and User? If so, for how long? Indefinitely?

Or will I need to initialize it with a record of exchanges I maintain, and then continue?

One more thing that is not clear to me: currently, I send all prior prompts and completions with each new prompt in order to maintain the chat context. Is this no longer required when using an Assistant via a Thread? Does the Thread keep that list for context, so that I only need to send the latest User message/prompt rather than the entire conversation to that point?

Hopefully, I’ve described the questions well enough. Thanks for any advice on this. I can experiment to determine the answers, I suppose, but perhaps someone here can answer directly. Thanks!

Ron

Yes. It’s important to store the thread ID. For whatever reason there’s no endpoint to get the list of existing threads.

Yes, the Thread keeps the list of messages. How long it keeps them is not known.

For some reason only the “user” role can be manually added to messages. So you cannot modify or initialize a thread with an existing conversation. You could try and inject a summary into the system prompt. Feels hacky though.

Yup. Resending the earlier conversation is not required; you only send the latest message.
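
Since there is no endpoint to list existing threads, the thread ID has to be saved on your side. Here is a minimal sketch of one way to do that, assuming a small JSON file is good enough for storage. `ThreadIdStore`, `get_or_create_thread_id`, and the `create_thread` callback are hypothetical names made up for this example, not part of any OpenAI library.

```python
import json
from pathlib import Path


class ThreadIdStore:
    """Maps an application user ID to a saved thread ID (hypothetical helper)."""

    def __init__(self, path):
        self.path = Path(path)

    def _load(self):
        # An absent file simply means nothing has been saved yet.
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def get(self, user_id):
        return self._load().get(user_id)

    def save(self, user_id, thread_id):
        data = self._load()
        data[user_id] = thread_id
        self.path.write_text(json.dumps(data, indent=2), encoding="utf-8")


def get_or_create_thread_id(store, user_id, create_thread):
    """Reuse the stored thread ID, or call create_thread() once and persist it."""
    thread_id = store.get(user_id)
    if thread_id is None:
        thread_id = create_thread()
        store.save(user_id, thread_id)
    return thread_id
```

With the official `openai` Python package, `create_thread` could be something like `lambda: client.beta.threads.create().id`. After that, each new user message is appended to the same thread and a run is started, with no need to resend earlier messages.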

Thanks, fellow Ronald! Some experimentation still needed, but these answers will cut me closer to the chase.

The Thread lifespan remains key (at least for my application, which requires persistence of context and exchanges over time). If the Thread maintains state, then at least it looks like I will not need to initialize it with an existing conversation before continuing with a new message.

Not recapitulating the entire conversation with each new message should result in a solid optimization both for speed and expense.

Thanks again.

Ron

The documentation states: "Assistants can access persistent Threads. Threads simplify AI application development by storing message history and truncating it when the conversation gets too long for the model’s context length. You create a Thread once, and simply append Messages to it as your users reply."

Just to confirm: you still pay for the whole conversation as if you were sending the full thing every time. The main benefit (which kind of isn’t) is that the conversation is automatically truncated for you, and I guess you don’t necessarily need to maintain the full conversation yourself.
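
To make the billing point concrete, here is a rough arithmetic sketch. It assumes no truncation kicks in and ignores instructions, tool calls, and per-message overhead; the token counts are made-up numbers and `billed_input_tokens` is a hypothetical helper.

```python
def billed_input_tokens(turns):
    """Return input tokens per run when the whole history is sent on every run.

    `turns` is a list of (user_tokens, assistant_tokens) pairs. Assumes no
    truncation and ignores instructions, tool calls, and message overhead.
    """
    per_run = []
    history = 0  # tokens already stored in the thread
    for user_tokens, assistant_tokens in turns:
        prompt = history + user_tokens  # prior messages plus the new user message
        per_run.append(prompt)
        history = prompt + assistant_tokens  # the reply joins the history
    return per_run


runs = billed_input_tokens([(50, 150), (50, 150), (50, 150)])
print(runs, sum(runs))  # [50, 250, 450] 750
```

So in this toy case the user only wrote 150 tokens of new messages, but 750 input tokens were processed across the three runs, because each run re-reads everything before it.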

Agreed. I’m surprised there isn’t any mention of thread lifespans. It could be that they are still gathering information and haven’t made a decision yet.

For another Ronald, no problem :laughing:

Great, thanks! Missed that key bit of information.

cc: @anon10827405

An old thread, but I have the same questions. Is there documentation to support this claim that the entire conversation is tokenized and charged each time a new message is appended to an existing conversation?

If the conversation grows too large, larger than the context window of the model being used, then obviously the entire conversation history cannot be passed to the model.

The Assistants API automatically manages the truncation to ensure the input stays within the model’s maximum context length. So the most you can be billed for a single run is roughly the maximum input the model can accept, multiplied by the number of tool iterations that might be done with that growing input context before you get a response.

truncation_strategy with type last_messages (an integer count) is a run API parameter that can limit the initial input to a particular number of recent messages. You don’t get a token-count control to discard old messages, yet OpenAI obviously has one internally so the model doesn’t get overloaded.

The only other control is to abort the run and get nothing for the amount that was billed if the model goes over some limit of input or output.
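
For reference, a sketch of how the truncation_strategy parameter mentioned above might be passed when creating a run. The field names (`type`, `last_messages`) are as I recall them from the API reference, so check them against the current docs. `build_run_params` and `keep_last_messages` are hypothetical helpers; the second one only illustrates locally what "keep the last n messages" means and is not how OpenAI implements it.

```python
def build_run_params(assistant_id, last_messages=None):
    """Build keyword arguments for creating a run, optionally limiting history."""
    params = {"assistant_id": assistant_id}
    if last_messages is not None:
        if last_messages < 1:
            raise ValueError("last_messages must be at least 1")
        # Assumed shape of the run-level truncation setting.
        params["truncation_strategy"] = {
            "type": "last_messages",
            "last_messages": last_messages,
        }
    return params


def keep_last_messages(messages, n):
    """Local illustration only: the n most recent messages, oldest first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return messages[-n:]
```

With the `openai` Python package this would be used roughly as `client.beta.threads.runs.create(thread_id=thread_id, **build_run_params(assistant_id, last_messages=10))`, where `thread_id` and `assistant_id` are your own stored IDs.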

Short answer: yes.
The full answer depends on your truncation strategy.
Since you asked for the docs, here is the link:

https://platform.openai.com/docs/assistants/deep-dive#context-window-management
