Does threads only truncate previous messages, or does it chunk the chat history when there are too many?

(I work on the Assistants API)

Currently, it only truncates the thread when the chosen model’s context window is maxed out.

Still have some questions:

  1. Does "truncate" mean simply deleting the overflowed previous messages, or summarizing them?
  2. Whichever approach the truncation strategy adopts, can developers still retrieve the truncated (overflowed) messages through the Assistants API?

Thanks!
Xuefeng.

Hey, I have a question for you! The Threads section disappeared from the playground area for assistants. I had a great long conversation with a new assistant earlier that had a ton of info I wanted, and I was looking forward to coming back and continuing the work with it. When I came back, I couldn't find any way to access that last conversation; it disappeared when I closed my tab, and I had no idea it would. Can I not access that conversation anywhere?

Hi, had a couple of questions:

Is it just a straight truncation or is there any condensing/summarising being done? Could you describe how messages are handled when maxing out the context?

And, is it possible to rebuild the message history in the thread with summarised/condensed messages to get more out of the context window? I’m doing this with a personal chat client I made and it is very powerful.

Thanks for all the work! This round of updates is spectacular, as usual.

(1) We stop showing the model the older messages that do not fit in the context window.
(2) Yes, the previous messages are not deleted. They are just not shown to the model.

FYI, we updated our docs to answer some questions related to this: https://platform.openai.com/docs/assistants/how-it-works/context-window-management

We temporarily disabled access for all users, and are working on bringing this back.

Currently, it is just truncation. See our updated docs for the truncation strategy https://platform.openai.com/docs/assistants/how-it-works/context-window-management

Not in an automatic way. One way you can do this is to use the Chat Completions API to summarize, then add it as a message to the thread.
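A minimal sketch of that workflow, assuming the `openai` Python SDK v1.x (the `client` argument below is an `openai.OpenAI()` instance; the model name and prompt wording are illustrative, not prescribed by the API):

```python
def build_summary_prompt(messages):
    """Flatten a list of {role, content} dicts into a single
    summarization request for the Chat Completions API."""
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    return [
        {"role": "system",
         "content": "Summarize the following conversation, "
                    "keeping key facts and decisions."},
        {"role": "user", "content": transcript},
    ]

def summarize_into_thread(client, thread_id, old_messages, model="gpt-4-turbo"):
    # 1. Summarize the overflowed messages with the Chat Completions API.
    summary = client.chat.completions.create(
        model=model,
        messages=build_summary_prompt(old_messages),
    ).choices[0].message.content

    # 2. Append the summary back onto the thread as a user-role message,
    #    so it lands inside the model's context window on the next run.
    return client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=f"Summary of earlier conversation: {summary}",
    )
```

The thread itself is unchanged by truncation, so you only need to decide which older messages to fold into the summary before calling this.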

Thanks.

Not in an automatic way.

I apologise for not being more clear. I would like to be able to manually insert and remove messages from the thread. Is this possible?

Do you know what the roadmap and timeline for truncation strategies is?

Thank you.

See our updated docs for the truncation strategy

I don’t consider that documentation. It reads more like a note about things that may or may not happen with a few random vague technical details sprinkled in.

I’m curious why the chat history for the thread is just truncated instead of stored in a time-based vector database (e.g., timescaledb), which would allow for storage of longer chat histories and also biased retrieval toward more recent prompt-answer pairs in the thread.
One could also include LLM-generated “reflections” after each prompt-retrieval-answer step (e.g., “was my answer effective?”, or for a goal-oriented agent: “based on my last action(s), I predict that I am closer to/further away from my goal”) that could also be stored in the time-based vector database for later retrieval.
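A toy sketch of the recency-biased retrieval part of that idea. This is not how the Assistants API works; it assumes a hypothetical in-memory store of `(timestamp, embedding, text)` tuples, with cosine similarity discounted by an exponential time decay (a real setup would use timescaledb/pgvector and actual embeddings):

```python
import math
import time

def recency_biased_search(store, query_vec, now=None, half_life=3600.0, top_k=3):
    """Rank stored (timestamp, vector, text) entries by cosine similarity
    to query_vec, exponentially discounted by age so that among equally
    relevant entries, more recent ones rank first."""
    now = time.time() if now is None else now

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    def score(entry):
        ts, vec, _text = entry
        age = max(0.0, now - ts)
        decay = 0.5 ** (age / half_life)  # weight halves every half_life seconds
        return cosine(query_vec, vec) * decay

    return sorted(store, key=score, reverse=True)[:top_k]
```

Reflections would just be extra rows in the same store, so they compete for retrieval on the same relevance-times-recency score as ordinary turns.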