Can I consider memory in chat completion API requests?

My case: I send a big chunk of data (>3k tokens) to the chat completions API as separate messages (up to 3k tokens each). Then I want to ask several separate questions about that content. Can I save some chat ID (or similar) so that the originally sent content is used as history for the further questions? At least within the 128k context window.

I saw that the chat completion object (the returned value) contains an “id”, but I haven’t found a way to pass it as a chat_id to the next call.

The answer is yes: you have to send the previous chat along with the latest question so it can be answered with the context of what was recently discussed. The API model is stateless, without any memory. This input context is normally a chatbot history that you manage within your budget and the AI model’s capabilities; it sounds, though, like you have a task to perform over and over rather than a chat with a user.

You can consider all of the input that you provide to the AI in messages (and its responses, real or simulated) as one big input context that enables answering the final user input.
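A minimal sketch of what that client-side “memory” looks like in practice. The message shapes follow the chat completions format; the actual API call is omitted, and the placeholder strings are hypothetical — only the list management matters:

```python
# Client-side chat "memory" for a stateless API: the full history is
# re-sent with every request, and replies are appended so later
# questions can refer back to them.

def append_turn(history, role, content):
    """Add one message to the running conversation history."""
    history.append({"role": role, "content": content})
    return history

def build_request(history, new_question):
    """Every request carries all prior context plus the new question."""
    return history + [{"role": "user", "content": new_question}]

# Simulated session: the big document goes in once, split across messages.
history = []
append_turn(history, "user", "<3k-token document chunk, part 1>")
append_turn(history, "user", "<3k-token document chunk, part 2>")

request_1 = build_request(history, "Question 1 about the document?")
# ... call the API with request_1, then store both the question and the
# model's reply so they become part of the context for the next question:
append_turn(history, "user", "Question 1 about the document?")
append_turn(history, "assistant", "<model's answer to question 1>")

request_2 = build_request(history, "Question 2 about the document?")
# request_2 now contains the document, question 1, and its answer.
```

There is no ID to pass; the only “chat state” is this list, which you own and re-transmit each time.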

Seen in that light, it may be better to distill what you want done into one large block of instructions in a system message, unless you are specifically using the user and assistant messages to train the AI how to respond for a particularly unteachable job (multi-shot prompting).

The Assistants API, in contrast to chat completions where you have full control, maintains a server-side “thread” that stores the conversation and sends the calls to the AI model for you; however, you cannot branch from the same chat point again and again. You would have to reproduce new threads with all the turns anyway, in a somewhat convoluted way, or delete the latest messages and responses so that you can replace them.
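The “reproducing new threads” workaround amounts to keeping your own copy of the turns and rebuilding the conversation up to the point you want to branch from. A local sketch of that bookkeeping, with no Assistants API calls shown:

```python
# Since a server-side thread can't be rewound, keep your own record of
# the turns and rebuild a fresh message list up to a chosen branch point.

def rebuild_upto(turns, keep):
    """Return a copy of the first `keep` turns, ready to be replayed
    into a fresh thread (or a plain chat completions call)."""
    return [dict(t) for t in turns[:keep]]

turns = [
    {"role": "user", "content": "<document>"},
    {"role": "user", "content": "Question 1"},
    {"role": "assistant", "content": "Answer 1"},
    {"role": "user", "content": "Question 2"},
    {"role": "assistant", "content": "Answer 2"},
]

# Branch again from just after Answer 1, discarding question 2 onward:
branch = rebuild_upto(turns, 3)
```

Copies are shallow dicts, so editing a branch does not corrupt the master record of turns.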
