Why does the chat completions endpoint not use a session ID?

Hello! I am a developer. I have written a WordPress plugin that allows the user to send their blog post to the text completions endpoint and ask questions about it. For example, “suggest a title for this blog post”.

I’d like to pivot this to begin using the chat completions endpoint so that the user could have an ongoing dialogue. For example:

  • suggest a title for this blog post.
  • no, make it shorter.
  • ok, make it rhyme too.

I’m looking at the docs here:

I was expecting that this endpoint would operate based off of a unique ID for that session, where OpenAI would “remember” the conversation, stored against that hypothetical ID for some time or memory limit.

Am I correct in saying it’s actually nothing like that? Actually you pass it an array of messages? So, in practice, an API client would do something like…

  1. Submit a user message to the API, and store it in an empty array.
  2. Get the response from the API, and add it to the array from step 1.
  3. Add the next user request to the array, and submit the whole array to the API.

… and so on.

Is that an accurate summary?

It seems like the array could grow very large for long conversations.


Yes, that’s accurate. You need to have your own database to store the messages.

Keep in mind there’s a token limit of 4k. So after that you will essentially be cutting off the initial conversation


To be accurate, you can use a number of methods to manage the token count; truncating from the beginning is but one strategy.



1 Like

I figured that I just kept missing the part in the documentation where that was explained, but I see now that it’s just not in here. I’m going to take a wild guess and say that this is part of regulating the widespread use of the API. Since having to create the work around on your own is going to take some time and thinking.

It’s not difficult to do. So far I have used the database approach and also tried reading Assistant back a dictionary containing the previous responses (while not going over the token limit).

Yes, the current OpenAI API chat method is very inefficient and forcing the client to continually resend the prior chat messages back to the server boggles the mind.

Hopefully, when OpenAI solves their growing pains and scaling issues then they will update the API with session management and end this practice of forcing clients to resend (and pay for) the same chat messages repeatedly.