Thanks for the code, I’m using the same approach for this problem. Now I’m facing a new issue: the max_tokens limit on the ChatGPT API (I believe it’s only 4,097 tokens maximum). How can I keep this “memory” feature within that limit?
Thanks Dent. This is a nice little script for understanding prompt engineering. I think there is a slight issue with your code snippet: that last line should read conversation_history = handle_input(user_input, conversation_history, USERNAME, AI_NAME), otherwise it does not retain the history.
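To illustrate why the reassignment matters, here is a minimal sketch of such a loop (handle_input, USERNAME, and AI_NAME stand in for the script being discussed; the body of handle_input here is a placeholder, not the original code):

```python
USERNAME = "User"
AI_NAME = "AI"

def handle_input(user_input, conversation_history, username, ai_name):
    # Placeholder for the real function: it appends the new turn and
    # returns the UPDATED history. The caller must capture that return.
    reply = f"{ai_name}: (response to '{user_input}')"
    return conversation_history + [f"{username}: {user_input}", reply]

conversation_history = []
for user_input in ["hello", "how are you?"]:
    # The key fix: reassign the returned history. Without this,
    # every call starts from the same stale list and nothing is retained.
    conversation_history = handle_input(
        user_input, conversation_history, USERNAME, AI_NAME
    )

print(len(conversation_history))  # 4 entries: two user turns, two replies
```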
Also, it is not clear to me whether there is any benefit, for this application, in using the openai.ChatCompletion.create interface and building up the messages parameter with roles like system, assistant, and user.
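For reference, the chat interface takes a list of role-tagged messages, so you don’t have to invent your own “User:/AI:” framing in a single prompt string. A sketch of how the messages parameter is built up (the prompt text is illustrative, and the API call itself is shown commented out since it needs the openai package and a key):

```python
# Build the messages list the ChatCompletion interface expects.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
]
# Each prior exchange is appended as a user/assistant pair,
# then the newest user question goes on the end.
messages.append({"role": "user", "content": "What is the capital of France?"})
messages.append({"role": "assistant", "content": "Paris."})
messages.append({"role": "user", "content": "What about Italy?"})

# The actual call would look roughly like:
# response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
# reply = response["choices"][0]["message"]["content"]
print([m["role"] for m in messages])
```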
You didn’t say which “solution” you are referring to, and you didn’t press the “reply” button on the relevant post.
This topic is quite old, starting even before the chat API was released, and it has several diversions that do not answer “remember previous messages”.
Here’s a better recent thread to move to:
The conversation cannot grow indefinitely, because the AI model has a limited context window in which to supply it the past conversation, so management and truncation are required. The maximum compute cost would actually come from producing a very long output; input size has a smaller effect on processing cost.
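One common way to do that management is a rolling window: estimate token counts and drop the oldest turns until the history fits a budget. A minimal sketch, assuming a crude 4-characters-per-token heuristic (a real implementation would use an actual tokenizer such as tiktoken):

```python
def estimate_tokens(text):
    # Rough assumption: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def truncate_history(messages, budget):
    """Keep the most recent messages whose combined estimate fits the budget."""
    kept, used = [], 0
    for msg in reversed(messages):        # walk newest-first
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break                         # oldest turns fall off the window
        kept.append(msg)
        used += cost
    return list(reversed(kept))           # restore chronological order

history = [
    {"role": "user", "content": "a" * 400},       # ~100 tokens, oldest
    {"role": "assistant", "content": "b" * 400},  # ~100 tokens
    {"role": "user", "content": "c" * 40},        # ~10 tokens, newest
]
print(len(truncate_history(history, 120)))  # only the newest messages fit
```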
ChatGPT itself very aggressively minimizes past conversation turns, giving the AI only what is needed to understand the present topic, so that an otherwise ambiguous question like “what about the other one?” can still be answered.
The quoted text poorly describes what you’d do, the “poorly” starting with calling the API “ChatGPT”…
The ID returned with a response is just a unique identifier that OpenAI uses internally for each API call. It means nothing to you. If you log it along with your messages, then back in the olden days of actually having support, you might say “my user 35185 made request ID 385898634, but since I also sent the user name, and that call was first submitted to the moderations endpoint, which returned ID 385828531, my account shouldn’t be banned”.
Examine ChatGPT. You have a hierarchy:
Just have a database that records all of those. Then, for each new message the user sends, include the most recent messages of that chat that fit within the token budget you’ve allotted for chat-history context sent to the API model.
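A minimal sketch of that idea using SQLite (the schema and the tokens-per-character heuristic are my assumptions, not anything OpenAI provides):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    chat_id INTEGER,
    role TEXT,
    content TEXT
)""")

def add_message(chat_id, role, content):
    # Every turn of every chat is recorded, keyed by chat_id.
    db.execute("INSERT INTO messages (chat_id, role, content) VALUES (?, ?, ?)",
               (chat_id, role, content))

def recent_history(chat_id, token_budget):
    """Newest messages of one chat that fit a rough token budget."""
    rows = db.execute(
        "SELECT role, content FROM messages WHERE chat_id = ? ORDER BY id DESC",
        (chat_id,)).fetchall()
    picked, used = [], 0
    for role, content in rows:
        cost = max(1, len(content) // 4)   # crude ~4 chars/token estimate
        if used + cost > token_budget:
            break                          # older messages are dropped
        picked.append({"role": role, "content": content})
        used += cost
    return list(reversed(picked))          # chronological order for the API

add_message(1, "user", "x" * 400)
add_message(1, "assistant", "y" * 400)
add_message(1, "user", "newest question")
print(len(recent_history(1, 120)))
```

The list returned by recent_history is exactly what you would place in the messages parameter of the next API call, after your system message.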
So, is having your own database and sending the entire context (truncated to the token limit) with each new request the only way with the chat completion API? Is there no way to keep history management on OpenAI’s side other than Assistants API threads (OpenAI Platform)?
When the history is managed, you are not sending the maximum context length.
Regardless of whether a model is accessed by you or by assistants, you pay for the input tokens.
The difference is that the Assistants documentation PROMISES to use the full context with whatever it can get, either from an entire conversation stream or from any documentation up to the maximum, and then to iterate on function calls, producing multiple bills at that context size.