Text Generation Persistence

I’m trying to maintain persistence when making calls to the text generation endpoint.

Similar to chat.openai.com, where a user can open multiple chat threads and each thread maintains its own history and behavior. How is this achieved?

I guess what I’d expect is some kind of unique ID that I’d pass to the text generation create endpoint so that it maintains persistence on its end, but after a quick scan of the docs, that doesn’t seem to exist.

I get the feeling I’m missing something obvious here, so if anyone can throw me a bone, I’d appreciate it.

The only persistence it has at this time is the context window… so, somewhere between roughly 2k and just over 8k tokens, depending on the particular model you’re using.

So, for persistence (or the illusion of it), you’ll need to include the entire conversation in the prompt each time you send it back.
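Something like this rough sketch, assuming the pre-1.0 `openai` Python package and `text-davinci-003` (swap in whichever model you’re actually using); the `history` list and the prompt layout are just illustrative:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

history = []  # one (speaker, text) entry per turn in this thread

def ask(user_message: str) -> str:
    history.append(("User", user_message))

    # The API keeps no state between calls, so rebuild the whole
    # conversation into a single prompt every time.
    prompt = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    prompt += "\nAssistant:"

    response = openai.Completion.create(
        model="text-davinci-003",  # illustrative model choice
        prompt=prompt,
        max_tokens=256,
        stop=["User:"],  # keep the model from writing the user's next turn
    )
    answer = response.choices[0].text.strip()
    history.append(("Assistant", answer))
    return answer
```

Each chat thread would just get its own `history` list (or its own log file), which is how you get the multi-thread behavior you see on chat.openai.com.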

There are quite a few examples in various languages on GitHub.

Let us know if you run into any problems.


So is this what occurs on the backend of chat.openai.com? Each time I enter a prompt, it’s sending the entire chat history?

ChatGPT is a bit of a black box (we don’t know exactly how it works), but that’s likely at least some of what they’re doing. There’s a chance ChatGPT also has a larger context window… or something else we don’t know about.

But yeah, with the current API, this is the way…


Be warned.

I implemented saving the chat log into a text file.

Then I send it as part of the prompt.
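Roughly like this (a simplified sketch of my setup, not the exact code; the file name and prompt layout are just illustrative, and it assumes the pre-1.0 `openai` Python package):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

LOG_FILE = "chat_log.txt"  # illustrative file name

def ask(user_message: str) -> str:
    # Append the new user turn to the log on disk.
    with open(LOG_FILE, "a", encoding="utf-8") as f:
        f.write(f"User: {user_message}\n")

    # Send the entire log back as the prompt on every call.
    with open(LOG_FILE, "r", encoding="utf-8") as f:
        prompt = f.read() + "Assistant:"

    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=280,
        stop=["User:"],
    )
    answer = response.choices[0].text.strip()

    # Log the assistant's reply too, so the next call sees it.
    with open(LOG_FILE, "a", encoding="utf-8") as f:
        f.write(f"Assistant: {answer}\n")
    return answer
```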

It does work, but it gets very expensive very quickly.

I deployed a chatbot to a website, and a single user (he was playing with it at 3 am, so I wasn’t aware) used up all my free OpenAI credit in a matter of hours.

I had already run a lot of queries through unit tests before that, so I was overconfident.

I didn’t realize the chat log had grown too large, so subsequent prompts grew too large as well.

When I reached the limit, I got this exception, which I now use to know when to move older chat log entries to a separate file.

Error: This model's maximum context length is 4097 tokens, however you requested 4809 tokens (4529 in your prompt; 280 for the completion). Please reduce your prompt; or completion length.
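In case it helps, here’s roughly how I handle it now (a sketch, not my exact code; `chat_log.txt` and `chat_archive.txt` are just illustrative names, and it assumes the pre-1.0 `openai` Python package, where this surfaces as `openai.error.InvalidRequestError`):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

LOG_FILE = "chat_log.txt"         # illustrative names
ARCHIVE_FILE = "chat_archive.txt"

def trim_log(keep_last: int = 20) -> None:
    """Move all but the last `keep_last` lines of the log into the archive."""
    with open(LOG_FILE, "r", encoding="utf-8") as f:
        lines = f.readlines()
    old, recent = lines[:-keep_last], lines[-keep_last:]
    with open(ARCHIVE_FILE, "a", encoding="utf-8") as f:
        f.writelines(old)
    with open(LOG_FILE, "w", encoding="utf-8") as f:
        f.writelines(recent)

def complete(prompt: str) -> str:
    try:
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,
            max_tokens=280,
        )
    except openai.error.InvalidRequestError as e:
        if "maximum context length" in str(e):
            # Prompt got too long: archive the older turns and retry once
            # with whatever is left in the log.
            trim_log()
            with open(LOG_FILE, "r", encoding="utf-8") as f:
                prompt = f.read() + "Assistant:"
            response = openai.Completion.create(
                model="text-davinci-003",
                prompt=prompt,
                max_tokens=280,
            )
        else:
            raise
    return response.choices[0].text.strip()
```

A cleaner option would be to count tokens before the call and trim proactively instead of reacting to the exception, but the reactive version above matches what I described.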
