Hello, I am looking to implement a program that allows conversing with a character who is played by the ChatGPT API.
I am wondering: is it possible to reference a past text completion request (I see each request has an id) so that I can reply to ChatGPT's response and get another reply back with all the context of the previous request? Or is the suggested way to just send the conversation history with each new request to keep the conversation going?
If the second option is the case, what's the input limit for a prompt? That approach seems wasteful of tokens, but if it's the only way, it can work.
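For concreteness, here is roughly what I picture option two looking like - a sketch only, assuming the openai Python package, the text-davinci-003 completion model, and a prompt framing of my own invention:

```python
# Sketch of the "resend the history" approach with the legacy
# openai Python package (pre-1.0). The model name, character
# framing, and stop sequence are illustrative assumptions.
import openai

openai.api_key = "sk-..."  # your API key

history = "The following is a conversation with a friendly wizard.\n"

def chat(user_message: str) -> str:
    global history
    history += f"User: {user_message}\nWizard:"
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=history,   # the entire transcript goes out on every call
        max_tokens=200,
        stop=["User:"],   # stop before the model writes the user's next turn
    )
    reply = response["choices"][0]["text"].strip()
    history += f" {reply}\n"  # fold the reply back into the transcript
    return reply
```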
You might be interested in something that I did around Christmas time - somewhere around 20 minutes in, I explain how the whole thing works.
I’ve been toying with the idea of opening this up for others to make use of.
Thank you for responding; I finally had some time to look at this stuff again. It's actually about 35 minutes in where you show the prompt design. Very helpful. It seems each request is separate, and each response is given back to the AI in the next prompt, with some summarizing also done by ChatGPT.
This is the solution I considered, hoping to find something more cost-efficient, but it seems there is no way to reference previous prompts.
My concern is the cost of repeating the same text in every prompt, but it seems to be the only option.
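To put a rough number on that repetition cost, something like this should work - the tokenizer comes from the tiktoken package, and the per-1K-token price here is an assumption, so check OpenAI's pricing page for the real figure:

```python
# Rough cost estimate for resending the whole transcript each turn.
import tiktoken

enc = tiktoken.encoding_for_model("text-davinci-003")

def prompt_cost(transcript: str, price_per_1k_tokens: float = 0.02) -> float:
    """Approximate dollar cost of sending `transcript` as a prompt."""
    n_tokens = len(enc.encode(transcript))
    return n_tokens / 1000 * price_per_1k_tokens
```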
That would seem to suggest that the repetitive replies are a way of making extra money - not too ethical, if that is the case.
Technically, none of this is done through ChatGPT - the API for that isn't available yet; this uses the text completion endpoints.
They haven't specified whether the ChatGPT API endpoint will be able to maintain a context or not, but if you ask ChatGPT itself how it works and how it maintains its ongoing conversation context, it will lead you down this same path.
There is definitely a size limit problem, as you won't be able to continue the conversation indefinitely. At least not in the sense of sending every exact prompt back through the system again - eventually you'll run out of room.
Which means at some point you have to bring in some sort of summarization step to condense the information and shrink the size of the text.
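As a sketch, that condensing step could look something like the following - the token budget, the split point, and the summarization prompt are all my own assumptions, not anything OpenAI prescribes:

```python
# When the transcript nears the model's context window, ask the
# same completion model to compress the older half and keep only
# the summary plus the recent turns. Threshold values are guesses.
import openai
import tiktoken

enc = tiktoken.encoding_for_model("text-davinci-003")
TOKEN_BUDGET = 3000  # leave headroom under the ~4K-token window

def maybe_summarize(history: str) -> str:
    if len(enc.encode(history)) < TOKEN_BUDGET:
        return history  # still fits, nothing to do
    # Split roughly in half: summarize the old part, keep the recent part.
    mid = len(history) // 2
    head, tail = history[:mid], history[mid:]
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=f"Summarize this conversation, keeping names and key facts:\n\n{head}\n\nSummary:",
        max_tokens=300,
    )
    summary = response["choices"][0]["text"].strip()
    return f"Summary of earlier conversation: {summary}\n{tail}"
```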
It'd be really interesting to see system architecture designs from people who have done things like this before, and how they've managed a system like this.
Yes, and as the conversation grows longer, the GPT hallucinations increase because the prior context gets lost or truncated.
ChatGPT and the OpenAI API public beta releases have a lot of limitations, and I guess paying OpenAI customers will be able to increase the token limitation (critical for chatting) at some point in time?
The current offering has been described by OpenAI (paraphrasing) as a "marketing, research beta," so it's very early in the production life cycle.