Not sure why, but the chatbot consistently hangs 32 to 40 minutes into the conversation. When it hangs, it falls into a loop where it repeatedly delivers the previous response, and the only way I can end the loop is by rebooting.
Regarding token length, I have experimented with topic, temperature, and token length (cutting it down to next to nothing). That said, check out these usage stats. Despite keeping my prompts to a minimum, the prompt token counts seem excessive. If 1,000 tokens is about 750 words, and a 34-word paragraph is roughly 35 tokens, there is no way I submitted prompts of 18,000+, 23K, and 10K tokens. It would be impossible to even carry a conversation that way.
5:25 AM
gpt-3.5-turbo-0301, 7 requests
18,657 prompt + 290 completion = 18,947 tokens
5:30 AM
gpt-3.5-turbo-0301, 7 requests
23,058 prompt + 327 completion = 23,385 tokens
5:35 AM
gpt-3.5-turbo-0301, 3 requests
10,964 prompt + 101 completion = 11,065 tokens.
As a test, I just sent the message “Hi Chatty, this is a test prompt.” The response was “Hello there! How can I assist you today?”
gpt-3.5-turbo-0301, 1 request
200 prompt + 10 completion = 210 tokens
This seems normal – what could have been happening?
Notice it says 7 requests here… it seems like the big ones are where you append old messages, maybe? What’s your cut-off point for that?
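That would explain the numbers: if every request resends the entire conversation so far, prompt tokens grow roughly quadratically with the number of turns. Here is a rough sketch of that effect (the per-turn token count is purely illustrative, not taken from the actual plugin):

```python
# Sketch: cumulative prompt tokens when every request resends the full history.
# Numbers are illustrative; real counts would also include the assistant's
# replies and any system prompt appended to the history.

def total_prompt_tokens(turn_tokens):
    """Each request's prompt contains all prior turns plus the new one."""
    total = 0
    history = 0
    for t in turn_tokens:
        history += t      # the new turn joins the history
        total += history  # the whole history is sent as the prompt
    return total

# Seven modest turns of ~200 tokens each already add up fast:
print(total_prompt_tokens([200] * 7))  # 200 + 400 + ... + 1400 = 5600
```

With larger turns (or assistant replies included in the history), hitting 18K–23K prompt tokens across 7 requests is entirely plausible.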
This is a plugin that I purchased from the Unreal Engine Marketplace. I suspected it appended old messages, which gives the chatbot the semblance of memory.
I do not know what the cut-off is for this. While capping the chat history might solve my problem when I hit the rate-limit ceiling (assuming that is what causes the glitch), it would have deleterious effects on the chat experience.
But you’re right–that cut-off is likely in the plugin’s source code somewhere. I didn’t see an adjustable cut-off variable in the dashboard. That said, once I locate it, I wonder what the sweet spot would be.
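One common approach is a sliding window: keep only the most recent messages that fit under a token budget, so the bot retains short-term memory without the prompt growing unboundedly. A minimal sketch, assuming a simple word-based token estimate (a real implementation would use an actual tokenizer such as tiktoken, and the function name here is hypothetical):

```python
# Sketch of a sliding-window cut-off: keep only the newest messages that
# fit under a token budget. The ~1 token per 0.75 words estimate follows
# the rule of thumb discussed above; it is only an approximation.

def trim_history(messages, max_tokens=2000):
    """Drop the oldest (role, text) messages until the estimate fits."""
    def estimate(text):
        return int(len(text.split()) / 0.75)

    kept = []
    budget = max_tokens
    for role, text in reversed(messages):  # walk newest-first
        cost = estimate(text)
        if cost > budget:
            break                          # everything older is dropped
        kept.append((role, text))
        budget -= cost
    return list(reversed(kept))            # restore chronological order
```

The "sweet spot" is then a trade-off: a larger `max_tokens` gives the bot longer memory but higher cost per request; a smaller one keeps requests cheap and fast but makes the bot forgetful sooner.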