That seems as if you would be better using chat completions method for communicating with AI.
The code is more simple, you can get immediate words to read, and you also have control of the length of the old conversation that is sent each time.
Here is a link to small example code for one user with Python. You will be able to see how fast the AI can produce language. The amount the conversation history can grow is limited by number of turns.