I'm currently using the API to generate some very long responses, sometimes longer than the 2k max-token limit (gpt-35-turbo). When the response hits that limit, it just stops. Is there any way to make it continue from where it left off and get the full response (like the chat UI, where you press Continue)?
I'm using Python to call the API.
+1, that would be cool. Though given that the newest models have 32x the token limit of the one above, I doubt they're going to prioritize it.
I imagine the ChatGPT UI's Continue button is just a prompt that says something along the lines of "The previous output was cut off. Please continue where you left off on the last output based on our previous messages."
You could test a follow-up prompt along those lines and see if it works.
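A minimal sketch of that idea: rebuild the message list with the truncated reply as an assistant turn, followed by a "continue" instruction. The helper name and the prompt wording here are my own assumptions, not the actual ChatGPT prompt.

```python
# Build a continuation request: re-send the conversation so far,
# the partial assistant reply, and a "please continue" instruction.
def build_continue_messages(history, partial_reply):
    """Return the message list for a follow-up 'continue' call."""
    return history + [
        {"role": "assistant", "content": partial_reply},
        {
            "role": "user",
            "content": (
                "The previous output was cut off. Continue exactly "
                "where you left off; do not repeat anything."
            ),
        },
    ]

# Usage with the openai v1 client (untested sketch):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-3.5-turbo",
#     messages=build_continue_messages(history, first_chunk),
# )
```

You then concatenate the new completion onto the first chunk yourself.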
The API is stateless and memoryless. When you hit “continue” in ChatGPT, the whole context is re-sent to the model.
To continue in the API, you’ll need to do the same thing.
Well, I am way late to this thread, but the response JSON has a finish_reason field in it. What I do is check whether the reason is length, and if so, I send the model the last few hundred words and tell it to continue from there.
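That check can be wrapped in a loop. Here's a sketch: `request_fn` is a stand-in you supply for one API call (with the openai v1 client it could return `(choice.message.content, choice.finish_reason)`); `TAIL_WORDS`, the continue prompt, and `max_rounds` are illustrative assumptions.

```python
TAIL_WORDS = 300  # roughly the "last few hundred words" to re-send

def get_full_response(messages, request_fn, max_rounds=5):
    """Keep requesting until finish_reason != 'length' (or max_rounds).

    request_fn takes a message list and returns (text, finish_reason)
    from a single chat-completion call.
    """
    full = ""
    msgs = list(messages)
    for _ in range(max_rounds):
        text, reason = request_fn(msgs)
        full += text
        if reason != "length":
            break  # model finished on its own
        # Re-send the original prompt plus the tail of what we have so far,
        # and ask the model to pick up from there.
        tail = " ".join(full.split()[-TAIL_WORDS:])
        msgs = list(messages) + [
            {"role": "assistant", "content": tail},
            {
                "role": "user",
                "content": "Continue from exactly where the text above stops.",
            },
        ]
    return full
```

Sending only the tail keeps the context small, at the risk of the model losing track of earlier parts of a very long answer; re-sending everything is safer if you have the token budget.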