I am having trouble generating long articles with gpt-3.5-turbo via the API. When I give the same instructions in the ChatGPT interface (https://chat.openai.com/), the website produces a long response that gets cut off at the token limit; I then prompt it with “continue” and it generates the rest of the article. I have tried this multiple times and consistently get long articles that require “continue”. This is exactly what I want, and I want to replicate this behavior when making API calls to gpt-3.5-turbo. Over the API, however, gpt-3.5-turbo seems to condense the entire article into the remaining space, summarizing the content to fit the constraints and losing information in the process. This happens every time I try: the model always attempts to fit the whole article within the available space.
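For concreteness, here is a minimal sketch of the “continue” loop I am trying to reproduce programmatically. The actual API call is abstracted away: `chat` stands in for any wrapper around the chat completions endpoint that returns the generated text and its `finish_reason`, and all the names here are my own, not part of the official openai library.

```python
# Sketch of the manual "continue" workflow from the ChatGPT website,
# expressed as a loop. `chat` is any callable that takes a message
# list and returns (text, finish_reason); in practice it would wrap
# the gpt-3.5-turbo chat completions endpoint.
def generate_long_article(chat, prompt, max_rounds=10):
    messages = [{"role": "user", "content": prompt}]
    parts = []
    for _ in range(max_rounds):
        text, finish_reason = chat(messages)
        parts.append(text)
        if finish_reason != "length":  # model finished on its own
            break
        # Feed the partial answer back and ask for more, mirroring
        # typing "continue" in the ChatGPT interface.
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "continue"})
    return "".join(parts)
```

The loop keeps going as long as `finish_reason` is `"length"` (output truncated by the token limit) and stops once the model ends on its own, but even with this structure the first response already reads like a compressed summary rather than the opening of a long article.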
How can I replicate the long article generation behavior I experience on the ChatGPT website when using API calls to gpt-3.5-turbo, so that I can generate extended articles without losing information due to summarization?