Is GPT3.5 API designed to generate responses to fit within the remaining token space?

I can’t get gpt3.5-turbo api to generate long articles as in the ChatGPT interface ( When I input the same instructions as I do in the API, the website produces long responses that get cut off due to the token limit. I then prompt it with “continue” to generate the rest of the article. I have tried this multiple times and consistently receive long articles that require me to use “continue.”. This is exactly what I want, and I want to replicate this behavior when making API calls to gpt-3.5-turbo. However, with the API calls, it seems that gpt-3.5-turbo attempts to condense the entire article into the remaining space, resulting in a loss of information as it tries to summarize the content to fit the constraints. I have encountered this issue numerous times, as the model always tries to fit the entire article within the available space.

How can I replicate the long article generation behavior I experience on the ChatGPT website when using API calls to gpt-3.5-turbo, so that I can generate extended articles without losing information due to summarization?