Most of the time, when the response is long (more than 15-20 lines), I don't get the full response in the application. I have to ask the AI to send me the missing part, which sometimes still isn't fully received. This could be improved.
Why would you need such a long response? Giving more context about your app might be useful. I've found that setting `max_tokens` doesn't guarantee that your prompt plus response will add up to 4k tokens. I've also noticed that, since prompt + response is frequently under 4k tokens, the proportion is almost always inverted: if the prompt is 2/3 of the total text, the response will probably be the remaining 1/3, and vice versa; if the response was 2/3 of the text, the prompt was probably the remaining 1/3 of the 3k-4k total tokens.
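The budget described above can be sketched as simple arithmetic: prompt and completion share one context window (roughly 4k tokens for the models being discussed), so a longer prompt leaves fewer tokens for the reply. This is a minimal illustration with made-up numbers, not anything from the API itself:

```python
CONTEXT_WINDOW = 4096  # shared budget for prompt + completion (illustrative)

def max_reply_tokens(prompt_tokens: int, context: int = CONTEXT_WINDOW) -> int:
    """Tokens left for the model's reply once the prompt is counted."""
    return max(context - prompt_tokens, 0)

# A prompt using ~2/3 of the window leaves roughly 1/3 for the reply.
print(max_reply_tokens(2730))
```

So even a large `max_tokens` value can't buy you a long reply if the prompt has already consumed most of the window.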
It happens to me when I ask it a coding question: it attempts to show me example code and just stops randomly. But each time I told it that had happened and asked it to continue, it completed the code successfully.
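If you're calling the API yourself, you can often detect this mid-code cutoff before asking for a continuation. One crude heuristic (my own sketch, not an official check) is to count Markdown code fences: an odd number means a block was opened and never closed, which usually signals the reply stopped partway through an example.

```python
def looks_truncated(reply: str) -> bool:
    """Guess whether a Markdown reply was cut off inside a code block.

    An odd number of ``` fences means a code block was opened but never
    closed, which is a common sign the response stopped mid-example.
    """
    return reply.count("```") % 2 == 1

# Unclosed fence -> probably truncated; balanced fences -> probably complete.
print(looks_truncated("Here is the code:\n```python\nprint('hi')"))
print(looks_truncated("Here is the code:\n```python\nprint('hi')\n```"))
```

When this returns True, sending a follow-up like "continue" (as described above) usually gets the rest of the block.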
I've never tried it myself, but asking it for something with the keyword "programmatically" may work.