HTTP timeout error at the maximum token limit

I am currently developing a desktop application for authors, editors, and translators. The prototype has been successfully developed and is ready for launch, but we have encountered significant issues with the API when working with large amounts of text.

We have been dividing this large content into chunks sized according to the model's maximum token limit and then calling CreateCompletion with max_tokens set to that limit. This approach works when testing with smaller text samples. However, once the chunks approach the maximum token limit, we run into problems: instead of receiving a response, we get an HTTP timeout error. Reducing the chunk size to a much smaller number fixes the problem and we do receive a response, but that is not feasible for our application given the size of the documents we deal with. We are eager to understand and resolve this problem and would greatly appreciate your guidance.
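For illustration, here is a minimal Python sketch of this kind of chunking, assuming tiktoken for token counting and the legacy openai Completion endpoint; the model name, context window size, and file path are illustrative assumptions, not necessarily what we actually use:

```python
import openai
import tiktoken

MODEL = "text-davinci-003"   # assumed completion model
CONTEXT_WINDOW = 4097        # assumed total limit for prompt + completion tokens

def chunk_by_tokens(text, max_prompt_tokens):
    """Split text into pieces that each fit within max_prompt_tokens."""
    enc = tiktoken.encoding_for_model(MODEL)
    tokens = enc.encode(text)
    return [
        enc.decode(tokens[i:i + max_prompt_tokens])
        for i in range(0, len(tokens), max_prompt_tokens)
    ]

# Reserve roughly half the context window for the prompt and the rest for the
# completion, since prompt tokens + max_tokens must stay within the window.
max_prompt_tokens = CONTEXT_WINDOW // 2
large_text = open("manuscript.txt", encoding="utf-8").read()  # placeholder input

for chunk in chunk_by_tokens(large_text, max_prompt_tokens):
    response = openai.Completion.create(
        model=MODEL,
        prompt=chunk,
        max_tokens=CONTEXT_WINDOW - max_prompt_tokens,
    )
    print(response.choices[0].text)
```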


The token count includes the prompt + chat history + question. If it exceeds the model's limit, you will hit an error. Also check whether there are special characters in your text. Which text completion model are you using?
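For example, a quick way to sanity-check the budget before sending a request (Python with tiktoken; the encoding and context window size are assumptions):

```python
import tiktoken

def completion_budget(prompt, history, question, context_window=4097):
    """Return how many tokens remain for the completion (i.e. for max_tokens)."""
    enc = tiktoken.get_encoding("p50k_base")  # assumed encoding for the model in use
    used = sum(len(enc.encode(part)) for part in (prompt, history, question))
    return context_window - used

# If this is zero or negative, the request fails on the token limit,
# which is a different problem from an HTTP timeout.
print(completion_budget("Translate the following chapter:", "", "..."))
```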

Thanks - of course I know that, and I checked. There is no error message for this issue like the one you get when you exceed the max token limit. In fact, a few minutes ago I found the reason: when you process content at the max token size, the server takes so long to respond that you either get an HTTP timeout or assume something went wrong and kill the app in the debugger. I switched to streaming - which is very slow - but you can see the text coming in, so you are more patient. Seven pages and 6,000 words took about 10 minutes.
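In case it helps others, here is a minimal sketch of the streaming variant using the legacy openai Python package; the model name, max_tokens, and request_timeout values are illustrative assumptions:

```python
import openai

chunk = open("chunk.txt", encoding="utf-8").read()  # placeholder: one prepared text chunk

# Stream the completion so tokens arrive as they are generated, instead of
# waiting minutes for one big response that the HTTP client may time out on.
stream = openai.Completion.create(
    model="text-davinci-003",   # assumed model
    prompt=chunk,
    max_tokens=2000,
    stream=True,
    request_timeout=600,        # raise the client-side timeout as well (seconds); assumed parameter
)

for event in stream:
    # Each streamed event carries a small piece of the generated text.
    print(event.choices[0].text, end="", flush=True)
```

Streaming also makes it easier to show progress in a desktop UI while a long chunk is being processed, which was the point above.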
