API gpt-3.5-turbo loses speed

I got on my site ki-urlaub .de/ki the API 2x already integrated. In the first weeks it worked wonderfully. The answers came after 5 seconds for simple questions and after about 10 seconds for complex topics. Then suddenly the response time became drastically longer or the AI only provided answers to the simplest questions. Queries are very often aborted. It seems as if “deposits are collecting in the veins” - until the API “dies”. Do any of you know this? Do you have a solution for this?

I would advise implementing streaming responses, with that system you will get your first batch of reply tokens almost right away, similar to how ChatGPT works.


Foxabilo is right, what’s more is that you can adjust your timeout or processing time limit on your web server in case of a long output.

Thank you!!. But I don’t know how to add this to the API-Code. And there is no “timeout” in the code

:smiley: The OpenAI cookbook! Here you go examples/How_to_stream_completions.ipynb

@Foxabilo: wow! Thank you! When all this is possible for you, the question is whether you do not also know how I can solve my biggest problem with my page: I would have liked to add to the suggestions of the AI (hotel names) links that lead directly to bookable suggestions (supplemented by various parameters). Do you think you have a solution for that as well. Please don’t get me wrong, you wouldn’t have to do this for free of course. Do you see a way?

Sure, I have Messaged you to talk commercial terms in private.

The basic method that I would use is a streaming text renderer that can have custom features such as links and other functional logic included, I have built such a system myself, obviously I would use that, but there are bot creation tools out there like bot press and others, you pay a fee to use those and then you build your logic into their handling system.

1 Like