Hi everyone,
I first created an Assistant on the OpenAI platform and integrated it into my application. Since responses were extremely slow when calling the API directly, I built a Python project (FastAPI), hosted on my OVH server, to try to improve performance.
Here’s what I did:
- Created an Assistant on the OpenAI platform.
- Connected my app to it using the OpenAI API.
- Hosted the Python project on my server to handle the requests.
However, the latency is still very high:
- On my server, responses take about 20 seconds.
- If I call the API directly, it can take around 40 seconds.
I also tried a second approach (optimizing the polling loop, reducing the delays between status checks, etc.), but the performance is still far too slow for real-time use.
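For context, the polling loop I've been tuning looks roughly like this (simplified sketch; the backoff numbers are just what I settled on, and `client` is an `openai.OpenAI` instance):

```python
import time


def poll_delays(initial=0.25, factor=1.5, cap=2.0):
    """Yield capped exponential-backoff delays between status checks."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay *= factor


def wait_for_run(client, thread_id, run_id, timeout=60.0):
    """Poll an Assistants run until it leaves the queued/in_progress states."""
    deadline = time.monotonic() + timeout
    for delay in poll_delays():
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status not in ("queued", "in_progress"):
            return run  # completed, failed, requires_action, etc.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still {run.status} after {timeout}s")
        time.sleep(delay)
```

Even with short initial delays like these, the total wall-clock time is dominated by how long the run itself stays in `in_progress`, so tightening the loop barely helped.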
This makes the Assistants API very difficult to use in production apps where users expect answers within a few seconds.
Has anyone else experienced this? Is such latency normal for the Assistants API, or is there a recommended way to make responses faster?
Thanks in advance for your insights!