OpenAI API responses are extremely slow (20–40 s) even with a server setup

Hi everyone,

I first created an Assistant on the OpenAI platform and integrated it into my application. Because responses were extremely slow when calling the API directly from the app, I built a Python (FastAPI) service and hosted it on my OVH server to try to improve performance.

Here’s what I did:

  • Created an Assistant on the OpenAI platform.

  • Connected my app to it using the OpenAI API.

  • Hosted the Python project on my server to handle the requests.

However, the latency is still very high:

  • On my server, responses take about 20 seconds.

  • If I call the API directly, it can take around 40 seconds.

I also tried a second approach (tightening the polling loop, reducing the sleep between status checks, etc.), but performance is still far too slow for real-time use.
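For reference, the polling loop I'm optimizing looks roughly like this. This is a minimal sketch: `fetch_status` is a placeholder for whatever actually retrieves the run status (in my case a call to `client.beta.threads.runs.retrieve` from the official Python SDK), and the delays are just the values I've been experimenting with:

```python
import time

def poll_until_done(fetch_status, timeout=60.0, initial_delay=0.25, max_delay=2.0):
    """Poll fetch_status() with exponential backoff until the run reaches a
    terminal state, or raise TimeoutError after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in ("completed", "failed", "cancelled", "expired"):
            return status
        time.sleep(delay)
        delay = min(delay * 2, max_delay)  # back off so we don't hammer the API
    raise TimeoutError("run did not finish in time")

# Simulated run that completes after a few checks (stands in for the
# real client.beta.threads.runs.retrieve call).
_states = iter(["queued", "in_progress", "in_progress", "completed"])
print(poll_until_done(lambda: next(_states), initial_delay=0.01))  # completed
```

Even with a short initial delay and backoff like this, the bottleneck seems to be the run itself, not the polling overhead.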

This makes the Assistants API very difficult to use in production apps where users expect answers within a few seconds.

Has anyone else experienced this? Is such latency normal for the Assistants API, or is there a recommended way to make responses faster?

Thanks in advance for your insights!