Should I re-open the client object for every request or store it?

I am building a Django web app where multiple users can interact with the Assistants API (i.e. separate threads and assistants).

Right now for each request the procedure is roughly:

  • Re-authenticate to get a client object (openai = OpenAI())
  • Create a run based on a stored thread, assistant and vector store, and stream the result to the user.

I feel the application is a bit slow, and I wonder if this is the right approach. Alternatively, I could store the client object in memory and deal with re-authentication after timeouts using exception handling.

I guess everyone has to make this decision at some point, but I have been unable to find suggestions, so I wonder if anyone else has input?

Creation and instantiation of an OpenAI (or Client, or AsyncOpenAI) class object doesn’t “authenticate” or make an API call.

You can reuse the idle client. Just be aware that it auto-closes its transports, and that a manual close() would kill anything still running. Beyond going fully asyncio and improving your queueing, I don't have strong clues about what would work for you (code techniques progress all the way up to pinning CPU thread affinity to NIC adapter IPs).
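To make the reuse idea concrete, here is a minimal sketch of keeping one client per process and handing it to every request. The names get_client, _default_factory, _client and _lock are my own for illustration, not part of the openai library, and the factory argument exists only so the pattern can be tried without the SDK installed. I am assuming the usual setup where OpenAI() picks up the API key from the OPENAI_API_KEY environment variable.

```python
import threading

# Hypothetical module-level cache: one client per worker process.
_client = None
_lock = threading.Lock()


def _default_factory():
    # Imported lazily so this module loads even without the SDK installed.
    from openai import OpenAI

    # Assumption: the key comes from the OPENAI_API_KEY environment variable.
    # Constructing the object does not make an API call.
    return OpenAI()


def get_client(factory=_default_factory):
    """Return the shared client, creating it on first use only."""
    global _client
    if _client is None:
        # The lock keeps two threads from building the client at the same time.
        with _lock:
            if _client is None:
                _client = factory()
    return _client
```

In a view you would call get_client() instead of OpenAI() and never call close() on the result while requests may still be streaming. If Django runs several worker processes, each one ends up with its own client, which is fine. Since constructing the client makes no API call, I would not expect this alone to explain the slowness, so it is probably worth timing the run creation and streaming separately.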