Using Asynchronous Client with AsyncOpenAI


In the OpenAI GitHub repo, it says that one can use AsyncOpenAI and `await` for asynchronous programming. Could someone please elaborate on these two questions:

  1. Given the following code, if all our code does is call different OpenAI APIs for various tasks, is there any point to using async and await, or should we just use the sync client?

[screenshot of the async example code]

  2. Given the steps mentioned here for creating an assistant:
    does the async client make any sense, given that these steps must happen in order?

I would really appreciate your response and clarification. Thank you.


I guess in this case it’s a matter of taste. Async allows you to manage asynchronous tasks, for example when you want to send off a bunch of HTTP calls in parallel and then gather the responses later.

If everything you do is synchronous, then you’re right, it doesn’t matter! Might as well use the synchronous methods.
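To make the parallel-vs-sequential point concrete, here is a minimal sketch. The `fake_api_call` coroutine is a hypothetical stand-in that just sleeps; with the real async client each call would instead be an awaited request on an `AsyncOpenAI` instance.

```python
import asyncio
import time

# Stand-in for one OpenAI request; with the real client this would be an
# awaited call on an AsyncOpenAI instance (e.g. a chat completion request).
async def fake_api_call(i: int) -> str:
    await asyncio.sleep(0.1)  # simulate network latency
    return f"response-{i}"

async def run_concurrently(n: int) -> list[str]:
    # gather() schedules all n coroutines at once, so total wall time is
    # roughly one call's latency rather than n times it.
    return await asyncio.gather(*(fake_api_call(i) for i in range(n)))

async def run_sequentially(n: int) -> list[str]:
    # Each call is awaited before the next starts, just like the sync client.
    return [await fake_api_call(i) for i in range(n)]

start = time.perf_counter()
asyncio.run(run_concurrently(10))
print(f"concurrent: {time.perf_counter() - start:.2f}s")  # ~0.1 s, not ~1.0 s
```

If your calls genuinely depend on each other (like the ordered assistant-creation steps above), the async version degenerates to the sequential one and buys you nothing within that workflow.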


Is there any way to accelerate the async OpenAI calls? I am currently making 400 API calls, and it sometimes takes over 4 minutes; the best run takes 80 s. Is there any way I could get the results within 10 seconds?

With streaming? :thinking:

If you are in payment tier 1 (having paid less than $50 into your account, and not having waited a week after another small payment for recalculation), OpenAI has a new “feature” where you get slow token output. You have to pay up in order to play at full speed and priority.

GPT-4 models are also the slower ones; they spend more time thinking.

Streaming won’t reduce the total completion time, but you can write parallel code that monitors for timeouts and restarts a request if no tokens are received within, say, 10 seconds. That ensures nothing is hung up in your queue.

Thanks, I tried that, but it’s still at the same level; this time the 400 API calls still took 180 s.

The benefit to streaming is that you start getting the response sooner. It will still take the same amount of time to generate a response.

I’m not a Python developer, so I’m not sure how Python implements synchronous web requests under the hood, but I’m assuming it blocks the running process. The downside is that your server can’t receive any new incoming requests until your outgoing request completes.

Synchronous anything is generally bad. Async/await makes performing async operations trivial so you should just learn the pattern and use it.

Remember to use: await asyncio.gather()
It schedules all the call requests concurrently and collects the results once they have all completed.
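When firing off hundreds of requests at once, it can also help to cap concurrency so you don’t trip rate limits. This is a hedged sketch: `fake_api_call` is a stand-in for the real request, and `max_concurrency = 50` is an arbitrary example value, not a recommended setting.

```python
import asyncio

async def fake_api_call(i: int) -> str:
    # Stand-in for the real AsyncOpenAI request.
    await asyncio.sleep(0.01)
    return f"response-{i}"

async def bounded_gather(n: int, max_concurrency: int = 50) -> list[str]:
    # A semaphore caps how many requests are in flight at once, so a large
    # batch (e.g. 400 calls) is less likely to hit rate limits.
    sem = asyncio.Semaphore(max_concurrency)

    async def one(i: int) -> str:
        async with sem:
            return await fake_api_call(i)

    return await asyncio.gather(*(one(i) for i in range(n)))
```

All n coroutines are still created up front and gathered together; the semaphore only throttles how many run simultaneously.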

Thanks to this thread and also this GitHub issue (openai/openai-python/issues/769), I managed to find a way for FastAPI, openai assistants api, and openai.AsyncOpenAI client to work together.

I use OpenAI assistants for retrieval. I needed to implement a fully asynchronous FastAPI solution on top of the OpenAI API. It took me a couple of weeks to debug all the async-related errors. Thanks to everyone who shared information on AsyncOpenAI.

In case someone needs the code, it is here: [simple-async-openai-assistant]

@giovannisimo Can you repost that link?

Hi @stephen.pasco

I think the forum blocks direct links… it is on GitHub: ivanzhovannik / simple-async-openai-assistant

@giovannisimo Thanks. I’m taking a look. Appreciate you sharing.
