Hi everyone,
I’m working on a FastAPI server that calls an OpenAI assistant from asynchronous endpoints. My current approach is a function along these lines (let’s call it get_response):
import asyncio
import os

from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def get_response(sample_prompt: str, output_format: str, max_attempts: int = 3):
    attempts = 1
    while attempts <= max_attempts:
        thread = await client.beta.threads.create()
        prompt = sample_prompt + output_format
        message = await client.beta.threads.messages.create(
            thread_id=thread.id,
            role="user",
            content=prompt,
        )
        run = await client.beta.threads.runs.create(
            thread_id=thread.id,
            assistant_id=os.getenv("OPENAI_ASSISTANT_ID"),
        )
        # Poll until the run reaches a terminal state
        while run.status not in ("completed", "failed", "cancelled", "expired"):
            await asyncio.sleep(2)
            run = await client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
        if run.status != "completed":
            # Retry with a fresh thread instead of raising immediately
            attempts += 1
            continue
        # Retrieve messages (token usage is available on the completed run object)
        message_response = await client.beta.threads.messages.list(thread_id=thread.id)
        messages = message_response.data
        # Do something with `messages`, then return
        return messages
    raise Exception("Assistant run failed after max_attempts attempts")
I then call get_response() from several threads simultaneously, potentially with up to 5 threads running at once.
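For comparison, here is a rough pure-async sketch of what I think the no-threads version would look like. get_response and the limit of 5 come from above; get_response_limited, handle_batch, and the semaphore are just illustrative names, not something I have in production:

import asyncio

# Hypothetical helper: cap concurrent assistant calls at 5 without any threads.
semaphore = asyncio.Semaphore(5)

async def get_response_limited(sample_prompt: str, output_format: str):
    async with semaphore:
        return await get_response(sample_prompt, output_format)

async def handle_batch(prompts: list[str], output_format: str):
    # gather() runs all the coroutines on the single event loop;
    # the semaphore keeps at most 5 assistant runs in flight at a time.
    return await asyncio.gather(
        *(get_response_limited(p, output_format) for p in prompts)
    )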
My questions are:
1. Is this a good approach for scaling to handle hundreds of users? I’m mixing asynchronous calls (await client.beta.threads…) with multiple threads. Is it best practice to use threads here, or should I rely solely on the async event loop and avoid explicit threading?
2. Would increasing the number of event loop tasks (e.g., multiple asyncio tasks) or relying on the server’s concurrency model (such as multiple Uvicorn/Gunicorn workers) be a better approach? (A sketch of what I mean is right after this list.)
3. What patterns or architectures do people recommend for calling OpenAI assistants (or similar APIs) at scale from an async web framework like FastAPI?
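On question 2, this is my rough understanding of what relying on the server’s concurrency model would look like, reusing get_response_limited from the sketch above. The endpoint awaits the call directly on the event loop, and process-level scaling comes from Uvicorn workers. AskRequest and the /ask route are made-up names, and I’m assuming the newest message in the list is the assistant’s text reply:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class AskRequest(BaseModel):
    prompt: str
    output_format: str = ""

@app.post("/ask")
async def ask(req: AskRequest):
    # No explicit threads: the coroutine awaits the OpenAI call, so the
    # event loop is free to serve other requests while this one waits on I/O.
    messages = await get_response_limited(req.prompt, req.output_format)
    # messages.list returns newest first; the first text block is the reply
    return {"reply": messages[0].content[0].text.value}

# Process-level scaling then comes from the server, e.g.:
#   uvicorn app:app --workers 4
# (each worker is a separate process with its own event loop)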
I’m a bit new to this and just want to ensure I’m setting things up in a way that will scale well and not cause hidden issues (like blocking I/O or unnecessary overhead).
Any advice, best practices, or insights would be greatly appreciated! Thank you.