Timeout not honored in Assistants Python API

Is the Assistants Python API not honoring the timeout argument?

Here’s how I’m calling it for testing (normally I use a longer timeout):

return client.beta.threads.runs.create(

Where type(client) is <class 'openai.OpenAI'>. The call otherwise runs properly, but it does not raise an APITimeoutError after running for 5+ seconds.

On the other hand, as evidence that this isn’t a PEBCAK error, I do see openai.APITimeoutError when using client.chat.completions.create, which properly honors its timeout argument in tests.

You don’t pass it in the call; you set it on the client.

import openai
client = openai.Client(timeout=2)

Or you can pass an httpx timeout object.

client = openai.Client(timeout=2)

It’s not respecting that either.

Other operations using that client are timing out as expected when I set a low value, but not Assistants.


There shouldn’t be many delays in the actual network requests for Assistants, since all the data is at the ready, except where the server actually returns nothing on the open connection.
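If each individual request does come back quickly, a per-request timeout would have nothing to trip on, even when the run as a whole takes many seconds. One possible workaround, offered as a sketch and not as a confirmed explanation of the behavior in this thread, is to enforce a wall-clock deadline around the status polling yourself. Everything below is hypothetical naming (wait_for_status, TERMINAL, the fake status sequence); it assumes you poll the run status in a loop and that statuses such as "completed" or "failed" mean the run is finished.

```python
import time
from typing import Callable

# Assumed set of statuses that mean the run is over. A status such as
# "requires_action" is not terminal and would need its own handling.
TERMINAL = {"completed", "failed", "cancelled", "expired"}


def wait_for_status(get_status: Callable[[], str], deadline_s: float,
                    poll_interval_s: float = 0.5) -> str:
    """Poll get_status until it is terminal or deadline_s has elapsed."""
    start = time.monotonic()
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        # Wall-clock check: independent of any per-request timeout.
        if time.monotonic() - start >= deadline_s:
            raise TimeoutError(f"run still '{status}' after {deadline_s}s")
        time.sleep(poll_interval_s)


# Demo with a fake status source standing in for real API calls.
fake = iter(["queued", "in_progress", "completed"])
print(wait_for_status(lambda: next(fake), deadline_s=5, poll_interval_s=0.01))
```

In real code, get_status might be something like lambda: client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id).status, where thread and run are whatever objects you created earlier.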

Here’s chat code that I verified won’t tolerate a one-second delay between streaming chunks, but won’t time out as long as the tokens keep flowing and the response starts immediately. It sits at the edge of breaking (ask it for something copyrighted to provoke a delay…). You can drop this httpx timeout spec in with preposterously low limits and see whether it still has no effect on Assistants calls.

from openai import OpenAI
import httpx
client = OpenAI(max_retries = 1, timeout=httpx.Timeout(0.1,
system = [{"role": "system", "content": "You are a computer vision assistant"}]
user = [{"role": "user", "content": [{"image": example_base64}, "Write poem based on image."]}]
chat = []
while user[0]['content'] != "exit":
    response = client.chat.completions.create(
        messages = system + chat[-10:] + user,
        top_p=0.9, stream=True, max_tokens=1536)
    reply = ""
    for delta in response:
        if not delta.choices[0].finish_reason:
            word = delta.choices[0].delta.content or ""
            reply += word
            print(word, end ="")
    chat += user + [{"role": "assistant", "content": reply}]
    user = [{"role": "user", "content": input("\nPrompt: ")}]

Thanks, this is interesting and looks useful. However, although I can now easily force other things to time out using this technique (such as initializing the client), I’m still unable to force an Assistant created with client.beta.assistants.create to time out during a chat interaction based on client.beta.threads.runs.create.