Where type(client) is <class 'openai.OpenAI'>. This runs properly otherwise, but does not trigger an APITimeoutError exception after running for 5+ seconds.
On the other hand, as evidence that this isn’t a PEBCAK error, I do properly see openai.APITimeoutError when using client.chat.completions.create, which honors its timeout argument in tests.
There shouldn’t be many delays in the actual network requests of Assistants, since all the data is at the ready, except where it simply returns nothing over the open connection.
Here’s chat code I verified will not tolerate a one-second delay between streaming chunks, but does not time out as long as tokens keep flowing and the response starts immediately. (To push it to the edge of breaking, ask for something copyrighted to induce a delay.) You can drop this httpx timeout spec in with preposterously low limits and see whether it still has no effect on Assistants calls.
from openai import OpenAI
import httpx

# Deliberately low limits; the positional value is the fallback for any
# phase not overridden by a keyword argument.
client = OpenAI(
    max_retries=1,
    timeout=httpx.Timeout(
        0.1,
        connect=1.0,  # establishing the connection
        pool=1.0,     # acquiring a connection from the pool
        write=1.5,    # sending the request
        read=0.8,     # maximum gap between received chunks
    ),
)
example_base64 = 'iVBORw0KGgoAAAANSUhEUgAAAIAAAABACAMAAADlCI9NAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5ccllPAAAAAZQTFRF////MzMzOFSMkQAAAPJJREFUeNrslm0PwjAIhHv//09rYqZADzOBqMnu+WLTruOGvK0lhBBCCPHH4E7x3pwAfFE4tX9lAUBVwZyAYjwFAeikgH3XYxn88nzKbIZly4/BluUlIG66RVXBcYd9TTQWN+1vWUEqIJQI5nqYP6scl84UqUtEoLNMjoqBzFYrt+IF1FOTfGsqIIlcgAbNZ0Uoxtu6igB+tyBgZhCgAZ8KyI46zYQF/LksQC0L3gigdQBhgGkXou1hF1XebKzKXBxaDsjCOu1Q/LA1U+Joelt/9d2QVm9MjmibO2mGTEy2ZyetsbdLgAQIIYQQQoifcRNgAIfGAzQQHmwIAAAAAElFTkSuQmCC'
system = [{"role": "system", "content": "You are a computer vision assistant"}]
user = [{"role": "user", "content": [{"image": example_base64}, "Write poem based on image."]}]
chat = []

while not user[0]['content'] == "exit":
    response = client.chat.completions.create(
        messages=system + chat[-10:] + user,
        model="gpt-4-vision-preview",
        top_p=0.9, stream=True, max_tokens=1536)
    reply = ""
    for delta in response:
        if not delta.choices[0].finish_reason:
            word = delta.choices[0].delta.content or ""
            reply += word
            print(word, end="")
    chat += user + [{"role": "assistant", "content": reply}]
    user = [{"role": "user", "content": input("\nPrompt: ")}]
Thanks, this is interesting and looks useful. However, although I can now easily force other things to time out using this technique (such as initializing the client), I’m still unable to force an Assistant built with client.beta.assistants.create to time out during a chat interaction driven by client.beta.threads.runs.create.
@Plutes, did you ever get this to work? I, too, have tried the timeout parameter in the .OpenAI call and in the client.beta.threads.runs.create_and_poll call, but it never has any effect.
A timeout sent as a per-request API parameter in the openai library is now captured and overrides the client-level timeout used by the network library, whether for chat completions, runs.create, or other methods.
It accepts a value in seconds.
If you are using a convenience method like create_and_poll, a network timeout won’t affect how long the polling itself persists.
Thanks @_j. My question (and this thread) is about the Python openai library, which includes a timeout parameter in client.beta.threads.runs.create_and_poll that is not being honored.
I understand the situation at OpenAI’s servers, but that’s not really my concern. I want to stop waiting for the run at my end, because I’m paying per-instance charges at my own server, and because of UX. The Python openai library appears to support that, but it apparently doesn’t work. So I’ll have to implement my own polling loop.
Your expectations evidently don’t align with what the timeout parameter or the library method is meant to do.
The timeout parameter passes a timeout setting through to the httpx library, a drop-in replacement for the requests library; it is the Python code used to make the network requests.
Timeout is not going to have the effect you seem to want on create_and_poll, because the polling part makes a request every poll_interval_ms in a loop just to get the status of the run. There is nothing to time out when the API quickly responds to each “retrieve run” call that the job status is “in progress”, “in progress”, “in progress”…
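To make that concrete, here is a small stdlib-only simulation; retrieve_status is a hypothetical stand-in for the run-retrieval call, not anything from the openai library. Each individual “request” answers instantly, so a per-request timeout never trips, even as total wall-clock time grows without bound; only an overall deadline check catches the stalled run:

```python
import time

PER_REQUEST_TIMEOUT = 0.5   # analogous to the httpx-level timeout
OVERALL_DEADLINE = 0.2      # the wall-clock limit one actually wants
POLL_INTERVAL = 0.05        # analogous to poll_interval_ms

def retrieve_status():
    """Stand-in for a 'retrieve run' call: answers instantly, always busy."""
    request_started = time.perf_counter()
    status = "in_progress"
    request_elapsed = time.perf_counter() - request_started
    # The per-request timeout never fires: each call is effectively instant.
    assert request_elapsed < PER_REQUEST_TIMEOUT
    return status

start = time.perf_counter()
timed_out = False
while retrieve_status() == "in_progress":
    if time.perf_counter() - start > OVERALL_DEADLINE:
        timed_out = True  # only the wall-clock deadline stops the loop
        break
    time.sleep(POLL_INTERVAL)

print(timed_out)
```

The run never finishes, yet no network timeout is ever exceeded; the loop only ends because of the explicit deadline.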
You can look at the library’s poll() method here and start hacking: for example, change the while True (linked directly above), or have it sleep before it starts polling.
Polling terminating is the signal that the run finished or reached a terminal state. If polling returned while the run were still in_progress, the whole point of using this particular method would be lost.
Thanks for explaining that. You’re right that my expectation for a timeout parameter in a create_and_poll function was that it would terminate the function if the specified amount of time passed without resolution. Unfortunately, I can’t find any documentation of that parameter.
I do see that the underlying OpenAI API takes an expires_at parameter, which the Python library might have used for a genuinely elegant solution, but that parameter isn’t really documented either.
Anyway, here is a simplified version of what I ended up doing:
import time

# Q_COMPLETION_STATES, Q_TIMEOUT_SECS, Q_POLL_SECS, and the QuestionTimeout
# exception are defined elsewhere in my code, e.g.:
# Q_COMPLETION_STATES = {"completed", "failed", "cancelled", "expired"}

run = client.beta.threads.runs.create(thread_id=th_id, assistant_id=as_id)
time_start = time.perf_counter()
while run.status not in Q_COMPLETION_STATES:
    if (time.perf_counter() - time_start) > Q_TIMEOUT_SECS:
        raise QuestionTimeout
    time.sleep(Q_POLL_SECS)
    run = client.beta.threads.runs.retrieve(run.id, thread_id=th_id)
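A loop like the one above can also be folded into a small reusable helper. Everything here (the function name, the state set, the stub standing in for the retrieve call) is illustrative, not part of the openai library:

```python
import time

TERMINAL_STATES = {"completed", "failed", "cancelled", "expired"}

class RunTimeout(Exception):
    """Raised when the run does not reach a terminal state in time."""

def poll_until_done(retrieve, timeout_secs, poll_secs=0.5):
    """Call retrieve() until it returns a terminal status or the deadline passes."""
    start = time.perf_counter()
    while True:
        status = retrieve()
        if status in TERMINAL_STATES:
            return status
        if time.perf_counter() - start > timeout_secs:
            raise RunTimeout(f"run still {status!r} after {timeout_secs} s")
        time.sleep(poll_secs)

# Stub in place of client.beta.threads.runs.retrieve(...).status:
# reports "completed" on the third call.
calls = {"n": 0}
def fake_retrieve():
    calls["n"] += 1
    return "completed" if calls["n"] >= 3 else "in_progress"

result = poll_until_done(fake_retrieve, timeout_secs=5, poll_secs=0.01)
print(result)
```

In real use, retrieve would be a lambda wrapping the runs.retrieve call, and the caller handles RunTimeout (e.g. by cancelling the run).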