API timeout after inactivity (python)

When my Python program doesn’t use the API for a couple of hours or days, the next call very often times out with an APIConnectionError exception, and only after minutes of waiting.
After that, it is back to normal and it responds swiftly.
If I restart the program after such delay before using the API, it typically answers swiftly as well.

It happens with both chat completion and transcription with whisper-1.

Has anybody else encountered this behavior?
Why is that?
Any idea how to fix it?

Thanks, m

Open ports and connections to services are transient by nature; many systems along the connectivity chain will close inactive sockets after various waiting periods.

With this in mind it is often useful to ensure your application creates new connections if it detects an inactive one prior to use. This can be done with the use of try/except blocks to detect a failure state and act on it accordingly.
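As a minimal sketch of that try/except approach: the helper below retries a call when it fails with a connection-type error. The name `call_with_retry` and its parameters are hypothetical and not part of any library. With the legacy openai package you would presumably pass its `APIConnectionError` class in `retry_on`; that pairing is an assumption, so the default here is the built-in `ConnectionError`.

```python
import time


def call_with_retry(func, retry_on=(ConnectionError,), attempts=3, delay_s=0.5):
    """Call func(); on a connection-type error, wait briefly and try again.

    Hypothetical helper: retry_on is a tuple of exception classes that
    should trigger another attempt.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                # Back off a little longer after each failed attempt
                time.sleep(delay_s * attempt)
    # Every attempt failed: surface the last error to the caller
    raise last_error
```

Note that retrying only helps once the failure has been reported; it does nothing about a call that hangs for a long time before raising.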

Feel free to post your code here if you wish and we can see if there might be any obvious issues.

Thank you. I suspect this really could be the problem, and I tried to counter it by re-assigning openai.api_key, as shown here:

But even though that code is being called, the problem remains.
Is there a different way to do that?

I do not believe the issue is with the API key. If you are using the prebuilt OpenAI library, the connection to the API should be handled for you whenever you make a call. What does your API calling code look like?

I expected that reassigning the key would reset the connection. But it’s not documented, so I don’t know. I did not find out how to reset the connection.
Importing openai again in Python does not work either.

Here:

and here:

These are the calls.

Both lines are the ones that time out after some inactivity. The first is text input, the second is voice input.

They raise the APIConnectionError, but the main problem is that the exception can take minutes, sometimes as long as 10 minutes, to arrive.

When the call is repeated after catching the exception, it works, but the problem is the many minutes spent waiting for the exception.
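One way to cap that wait, sketched under assumptions: run the blocking call in a daemon thread and give up after a fixed wall-clock deadline. `call_with_deadline` and `CallTimedOut` are hypothetical names introduced for illustration. If the client library exposes its own per-request timeout option, that would be the more direct fix; this sketch assumes none is available.

```python
import threading


class CallTimedOut(Exception):
    """Raised when the wrapped call exceeds its wall-clock deadline."""


def call_with_deadline(func, timeout_s, *args, **kwargs):
    """Run func(*args, **kwargs) in a daemon thread, waiting at most timeout_s."""
    result = {}

    def worker():
        try:
            result["value"] = func(*args, **kwargs)
        except Exception as exc:
            # Hand the error back to the calling thread
            result["error"] = exc

    # daemon=True so an abandoned, still-hanging call cannot block program exit
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        raise CallTimedOut(f"no result after {timeout_s} s")
    if "error" in result:
        raise result["error"]
    return result["value"]
```

The abandoned call keeps running in the background until the underlying socket gives up; the caller simply stops waiting for it and can retry right away.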

Ok, going back to first principles: if you know that an error may occur that takes a long time to show itself, then you can simply assume that the error has occurred. In that case you should remake the connection every time the call is made rather than waiting for the issue to occur.

That is, I would try changing the connection setup to occur every time you make the API call, and then close the connection after.

So the new functionality would be

Def WhisperCall
    Make Connection
    Make API Call
    Close Connection

Do you see any problems with that?
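The outline above could be sketched generically as follows, assuming the connection object can be created on demand and has a `close()` method. `call_with_fresh_connection`, `connect`, and `do_call` are hypothetical names; nothing here is taken from the openai package.

```python
from contextlib import closing


def call_with_fresh_connection(connect, do_call):
    """Open a new connection, run one call on it, and always close it.

    connect: zero-argument callable returning an object with .close()
    do_call: callable that takes the connection and returns the result
    """
    with closing(connect()) as conn:
        # closing() calls conn.close() even if do_call raises
        return do_call(conn)
```

For a raw HTTPS client, `connect` could for example be `lambda: http.client.HTTPSConnection(host, timeout=30)` from the standard library. The cost is a new TCP and TLS handshake on every call, which is usually small next to a transcription request.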

The problem is it’s not documented how to make the connection in python.

The reference documents only the import, assigning the key and the api calls in python.
Nothing else. The connection is being made automatically, it’s not documented how to influence it.

Fair point. So in theory the API call should be doing essentially what I just mentioned.

What is a little strange is that I have never encountered this issue, and I have implemented a few Whisper systems for various clients. It is also quite an active endpoint, so you’d think this would have cropped up before now.

Ok, I’ve added the Whisper and Bug tags to this post. I’m not convinced it is a bug at this point, rather some connection issue between you and the endpoint. What confuses me is why it should take 10 minutes to get a failed response.

Do you have any activity logs that show this behaviour?
If you have exact error return codes that would also be useful.

OK, I set a trap and am waiting for it to happen.
Let’s see if there will be any other details added to the exception.

It got better when I lowered the idle limit after which the API key is re-assigned to 5 minutes (it was 60), even though the re-assignment should get triggered in both cases.

Now I only occasionally get the following error, and it has become rare (once every couple of weeks):

APIConnectionError: Error communicating with OpenAI: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

I cannot explain how this change could affect the performance, though. It might also be a coincidence… So I am just reporting the investigation.
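The idle-limit check described in this post can be sketched as a small wrapper that rebuilds a client object once it has gone unused for too long. This assumes there is some client or session object that can be rebuilt at all, which the thread notes is not documented for the openai package. `IdleAwareClient`, `make_client`, and `max_idle_s` are hypothetical names.

```python
import time


class IdleAwareClient:
    """Hands out a client, rebuilding it after max_idle_s seconds without use."""

    def __init__(self, make_client, max_idle_s=300.0, clock=time.monotonic):
        self._make_client = make_client  # zero-argument factory (assumption)
        self._max_idle_s = max_idle_s
        self._clock = clock              # injectable so the logic can be tested
        self._client = None
        self._last_used = None

    def get(self):
        now = self._clock()
        stale = (
            self._last_used is None
            or now - self._last_used > self._max_idle_s
        )
        if self._client is None or stale:
            # Close the old client if it offers close(), then build a new one
            close = getattr(self._client, "close", None)
            if callable(close):
                close()
            self._client = self._make_client()
        self._last_used = now
        return self._client
```

Calling `get()` right before each API call keeps the idle bookkeeping in one place. Whether a 5-minute limit is short enough depends on which hop in the network path drops idle sockets, which this thread never pins down.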

Consider that response (or transcript), when streaming, is a generator object that emits server-sent events as they are received, so it is a bit peculiar.

You can clean up through subclassing with new methods, resetting a timer on each call until the object has been inactive for a bit and the timer invokes a signal, etc. Or just put an ultimate “bye object” in your linear script:

response = None
del response

One might consider spawning threads that don’t block the rest of the code, or monitoring with asyncio, as long as the concurrency leaves you nowhere near the machine’s port limit.

Random unverified ChatGPT code example based on my specifications:

import threading
import openai

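class OpenAIChatCompletionThread(threading.Thread):
    """Runs one chat completion call in a background thread."""

    def __init__(self, params):
        # daemon=True so a hung request cannot keep the program from exiting
        super().__init__(daemon=True)
        self.params = params
        self.response = None
        self.error = None
        self.finished = threading.Event()

    def run(self):
        try:
            self.response = openai.ChatCompletion.create(**self.params)
        except Exception as e:
            # Kept so the caller can tell an error apart from a timeout
            self.error = e
            print("Error:", e)
        finally:
            self.finished.set()

    def get_response(self):
        # Wait at most 200 seconds; returns None on timeout or error
        self.finished.wait(timeout=200)
        return self.response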

# Example usage
if __name__ == "__main__":
    openai.api_key = "YOUR_API_KEY"

    params = {
        # your parameters here
    }

    completion_thread = OpenAIChatCompletionThread(params)
    completion_thread.start()

    # Do other work concurrently

    # Get the response or handle timeout
    response = completion_thread.get_response()
    if response:
        print(response.choices[0]) # for your own parsing...
    else:
        print("API call timed out or encountered an error.")

    # Rest of your code

A project I have in mind for some time is to dig into the openai module, close the SSE streaming connection, and see whether that stops the token billing at the whack of a cancel-generation button.