Recommended way to limit the amount of time a Python ChatCompletion.create() runs

Maybe I’m just missing a concept or some documentation, but I’ve had situations where a gpt-4 ChatCompletion.create() call can take multiple minutes to return, for the same prompt that usually takes 30 seconds. I’m thinking this could be the result of a bug on the API side that sometimes arises.

I don’t see any obvious way to set a timeout on the Python call. I could wrap the call in a thread or another process, but that seems rather heavy.
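
Something like this is what I mean by wrapping it in a thread (a rough sketch with concurrent.futures; the model, prompt, and 60-second limit are just placeholders):

import concurrent.futures
import openai

def ask(messages):
    return openai.ChatCompletion.create(model="gpt-4", messages=messages)

pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
future = pool.submit(ask, [{"role": "user", "content": "Hello"}])
try:
    response = future.result(timeout=60)  # stop waiting after 60 seconds
except concurrent.futures.TimeoutError:
    print("Gave up waiting for the completion")
finally:
    # the underlying HTTP request keeps running in the worker thread;
    # wait=False just keeps us from blocking on it here
    pool.shutdown(wait=False)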

Is there another way to think about this? Or another suggestion?

Either the image here is from a “hallucination” or else their API docs just forgot to mention “timeout”; not sure which it is.

BTW: that’s from GPT-4. GPT-3.5-turbo was unaware and gave some different workaround when I asked.

That looks like what I need, but adding this parameter doesn’t seem to do anything. I added timeout=5 to my call and I’m still getting response times much longer than that.

So it’s great that GPT-4 agrees that it seems like a good idea. But apparently it isn’t actually implemented; otherwise, I’d think it would be there. I feel like I’m missing something.

The next step I’d do, then, is just download the Python package itself (like a zip file from GitHub, right?), then search for that method in it. It should be super trivial to see if there’s any timeout being set, because that method will just be a wrapper around the HTTP call.

It looks like what you need, but it is not, because the AI makes up plausible nonsense.

The actual parameter that can be passed to ChatCompletion.create():

api_params = {
"model": model,
"max_tokens": max_tokens,
"temperature": temperature,
"messages": all_messages,
"request_timeout": 3,
}

It will get you:

“API Error: Request timed out: HTTPSConnectionPool(host='api.openai.com', port=443): Read timed out. (read timeout=3)”

The connection is terminated regardless of whether the AI is in the middle of answering, and the error is thrown.
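
For example, a minimal sketch (the model, messages, and the 3-second value are placeholders) that catches the error the pre-1.0 library raises when the read times out:

import openai

try:
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello"}],
        max_tokens=100,
        request_timeout=3,  # HTTP read timeout in seconds
    )
except openai.error.Timeout as e:
    print(f"API Error: {e}")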

You can also set the variable directly in the library, in case the parameter method is ever removed:
openai.api_requestor.TIMEOUT_SECS = 3

I bet you the property used to be named “timeout”, and they renamed it. Or it could be a hallucination from a parallel universe. :clown_face: Anyway, they probably need to update their docs to include it.

Thanks for looking into the actual code to set us straight!

I bet you the AI will just make up whatever it wants, and the current model is poor enough to go loopy on further questions.

Chat share - Other API parameters?

The timeout parameter is an optional parameter that can be passed to the create method.

Inside the method, the timeout parameter is extracted from the kwargs dictionary using pop(). If timeout isn’t provided, its default value will be None.

class ChatCompletion(EngineAPIResource):
    engine_required = False
    OBJECT_NAME = "chat.completions"

    @classmethod
    def create(cls, *args, **kwargs):
        """
        Creates a new chat completion for the provided messages and parameters.

        See https://platform.openai.com/docs/api-reference/chat/create
        for a list of valid parameters.
        """
        start = time.time()
        timeout = kwargs.pop("timeout", None)

        while True:
            try:
                return super().create(*args, **kwargs)
            except TryAgain as e:
                if timeout is not None and time.time() > start + timeout:
                    raise

                util.log_info("Waiting for model to warm up", error=e)

However, after this line:

timeout = kwargs.pop("timeout", None)

the timeout parameter no longer exists in the kwargs dictionary because pop removes the key from the dictionary.
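
A tiny illustration of that behavior, with a made-up kwargs dict:

kwargs = {"model": "gpt-4", "timeout": 5}
timeout = kwargs.pop("timeout", None)
print(timeout)  # 5
print(kwargs)   # {'model': 'gpt-4'} -- "timeout" is gone, so it is never forwarded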

Here's super().create():
    @classmethod
    def create(
        cls,
        api_key=None,
        api_base=None,
        api_type=None,
        request_id=None,
        api_version=None,
        organization=None,
        **params,
    ):
        (
            deployment_id,
            engine,
            timeout,
            stream,
            headers,
            request_timeout,
            typed_api_type,
            requestor,
            url,
            params,
        ) = cls.__prepare_create_request(
            api_key, api_base, api_type, api_version, organization, **params
        )

        response, _, api_key = requestor.request(
            "post",
            url,
            params=params,
            headers=headers,
            stream=stream,
            request_id=request_id,
            request_timeout=request_timeout,
        )

        if stream:
            # must be an iterator
            assert not isinstance(response, OpenAIResponse)
            return (
                util.convert_to_openai_object(
                    line,
                    api_key,
                    api_version,
                    organization,
                    engine=engine,
                    plain_old_data=cls.plain_old_data,
                )
                for line in response
            )
        else:
            obj = util.convert_to_openai_object(
                response,
                api_key,
                api_version,
                organization,
                engine=engine,
                plain_old_data=cls.plain_old_data,
            )

            if timeout is not None:
                obj.wait(timeout=timeout or None)

        return obj

To conclude:

  • The timeout parameter is popped from kwargs in ChatCompletion.create(), so it won’t be passed to super().create() via **kwargs.
  • The request_timeout parameter, if provided to ChatCompletion.create(), will be passed to super().create() via **kwargs, as shown in the sketch below.
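
Putting the two together, a sketch (placeholder model and messages) of what each parameter actually controls:

import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "ping"}],
    timeout=30,         # only bounds the TryAgain "warm up" retry loop above
    request_timeout=3,  # forwarded to requestor.request() as the HTTP read timeout
)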