ServiceUnavailableError: The server is overloaded or not ready yet with text-davinci and paid plan

I am currently trying to run GPT-3, the 'davinci' model, on a long list of prompts - around 160 in a loop. However, I keep getting 'ServiceUnavailableError: The server is overloaded or not ready yet.'
I am not sure how to fix this error. My code includes an 8-second delay between prompts to avoid overloading the server, but I still get the error. Does anyone have advice?
Here is my code.

for r in rces:
    for re in rlgs:
        open_prompt = rlg_open_g(opg, r, re)
        final_prompt = closed_prompt + open_prompt
        response_regr = openai.Completion.create(
            prompt=final_prompt, logprobs=logp, max_tokens=max_tokens
        )
        response_dict_regr = {'closed_prompt': closed_prompt,
                              'open_prompt': open_prompt,
                              'prompt_num': q,
                              'choices': response_regr}

The sleep you’ve added is a good precaution against monopolizing the instance.

Without a payment method on file, you would be limited to 3 requests per minute, but that produces a different error message.

  1. Handle all exception types instead of letting your script crash on a single error.

Here is some example error-handling code, along with backoff-and-retry logic.

From my open editor, here is a template for handling the other errors the Python module can raise, distinguishing errors that could be retried from input errors that require a fix:

        try:
            # Example call; substitute your own model and parameters
            response = openai.Completion.create(
                model="text-davinci-003", prompt=final_prompt, max_tokens=max_tokens
            )
        except openai.error.Timeout as e:
            # Handle timeout error, e.g. retry or log
            print(f"OpenAI API request timed out: {e}")
        except openai.error.APIError as e:
            # Handle API error, e.g. retry or log
            print(f"OpenAI API returned an API Error: {e}")
        except openai.error.APIConnectionError as e:
            # Handle connection error, e.g. check network or log
            print(f"OpenAI API request failed to connect: {e}")
        except openai.error.InvalidRequestError as e:
            # Handle invalid request error, e.g. validate parameters or log
            print(f"OpenAI API request was invalid: {e}")
        except openai.error.AuthenticationError as e:
            # Handle authentication error, e.g. check credentials or log
            print(f"OpenAI API request was not authorized: {e}")
        except openai.error.PermissionError as e:
            # Handle permission error, e.g. check scope or log
            print(f"OpenAI API request was not permitted: {e}")
        except openai.error.RateLimitError as e:
            # Handle rate limit error, e.g. wait or log
            print(f"OpenAI API request exceeded rate limit: {e}")
        except openai.error.ServiceUnavailableError as e:
            # The error from your traceback: wait, then retry
            print(f"OpenAI API server overloaded or not ready: {e}")
        except Exception as e:
            # Catch-all for anything else
            print(f"Error: {e}")
        else:
            bot_message = response.choices[0].text.strip()
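For the retry side, one common pattern is exponential backoff with jitter: wait progressively longer after each transient failure instead of retrying at a fixed interval. Here is a minimal sketch; the function name, parameters, and the choice of which exceptions to treat as retryable are my own illustration, not part of the openai library:

```python
import random
import time

def call_with_backoff(fn, retryable, max_retries=5, base_delay=2.0):
    """Call fn(), retrying transient failures with exponential backoff plus jitter.

    retryable is a tuple of exception types worth retrying, e.g.
    (openai.error.ServiceUnavailableError, openai.error.RateLimitError).
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable as e:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            # Double the wait each attempt, with random jitter so that
            # many clients don't all retry at the same instant
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(f"Transient error ({e}); retrying in {delay:.1f}s")
            time.sleep(delay)
```

You would then wrap the call in your loop as something like `call_with_backoff(lambda: openai.Completion.create(model="text-davinci-003", prompt=final_prompt, max_tokens=max_tokens), (openai.error.ServiceUnavailableError, openai.error.RateLimitError))`, keeping your existing delay between prompts as a baseline pace.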

As for the ultimate cause, the GPT-4 and model-deprecation announcement stated that 97% of their API usage is on chat models. One could surmise that less than 3% of their compute is then allocated to davinci (although perhaps more, given its impact reflected in the price), making it more susceptible to load variance.
