ServiceUnavailableError: The server is overloaded or not ready yet with text-davinci and paid plan

I am currently trying to run GPT-3, the 'davinci' model, on a long list of prompts - around 160 in a loop. However, I keep getting 'ServiceUnavailableError: The server is overloaded or not ready yet.'
I am not sure how to fix this error. My code includes an 8-second delay between prompts to avoid overloading the server, but I still get the error. Does anyone have advice?
Here is my code.

for r in rces:
    for re in rlgs:
        open_prompt = rlg_open_g(opg, r, re)
        final_prompt = closed_prompt + open_prompt
        response_regr = openai.Completion.create(
            prompt=final_prompt, logprobs=logp, max_tokens=max_tokens
        )
        response_dict_regr = {'closed_prompt': closed_prompt,
                              'open_prompt': open_prompt,
                              'prompt_num': q,
                              'choices': response_regr}

The sleep you’ve added is a good precaution against monopolizing the instance.

Without a payment method on file, you would be limited to 3 requests per minute, but that produces a different error message.

  1. Handle all exception types instead of letting your script crash on a single error.

Here is some example error-handling code, along with backoff-and-retry logic.

From my open editor, here is a template for handling the other errors the Python module can raise, distinguishing errors that could be retried from input errors that require a fix:

        try:
            # Example call; substitute your own model and parameters
            response = openai.Completion.create(
                model="text-davinci-003", prompt=final_prompt, max_tokens=max_tokens
            )
        except openai.error.Timeout as e:
            # Handle timeout error, e.g. retry or log
            print(f"OpenAI API request timed out: {e}")
        except openai.error.APIError as e:
            # Handle API error, e.g. retry or log
            print(f"OpenAI API returned an API Error: {e}")
        except openai.error.APIConnectionError as e:
            # Handle connection error, e.g. check network or log
            print(f"OpenAI API request failed to connect: {e}")
        except openai.error.InvalidRequestError as e:
            # Handle invalid request error, e.g. validate parameters or log
            print(f"OpenAI API request was invalid: {e}")
        except openai.error.AuthenticationError as e:
            # Handle authentication error, e.g. check credentials or log
            print(f"OpenAI API request was not authorized: {e}")
        except openai.error.PermissionError as e:
            # Handle permission error, e.g. check scope or log
            print(f"OpenAI API request was not permitted: {e}")
        except openai.error.RateLimitError as e:
            # Handle rate limit error, e.g. wait or log
            print(f"OpenAI API request exceeded rate limit: {e}")
        except openai.error.ServiceUnavailableError as e:
            # The error from your traceback: wait, then retry
            print(f"OpenAI API server overloaded or not ready: {e}")
        except Exception as e:
            # Catch-all for anything else
            print(f"Error: {e}")
        else:
            bot_message = response.choices[0].text.strip()
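For the retry side, one common pattern is exponential backoff with jitter: wait progressively longer after each transient failure instead of retrying at a fixed interval. Here is a minimal sketch; the function name, parameters, and the choice of which exceptions to treat as retryable are my own illustration, not part of the openai library:

```python
import random
import time

def call_with_backoff(fn, retryable, max_retries=5, base_delay=2.0):
    """Call fn(), retrying transient failures with exponential backoff plus jitter.

    retryable is a tuple of exception types worth retrying, e.g.
    (openai.error.ServiceUnavailableError, openai.error.RateLimitError).
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except retryable as e:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            # Double the wait each attempt, with random jitter so that
            # many clients don't all retry at the same instant
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(f"Transient error ({e}); retrying in {delay:.1f}s")
            time.sleep(delay)
```

You would then wrap the call in your loop as something like `call_with_backoff(lambda: openai.Completion.create(model="text-davinci-003", prompt=final_prompt, max_tokens=max_tokens), (openai.error.ServiceUnavailableError, openai.error.RateLimitError))`, keeping your existing delay between prompts as a baseline pace.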

As for the ultimate cause, the GPT-4 and model-deprecation announcement stated that 97% of their API usage is on chat models. One could surmise that less than 3% of their compute is then allocated to davinci (although perhaps more, given its impact reflected in the price), making it more susceptible to load variance.
