I’m using `@backoff.on_exception(backoff.expo, openai.error.RateLimitError)` from the backoff library. Trying:

for i in rlist:
    try:
        # my code
        ...
    except TimeoutError:
        print("error")
        continue

but it still breaks…

1 Like

How do you get those reports of your queries? Is there an OpenAI webpage for that? I don’t see such fine-grained results in the OpenAI API.

mark: Yes, it’s incredibly slow now, and getting slower and slower. I hope the OpenAI team can improve it as soon as possible.

2 Likes

Performance of the OpenAI API is horrible at the moment. Are there plans to improve this soon? This instability in performance is blocking the rollout of our project.

4 Likes

API responses have consistently taken 20-50 seconds for about a week now. That’s unusable, especially when ChatGPT itself seems faster than it has ever been.

2 Likes

Is there a way to get someone from OpenAI to comment on this? Why are paying customers being rate-limited into unusable latencies? The model is supposed to be “turbo”; 30-40 seconds for a few hundred tokens is not very “turbo”. Why is the API so much slower than the free chat? I doubt it’s a technical issue. Is it a strategic decision to limit developers? If so, I think OpenAI should be more “open” with the community.

3 Likes

I think there are too many people using OpenAI API services. I’m a bit shocked that people now say ‘gpt-3.5-turbo’ is slow, because I remember it being fast, even with 1000+ token responses. So I feel like the servers are overloaded now.

But my issue is more serious, because my company is using gpt-4, which is way slower, though more accurate. We are about to launch this internally, and I can already imagine our customer service team complaining that the chatbot is too slow. :sob:

1 Like

Totally suffering from the same problem: awesome responses, but awfully slow :smiling_face_with_tear:

I just signed up for an OpenAI subscription myself and expected response times similar to ChatGPT’s, but with the gpt-3.5-turbo model, responses take 30-60 seconds or time out completely.

I’m only using it for a demo application, but it’s almost unusable at this performance, and it’s extra disappointing that I had to pay to experience this.

1 Like

Actually, we’re lucky to now have the ‘stream’ option, which lets users watch GPT generate the answer. This way, people don’t feel stuck. :smiley: Still slow, but much better now.

How do you get the stream option to run?

stream (boolean, optional, defaults to false)

If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. See the OpenAI Cookbook for example code.

https://platform.openai.com/docs/api-reference/chat
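A rough sketch of what consuming the stream involves. The real call with the legacy `openai<1.0` client is sketched in the comment; the events below are made-up examples of the data-only server-sent-event format the docs describe:

```python
import json

# With the legacy openai<1.0 client the streaming call is roughly:
#   for chunk in openai.ChatCompletion.create(model="gpt-3.5-turbo",
#                                             messages=msgs, stream=True):
#       print(chunk["choices"][0]["delta"].get("content", ""), end="")
#
# Each chunk arrives as a data-only server-sent event. The simulated
# events below are invented to show the shape of the stream:
events = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]

def collect(lines):
    """Join content deltas until the data: [DONE] terminator."""
    out = []
    for line in lines:
        payload = line.removeprefix("data: ")
        if payload == "[DONE]":
            break
        out.append(json.loads(payload)["choices"][0]["delta"].get("content", ""))
    return "".join(out)

print(collect(events))  # prints "Hello"
```

The user-perceived latency drops because the first token arrives long before the last one, even though total generation time is unchanged.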

1 Like

I am also getting the same delay; it took 60 s to get a response. Is this resolved?

1 Like

#chatgpt
The gpt-3.5-turbo API does not give the same responses as the ChatGPT web interface.

How can I get the same results in my Python app? I want long responses like the web version gives. Can anyone guide me?
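One thing worth checking: the web UI runs with its own hidden system prompt, while the API gives you bare defaults. A hedged sketch of a request body that nudges the model toward longer, web-style answers (all values here are illustrative, not official settings):

```python
# Illustrative request body: an explicit system prompt asking for detail,
# plus max_tokens high enough that the completion isn't cut short.
request = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system",
         "content": "You are a helpful assistant. Answer in detail, with "
                    "examples, as the ChatGPT web interface would."},
        {"role": "user", "content": "Explain how HTTP caching works."},
    ],
    "max_tokens": 1500,  # leave room for a long answer
    "temperature": 0.7,
}

print(request["max_tokens"])
```

If responses end mid-sentence, check whether the response’s `finish_reason` is `"length"`, which means the `max_tokens` cap was hit.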

APIConnectionError: Error communicating with OpenAI: HTTPSConnectionPool(host='api.openai.com', port=443): Max retries exceeded with url: /v1/completions (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)')))

Why does this error occur?
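`CERTIFICATE_VERIFY_FAILED` usually means your Python install can’t find a CA bundle (common with macOS python.org builds or behind corporate proxies that intercept TLS). A common workaround, assuming the third-party `certifi` package is installed, is to point the SSL layer at certifi’s bundle before making any requests:

```python
import os

import certifi  # ships an up-to-date Mozilla CA bundle

# Tell both Python's ssl module and requests-based clients where to find CAs.
os.environ["SSL_CERT_FILE"] = certifi.where()
os.environ["REQUESTS_CA_BUNDLE"] = certifi.where()

print(os.environ["SSL_CERT_FILE"])
```

If a corporate proxy re-signs traffic, you may instead need to append its root certificate to that bundle.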

Still 4 times slower than the GPT site and 2 times slower than the Playground. Is there a way to avoid this throttling by paying more?

1 Like

First off, thanks OpenAI for this great tool. Amazing; six months ago I could not have imagined where we are now. Yes, I also have really slow response times and thought I was alone; now, reading all this, I’m wondering. I’ve wondered how other devs offer AI apps with such fast responses when some say they use GPT-4, yet their responses are much faster, so I assumed there were special contracts for faster access. Apparently not. OpenAI, giving us access only to models that respond this slowly, or that are much dumber, still feels like throttling. I’m already considering switching to a locally hosted model like Llama 2, but I wish I could stay with ChatGPT. I had a great start with it! Thanks again.

Bought $50 worth of credit to upgrade my account to Tier 2, but it’s still the same…

I was also having this problem, and worse: I noticed that OpenAI’s API would hang without killing the connection, so it would wait for minutes on end doing nothing. My solution was to thread everything, externally kill the thread after a 30-second timeout, and then retry. Here is what that looked like in Python. I had to use async because neither futures nor Python threading has an effective way to externally terminate a thread.

import asyncio

import httpx
import openai  # only openai.api_key is used here

async def async_openai_request(task_name, messages, chat_container, max_retries=5, attempt_timeout=15):
    url = "https://api.openai.com/v1/chat/completions"

    for attempt in range(max_retries):
        try:
            async with httpx.AsyncClient(timeout=attempt_timeout) as client:
                response = await client.post(
                    url,
                    json={
                        "model": "gpt-3.5-turbo",
                        "messages": messages,
                        "temperature": 0.1  # Adjust temperature as needed
                    },
                    headers={"Authorization": f"Bearer {openai.api_key}"}
                )
                response_data = response.json()
                if 'choices' in response_data and len(response_data['choices']) > 0:
                    chat_container[task_name] = response_data['choices'][0].get('message', {}).get('content', '').strip()
                else:
                    raise ValueError("Invalid response format from OpenAI API")
                print(f"{task_name}: Success on attempt {attempt + 1}")
                return
        except Exception as error:
            print(f"Attempt {attempt + 1} for {task_name} failed. Error was {error}. Retrying...")

        if attempt < max_retries - 1:
            await asyncio.sleep(2 ** attempt)

    print(f"{task_name}: Failed to get a successful response after {max_retries} attempts.")
    chat_container[task_name] = "Failed after retries"

And here’s what the call looks like; replace create_translation_prompt with your own function.

                await asyncio.gather(
                    async_openai_request("chinese_translation",
                                         create_translation_prompt(word, example, "Chinese", just_word=True),
                                         chat_container, max_retries=5, attempt_timeout=15),
                    async_openai_request("chinese_example",
                                         create_translation_prompt(word, example, "Chinese"),
                                         chat_container, max_retries=5, attempt_timeout=15),
                    async_openai_request("spanish_translation",
                                         create_translation_prompt(word, example, "Spanish"),
                                         chat_container, max_retries=5, attempt_timeout=15)
                )

I found it works quite quickly if you just terminate the thread and retry. Unless you’re asking it something pretty lengthy (in my case I’m not), 30 seconds should be more than enough for it to respond.

Output looks like this:

Processing: dekket
spanish_translation: Success on attempt 1
Attempt 1 for chinese_example failed. Error was . Retrying...
Attempt 1 for chinese_translation failed. Error was . Retrying...
chinese_translation: Success on attempt 2
Attempt 2 for chinese_example failed. Error was . Retrying...
Attempt 3 for chinese_example failed. Error was . Retrying...
Attempt 4 for chinese_example failed. Error was . Retrying...
chinese_example: Success on attempt 5
{'Chinese': '1. 餐桌已經準備好供晚餐使用。(Cānzhuō yǐjīng zhǔnbèi hǎo gōng wǎncān '
            'shǐyòng.)\n'
            '2. 地面被雪覆蓋。(Dìmiàn bèi xuě fùgài.)\n'
            '3. 毯子 (Tǎnzi)',
 'Chinese Example': '1. 餐桌已經準備好供晚餐使用。\n2. 地面被雪覆蓋。\n3. 他用一條暖和的毯子捲住自己。',
 'Spanish': '1. Usage (Table Setting):\n'
            '   - Spanish: "puesta"\n'
            '\n'
            '2. Usage (Covered):\n'
            '   - Spanish: "cubierto"\n'
            '\n'
            '3. Usage (Blanket):\n'
            '   - Spanish: "cubrió"'}

(the context was an app that updated flashcards for me)
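The watchdog idea in the post above can also be expressed without killing threads: `asyncio.wait_for` cancels the awaited task when the deadline passes, which is the clean async analogue of externally terminating a thread. A toy sketch, where the slow coroutine is a made-up stand-in for a hung request:

```python
import asyncio

async def slow_then_fast(state):
    # Stand-in for the API call: the first attempt hangs, later ones succeed.
    state["calls"] += 1
    if state["calls"] == 1:
        await asyncio.sleep(10)  # simulate a hung request
    return "done"

async def with_retries(max_retries=3, timeout=0.05):
    state = {"calls": 0}
    for attempt in range(max_retries):
        try:
            # wait_for cancels slow_then_fast when the timeout expires,
            # so a hung attempt cannot block the next retry.
            return await asyncio.wait_for(slow_then_fast(state), timeout)
        except asyncio.TimeoutError:
            continue
    return "failed"

print(asyncio.run(with_retries()))  # the hung first attempt is cancelled; prints "done"
```

This achieves the same effect as the thread-kill approach, but cancellation is cooperative and cheap, with no orphaned threads left waiting on a dead connection.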