ChatGPT API responses are very slow

Fortunately, we now have the ‘stream’ option, which lets users watch GPT generate the answer as it goes. That way, people don’t feel stuck. :smiley: It’s still slow, but it’s much better now.

How do you get the stream option running?

Defaults to false

If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. See the OpenAI Cookbook for example code.
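To make that concrete, here is a minimal sketch of consuming those data-only server-sent events with nothing but the standard library. The endpoint URL and payload follow the usual chat-completions format; `stream_chat` and `parse_sse_delta` are names I made up for this example, so adapt them to your own client:

```python
import json
import urllib.request

def parse_sse_delta(line):
    """Extract the text delta from one 'data: {...}' SSE line.
    Returns None for blank lines, the 'data: [DONE]' sentinel,
    and chunks that carry no content (e.g. the initial role-only delta)."""
    line = line.strip()
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content")

def stream_chat(api_key, prompt):
    """Print tokens as they arrive instead of waiting for the whole reply."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps({
            "model": "gpt-3.5-turbo",
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,  # ask for partial message deltas as SSE
        }).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # the response body arrives line by line
            delta = parse_sse_delta(raw.decode("utf-8"))
            if delta is not None:
                print(delta, end="", flush=True)
    print()
```

The perceived latency drops because the first token usually arrives in a second or two, even when the full completion takes much longer.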


I am also getting the same delay. It took 60 s for a response. Is this resolved?


The gpt-3.5-turbo API does not give the same responses as the ChatGPT web app.

How can I get the same result in my Python app? I want responses as long as the ones the web version gives. Can anyone guide me?

APIConnectionError: Error communicating with OpenAI: HTTPSConnectionPool(host='', port=443): Max retries exceeded with url: /v1/completions (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:992)')))

Why does this error occur?
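CERTIFICATE_VERIFY_FAILED usually means the local Python installation cannot find a root-CA bundle, not that OpenAI is down. As a sketch (not a guaranteed fix), you can check where your interpreter is looking for certificates, and, if the certifi package happens to be installed, point the process at its bundle:

```python
import os
import ssl

# Show where this Python build looks for root certificates.
# Empty or stale paths are a common cause of CERTIFICATE_VERIFY_FAILED.
paths = ssl.get_default_verify_paths()
print("cafile:", paths.cafile)
print("capath:", paths.capath)

# If certifi is available, point the ssl module at its CA bundle.
# This is an illustrative workaround, not an official OpenAI recommendation.
try:
    import certifi
    os.environ.setdefault("SSL_CERT_FILE", certifi.where())
except ImportError:
    pass  # certifi not installed; inspect the paths printed above instead
```

On macOS, running the "Install Certificates.command" script that ships with the python.org installer addresses the same problem.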

Still 4 times slower than the GPT site and 2 times slower than the Playground. Is there a way to avoid this throttling by paying more?


First off, thanks OpenAI for this great tool. Amazing; six months ago I could not have imagined where we are now.

Yes, I also have this really slow response time and thought I was alone. Now, reading all this, I am wondering: is it a different model, and why can't we get the latest one at the same speed? I had already wondered how other devs offer their AI apps despite such slow responses; some say they use GPT-4, and their responses are much faster, so I assumed there were special contracts for faster access. Since that does not seem to be the case: please, OpenAI, it still feels like throttling to give us access only to models that respond this slowly, or that are much dumber. You know what I mean. I am already considering switching to a locally hosted model like Llama 2, though I wish I could stick with ChatGPT. I had a great start with it. Thanks again!

Bought $50 worth of credit to upgrade my account to Tier 2, but it's still the same…

I was also having this problem, and worse: I noticed that OpenAI's API would sometimes hang without killing the connection, so it would wait for minutes on end doing nothing. My solution was to run everything concurrently and externally kill each request after a 30-second timeout, then retry. Here is what that looked like in Python. I had to use async because neither futures nor Python threading offers an effective way to terminate a thread externally.

import asyncio
import httpx
import openai

async def async_openai_request(task_name, messages, chat_container, max_retries=5, attempt_timeout=15):
    url = "https://api.openai.com/v1/chat/completions"  # chat completions endpoint

    for attempt in range(max_retries):
        try:
            # A fresh client per attempt; its timeout kills hung requests.
            async with httpx.AsyncClient(timeout=attempt_timeout) as client:
                response = await client.post(
                    url,
                    json={
                        "model": "gpt-3.5-turbo",
                        "messages": messages,
                        "temperature": 0.1  # Adjust temperature as needed
                    },
                    headers={"Authorization": f"Bearer {openai.api_key}"}
                )
                response_data = response.json()
                if 'choices' in response_data and len(response_data['choices']) > 0:
                    chat_container[task_name] = response_data['choices'][0].get('message', {}).get('content', '').strip()
                    print(f"{task_name}: Success on attempt {attempt + 1}")
                    return
                raise ValueError("Invalid response format from OpenAI API")
        except Exception as error:
            print(f"Attempt {attempt + 1} for {task_name} failed. Error was {error}. Retrying...")

        if attempt < max_retries - 1:
            await asyncio.sleep(2 ** attempt)  # exponential backoff between attempts

    print(f"{task_name}: Failed to get a successful response after {max_retries} attempts.")
    chat_container[task_name] = "Failed after retries"

And here's what the call looks like. Replace create_translation_prompt with your own prompt-building function.

    await asyncio.gather(
        async_openai_request("chinese_translation",
                             create_translation_prompt(word, example, "Chinese", just_word=True),
                             chat_container, max_retries=5, attempt_timeout=15),
        async_openai_request("chinese_example",
                             create_translation_prompt(word, example, "Chinese"),
                             chat_container, max_retries=5, attempt_timeout=15),
        async_openai_request("spanish_translation",
                             create_translation_prompt(word, example, "Spanish"),
                             chat_container, max_retries=5, attempt_timeout=15))

I found it works quite quickly if you just kill the attempt and retry. Unless you're asking it something pretty lengthy (in my case I'm not), 30 seconds should be more than enough for it to respond.
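The kill-and-retry pattern above can also be written generically with asyncio.wait_for, without tying it to httpx. A sketch under my own naming (`retry_with_timeout` and `make_coro` are invented for illustration; `make_coro` is any zero-argument callable that returns a fresh coroutine, e.g. `lambda: client.post(...)`):

```python
import asyncio

async def retry_with_timeout(make_coro, max_retries=5, attempt_timeout=30, base_delay=1):
    """Run make_coro() up to max_retries times, cancelling any attempt
    that hangs past attempt_timeout seconds, with exponential backoff."""
    for attempt in range(max_retries):
        try:
            # wait_for cancels the in-flight coroutine when the timeout expires,
            # so a hung request cannot block the loop for minutes.
            return await asyncio.wait_for(make_coro(), timeout=attempt_timeout)
        except Exception as error:
            print(f"Attempt {attempt + 1} failed: {error!r}. Retrying...")
        if attempt < max_retries - 1:
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"No response after {max_retries} attempts")
```

The callable must build a new coroutine on every call, because a coroutine object can only be awaited once; that is why the helper takes a factory rather than a coroutine.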

Output looks like this:

Processing: dekket
spanish_translation: Success on attempt 1
Attempt 1 for chinese_example failed. Error was . Retrying...
Attempt 1 for chinese_translation failed. Error was . Retrying...
chinese_translation: Success on attempt 2
Attempt 2 for chinese_example failed. Error was . Retrying...
Attempt 3 for chinese_example failed. Error was . Retrying...
Attempt 4 for chinese_example failed. Error was . Retrying...
chinese_example: Success on attempt 5
{'Chinese': '1. 餐桌已經準備好供晚餐使用。(Cānzhuō yǐjīng zhǔnbèi hǎo gōng wǎncān '
            '2. 地面被雪覆蓋。(Dìmiàn bèi xuě fùgài.)\n'
            '3. 毯子 (Tǎnzi)',
 'Chinese Example': '1. 餐桌已經準備好供晚餐使用。\n2. 地面被雪覆蓋。\n3. 他用一條暖和的毯子捲住自己。',
 'Spanish': '1. Usage (Table Setting):\n'
            '   - Spanish: "puesta"\n'
            '2. Usage (Covered):\n'
            '   - Spanish: "cubierto"\n'
            '3. Usage (Blanket):\n'
            '   - Spanish: "cubrió"'}

(the context was an app that updated flashcards for me)