API calls to davinci text 3 very slow and random speeds for identical prompts

Any progress on this? Same problems here (France).

I agree, it's a huge problem. It's very slow, and sometimes it returns random replies that have nothing to do with the topic.

I’m experiencing the same problems (Michigan, USA).

Edit: I found a solution to the problem, see below.

Same problem here: response times vary randomly from 60 seconds or more up to a maximum of 10 minutes.

OK. As the title of this topic is "davinci text 3", I assume everyone here is on topic and discussing text-davinci-003.

Just got a completion to work with 600 tokens in 4.6 secs, but it is up and down, up and down,…

I found a hackish solution in Python. Although this thread is davinci-specific in its title, the underlying issue is not model-specific, so I think it's on-topic to use the gpt-3.5 model here. I use the multiprocessing library to launch the ChatGPT request in a separate process, limit how long that process may run, kill it if it takes too long, and then send a second request.

I tested this using the demo code below. In my test, the time-limited version (10 tries of 10 seconds max each) had a total runtime of 28 seconds, while the time-extended version (1 try of up to 100 seconds) had a total runtime of 127 seconds, almost entirely due to a single 95-second delay on one of the attempts. By killing these occasional extremely long requests and resending, I think we can avoid the worst of the problem for now.

Here is my demo code. Feel free to insert into your own projects.

Edit: a user reports that a 15-second wait before retrying gives much better real-world performance. The code has been updated to reflect that change, though it may not make a difference in this test code.

import multiprocessing, time, timeit, openai

openai_key = ""
openai.api_key = openai_key


def sendChatGPTRequest(content, bot_model, queue):
    # Runs in a child process; puts the completion text on the queue.
    response = openai.ChatCompletion.create(
        model=bot_model,
        messages=[{"role": "user", "content": content}],
        max_tokens=1024,
        n=1,
        temperature=0.5,
    )
    queue.put(response["choices"][0]["message"]["content"])

def limitedWait(s, queue):
    # Poll for up to s seconds; sleep briefly so we don't burn a CPU core.
    start = timeit.default_timer()
    while timeit.default_timer() - start < s and queue.empty():
        time.sleep(0.1)
    return not queue.empty()

def getChatbotResponse(content, bot_model, max_tries, wait_time):
    start_request = timeit.default_timer()
    for i in range(max_tries):
        queue = multiprocessing.Queue()
        p = multiprocessing.Process(target=sendChatGPTRequest, args=(content, bot_model, queue))
        p.start()

        if limitedWait(wait_time, queue):
            return (queue.get(), timeit.default_timer() - start_request)
        else:
            # Timed out: kill the stalled request and try again.
            print("Trying again...")
            p.terminate()
            p.join()
    return (None, max_tries * wait_time)

if __name__ == '__main__':
    total_time = 0
    for i in range(20):
        outcome = getChatbotResponse("Please say 'Hello, World!'", "gpt-3.5-turbo-0301", 10, 15)
        print(outcome[0], "\nTime to generate:", outcome[1])
        total_time += outcome[1]
    print("total time:", total_time)
    total_time = 0
    for i in range(20):
        outcome = getChatbotResponse("Please say 'Hello, World!'", "gpt-3.5-turbo-0301", 1, 150)
        print(outcome[0], "\nTime to generate:", outcome[1])
        total_time += outcome[1]
    print("total time:", total_time)
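As a side note, the polling loop in limitedWait can be replaced by a blocking `queue.get(timeout=...)`, which sleeps until a result arrives or the timeout expires. Here is a minimal, self-contained sketch of that variant; `slow_task` is a hypothetical stand-in for the real API call so the example runs without a key:

```python
import multiprocessing
import time
from queue import Empty


def slow_task(delay, queue):
    # Stand-in for the real API call: sleep, then report a result.
    time.sleep(delay)
    queue.put("done")


def run_with_timeout(delay, wait_time, max_tries):
    """Run slow_task in a child process, killing and retrying it if it
    takes longer than wait_time seconds. Returns the result or None."""
    for _ in range(max_tries):
        queue = multiprocessing.Queue()
        p = multiprocessing.Process(target=slow_task, args=(delay, queue))
        p.start()
        try:
            # Block until a result arrives or the timeout expires --
            # no busy-waiting needed.
            result = queue.get(timeout=wait_time)
            p.join()
            return result
        except Empty:
            # Timed out: kill the stalled process and retry.
            p.terminate()
            p.join()
    return None


if __name__ == "__main__":
    print(run_with_timeout(0.1, 1.0, 3))  # fast task finishes: "done"
    print(run_with_timeout(5.0, 0.5, 2))  # slow task is killed twice: None
```

The same pattern drops into getChatbotResponse above by swapping slow_task for sendChatGPTRequest.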

I am having the same issue. Everything seems to have gotten much worse over the last two weeks. I am passing a prompt of 3-4k tokens to davinci, it takes more than 20-30 seconds to respond, and more than half of the API calls result in errors.