I find the OpenAI API works really slowly in my project, so I wrote the following small demo to measure how long the API takes to respond:
import openai
import time
import os


class Chatbot:
    def __init__(self):
        self.question = "Say hello to me."
        self.client = openai.OpenAI(
            api_key=os.environ.get("OPENAI_API_KEY"),
            base_url=os.environ.get("OPENAI_BASE_URL")
        )

    def run(self):
        while True:
            try:
                start_time = time.time()
                completion = self.client.chat.completions.create(
                    model="deepseek-chat",
                    messages=[
                        {"role": "system", "content": "You are a helpful assistant."},
                        {"role": "user", "content": self.question}
                    ]
                )
                end_time = time.time()
                print('Time used:', end_time - start_time)
                response = completion.choices[0].message.content
                print(response)
                break
            except openai.RateLimitError:
                print('Rate limit exceeded. Waiting for 2 seconds...')
                time.sleep(2)
                continue
            except Exception as e:
                print(e)
                break


if __name__ == "__main__":
    chatbot = Chatbot()
    chatbot.run()
It turns out that even for the question “Say hello to me”, it takes over 30 seconds for the API to respond. With the much longer prompt used in my project, the API hasn’t responded for over an hour; I suspect it may never respond at all. But a week ago, everything worked fine.
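To narrow down where those 30+ seconds go, one thing worth trying is a streaming request: timing the first chunk separates queueing/connection delay from generation time. Below is a small sketch I put together for this (the `measure_stream` helper is mine, not part of the SDK; it assumes the same `OPENAI_API_KEY`/`OPENAI_BASE_URL` environment variables and the `deepseek-chat` model as above):

```python
import os
import time


def measure_stream(chunks):
    """Iterate over a stream of chunks and return (time_to_first_chunk, total_time)."""
    start = time.time()
    first = None
    for _ in chunks:
        if first is None:
            first = time.time()  # first chunk arrived
    end = time.time()
    ttft = (first - start) if first is not None else float("nan")
    return ttft, end - start


# Only hit the API if credentials are configured.
if os.environ.get("OPENAI_API_KEY"):
    import openai

    client = openai.OpenAI(
        api_key=os.environ["OPENAI_API_KEY"],
        base_url=os.environ.get("OPENAI_BASE_URL"),
    )
    stream = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": "Say hello to me."}],
        stream=True,
    )
    ttft, total = measure_stream(stream)
    print("Time to first token:", ttft)
    print("Total time:", total)
```

If time-to-first-token is already huge, the delay is on the server side before generation even starts; if the first token arrives quickly but the total is long, the server is generating slowly.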
I’m wondering why this happens. Has anyone run into this before? By the way, I have tried different base URLs, different API keys, and different models (gpt-4o, grok, deepseek-chat), and all of them respond very slowly. In my understanding, different URLs correspond to different servers, which means different service speeds, so why have they all become really slow recently?
And I suspect it has nothing to do with my computer or network either, because I’ve deployed the code on other cloud servers and they are just as slow as my own machine.
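To double-check that the network path itself isn’t the bottleneck, I also timed a bare TCP connection to the API host. This is just a quick sketch (`tcp_connect_time` is a helper I wrote for this test, and it only runs against whatever `OPENAI_BASE_URL` you have set):

```python
import os
import socket
import time
from urllib.parse import urlparse


def tcp_connect_time(url, timeout=5.0):
    """Time a bare TCP connection to the host in `url` (seconds)."""
    parsed = urlparse(url)
    host = parsed.hostname
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    start = time.time()
    sock = socket.create_connection((host, port), timeout=timeout)
    elapsed = time.time() - start
    sock.close()
    return elapsed


if __name__ == "__main__" and os.environ.get("OPENAI_BASE_URL"):
    base = os.environ["OPENAI_BASE_URL"]
    print("TCP connect time:", tcp_connect_time(base))
```

On my machine the connection itself is established almost instantly, which again points at the server side rather than my network.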