GPT-4-1106-preview has become slow

I’ve been using some very simple LangChain code to build a chatbot powered by gpt-4-turbo.

Until December 4th, the speed was consistently fast, but over the past two days it has suddenly become very slow (dropping from roughly 30 tokens/s to about 10 tokens/s).
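
For reference, this is roughly how I estimate throughput; it’s a minimal sketch that assumes the openai Python client (v1.x) and counts streamed content deltas as a rough proxy for tokens, so the numbers are approximate:

import time
from openai import OpenAI

client = OpenAI()
start = time.time()
tokens = 0
stream = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Write a short paragraph about cats."}],
    stream=True,
)
for chunk in stream:
    # each non-empty content delta is roughly one token
    if chunk.choices and chunk.choices[0].delta.content:
        tokens += 1
print(f"~{tokens / (time.time() - start):.1f} tokens/s")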

I haven’t made any changes to the code.

Additionally, I have been recording the model’s system_fingerprint (a sketch of the logging code follows the list below), and it seems that OpenAI began updating the model around December 1st:

  • 2023-12-01 14:35:15,002 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
  • 2023-12-01 14:39:27,898 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
  • 2023-12-01 14:42:57,834 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
  • 2023-12-04 20:00:22,212 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_2eb0b038f6
  • 2023-12-04 20:56:04,723 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
  • 2023-12-04 20:59:42,837 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
  • 2023-12-05 09:48:43,225 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_d2455ee9e0
  • 2023-12-05 10:10:41,939 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
  • 2023-12-05 11:12:09,103 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
  • 2023-12-05 11:34:53,838 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
  • 2023-12-05 11:43:53,015 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
  • 2023-12-05 12:17:15,388 - openai_fp_log - model: gpt-4-1106-preview, fingerprint: fp_a24b4d720c
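
The log lines above can be produced with something like the following; this is a minimal sketch using the plain openai Python client (v1.x), and the prompt is just a placeholder:

import logging
from openai import OpenAI

logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(name)s - %(message)s")
logger = logging.getLogger("openai_fp_log")

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Hello"}],
)
# system_fingerprint identifies the backend configuration that served the request
logger.info("model: %s, fingerprint: %s", response.model, response.system_fingerprint)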

The code I’m using is as follows:

from langchain.llms import OpenAI
from langchain.chains import ConversationChain

# _self.openai_model holds the model name ("gpt-4-1106-preview"); memory is defined elsewhere
llm = OpenAI(model_name=_self.openai_model, temperature=0, streaming=True)
chain = ConversationChain(llm=llm, memory=memory, verbose=True)
chain.run("xxxxxx")
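
For completeness, since gpt-4-1106-preview is a chat model, an equivalent sketch with LangChain’s chat-model wrapper would look something like this (ConversationBufferMemory here just stands in for whatever memory object is actually used):

from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = ChatOpenAI(model_name="gpt-4-1106-preview", temperature=0, streaming=True)
memory = ConversationBufferMemory()
chain = ConversationChain(llm=llm, memory=memory, verbose=True)
print(chain.run("Hello, how are you?"))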

Additionally, I’ve noticed that when I input the same question, gpt-4-turbo’s responses in English now come back more than twice as fast as those in Chinese, even though the response length and semantic content are similar.
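
One way to sanity-check part of that gap is to compare token counts, since Chinese text typically encodes to more tokens per character than English; this is just an illustrative check with tiktoken, and the example strings are arbitrary:

import tiktoken

# cl100k_base is the tokenizer used by the gpt-4 family
enc = tiktoken.get_encoding("cl100k_base")
english = "The weather is nice today and I plan to go for a walk in the park."
chinese = "今天天气很好，我打算去公园散步。"
print(len(enc.encode(english)), "tokens /", len(english), "English characters")
print(len(enc.encode(chinese)), "tokens /", len(chinese), "Chinese characters")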

Has the speed of the gpt-4-turbo model really decreased? Or is my code simply not adapted to the new version of the model? Will the speed improve in the future release version?


I’ve also encountered the same issue as you. The gpt-4-1106-preview model has been very slow over the past 1-2 days.


Yes, responses from this model are very slow. When we switch to the GPT-4 model, the response is fast, but we cannot retrieve from uploaded files, so it’s not very useful for us.

I’ve also noticed a slowdown. Originally turbo was about three times faster, but now the two are approximately the same speed.

As of today (December 25, 2023), performance has largely recovered to its previous level, which suggests that OpenAI has been actively refining the model behind the scenes.

Hello everyone, how’s the speed now? Has it improved or is it still slow? I’m considering switching to GPT-4-1106-preview, but I’m unsure if the speed issue persists in that version.