The GPT-4 API is too slow when you have to work within a 46-second timeout

The only solution OpenAI offers for token generation speed is to move the customer-facing AI to gpt-3.5-turbo.

You can see the recent improvement in GPT-4's completion time for 250 tokens (top blue line), which lines up almost exactly with the load reduction reflected in the "GPT-4 no longer making long outputs" complaints.

[chart: GPT-4 completion time for 250 tokens]
