ChatGPT 3.5 Turbo vs. ChatGPT 4 - API Response Speed

Hi everyone, does anyone have comparison data on API response speeds for ChatGPT 3.5 Turbo vs. ChatGPT 4? I am trialling ChatGPT 4 on our website gifthuntr.com, but had to revert to ChatGPT 3.5 because the response times seemed slower. However, I am struggling to measure them accurately. If anyone has comparison data, that would be great!

Hi there, just checking in on this. The GPT-4 model seems a bit faster since I posted this, and I am having fewer timeouts, but it's still not fast enough to roll out on our new site moodplaylist.com.

Under the same conditions, I understand that gpt-3.5-turbo is faster (it returns outputs with lower latency).

In my personal experience across different projects: 0.5-1 minute for gpt-3.5-turbo, 1-2 minutes for GPT-4.

You can also always write a script to measure response times against your specific needs.
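
For a rough comparison, here is a minimal timing sketch, assuming the pre-1.0 `openai` Python package (newer client versions differ); the prompt and run count are arbitrary placeholders, so substitute something representative of your real traffic:

```python
import time
import openai  # assumes the pre-1.0 openai package; newer clients differ

openai.api_key = "sk-..."  # your API key

# Placeholder prompt; use one representative of your real requests.
PROMPT = [{"role": "user", "content": "Write a two-sentence product description."}]

def average_latency(model: str, runs: int = 5) -> float:
    """Average wall-clock time for a full (non-streamed) completion."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        openai.ChatCompletion.create(model=model, messages=PROMPT)
        total += time.perf_counter() - start
    return total / runs

for model in ("gpt-3.5-turbo", "gpt-4"):
    print(f"{model}: {average_latency(model):.1f}s average")
```

Averaging over several runs matters here, since latency on both models varies a lot from request to request.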

These numbers are similar to what I experienced. Anything past 30 seconds is unusable for a website; visitors expect results instantaneously nowadays. OpenAI need to sort their API out. It doesn't make sense that the free version of ChatGPT 3.5 Turbo is faster than the API version when we are paying.

Users need feedback.

Give them an animation and a "thinking…" status, or another status message such as "the AI was busy, let me try again" or "that's a complex problem, calculating" when a function call comes back.

Bing has this problem: it gives no notice that output from a feature such as image generation is on the way.

Streaming output, which shows responses as the AI forms them, is of course a nice thing to be able to offer.
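
For anyone weighing this up, a minimal streaming sketch, again assuming the pre-1.0 `openai` Python package (the prompt is a placeholder):

```python
import openai  # assumes the pre-1.0 openai package

openai.api_key = "sk-..."  # your API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello."}],  # placeholder prompt
    stream=True,  # server sends chunks as tokens are generated
)
for chunk in response:
    # each chunk carries a small delta of the assistant's message
    delta = chunk["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)
print()
```

The first chunk typically arrives much sooner than a full completion would, which is what makes the wait feel shorter to users.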

Yes, I do this on moodplaylist.com, but users are still pretty unforgiving.

Streaming the output means a lot of added complexity.

How about "The AI assistant is typing…" (animated ellipsis dots)?

Users have already seen, and been patient with, that type of feedback; it's even used on this forum.

Or an alternative case: I spend 15 minutes in Notepad writing the specs for my function, reviewing it for completeness and elaborating on anything the AI might miss, then paste it in. Blam! An instant answer scrolls by at 30 tokens a second. It didn't seem like the AI had to work too hard…