Why is response time higher with GPT-4o?

Hi there,

I tried both GPT-3.5-turbo and GPT-4o, and responses come back faster from 3.5-turbo than from 4o.
Is this because of the model weights, or something else?

GPT-3.5-turbo should be the fastest, as it is a lightweight model with fewer computational requirements.

I understand there are various evaluations of GPT-4o, but since it is a more advanced model built on the GPT-4 architecture, it is generally slower than GPT-3.5-turbo.
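If you want to compare the two models empirically rather than reason from their sizes, you can time the calls yourself. Below is a minimal sketch of a generic timing helper; the commented-out usage assumes the OpenAI Python SDK's `client.chat.completions.create` method and the model names from this thread.

```python
import time

def time_call(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Hypothetical usage against the OpenAI API (client setup omitted):
# for model in ("gpt-3.5-turbo", "gpt-4o"):
#     _, t = time_call(
#         client.chat.completions.create,
#         model=model,
#         messages=[{"role": "user", "content": "Hello"}],
#     )
#     print(f"{model}: {t:.2f}s")
```

Note that a single request is noisy; averaging over several calls (and comparing similar prompt and completion lengths) gives a fairer latency comparison.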