If your account has been targeted for the slowdown, there’s not much you can do.
One theory is that, with a particular account, users in Europe may be routed to a different datacenter with different performance characteristics. Nobody seems to have reported whether the slowdown persists when making the same API calls from a US-hosted server.
You can check whether you instead get fast performance by fine-tuning a very basic model (10 benign questions, 1 epoch) and then using that fine-tuned model. Also, gpt-3.5-turbo-instruct may be fast for you if you can adapt your application to that completion-style model.
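To see whether the instruct model (or a fine-tuned model) is actually faster on your account, a quick side-by-side latency check along these lines can help. This is a minimal sketch assuming the current `openai` Python SDK (v1+) and an `OPENAI_API_KEY` in your environment; the model names are examples, and you'd substitute your own fine-tuned model ID where noted.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

if __name__ == "__main__":
    # Requires the `openai` package and OPENAI_API_KEY set in the environment.
    from openai import OpenAI

    client = OpenAI()
    prompt = "Reply with the single word: hello"

    # The chat model reported as slow. Swap in your fine-tuned model ID
    # (e.g. "ft:gpt-3.5-turbo:...") here to test that variant instead.
    _, chat_s = timed(
        client.chat.completions.create,
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=16,
    )

    # The instruct model, which uses the completions endpoint.
    _, inst_s = timed(
        client.completions.create,
        model="gpt-3.5-turbo-instruct",
        prompt=prompt,
        max_tokens=16,
    )

    print(f"gpt-3.5-turbo:          {chat_s:.2f}s")
    print(f"gpt-3.5-turbo-instruct: {inst_s:.2f}s")
```

Run it a few times and at different hours before drawing conclusions, since single-request latency is noisy.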