Unfortunately, the API responses are often quite slow. Unless you can find ways to shorten your expected responses, I’m not sure there’s much to do in order to reliably improve response times. In my experience, requesting outputs as short as possible has been the most effective response, but that’s harder to do for an open-ended task like this (my use case basically involved extracting a few variables from a natural language prompt, so I was able to request very short responses).
You can also request streaming responses so the user will see the text appear word by word, which some may find a better experience than waiting for the entire response: