Hey, this is somewhat to be expected. The GPT-4 series models will always be slower than the 3.5T series models. Which model were you using before? If I'm understanding you right, you're saying the token generation time went from 1 min to 5 min?