I am using the Completion API with GPT-4o mini and have noticed that response times are inconsistent:
- Sometimes responses come back very quickly.
- Other times, they are noticeably slow.
With GPT-4o, response times are consistently stable, so the issue appears to be specific to GPT-4o mini.
Could you please investigate why GPT-4o mini's response times vary and suggest any optimizations or fixes?
Thank you for your assistance!
Environment Details:
- API: OpenAI Completion API
- Model: GPT-4o mini
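
To make the variation easier to reproduce, here is a minimal sketch of how the latency could be measured. It is an illustration and not the exact script behind the observations above. The helper names `measure_latency` and `summarize` are hypothetical, and `fake_request` is a stand-in that only sleeps; the real API call would need to be swapped in.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(call: Callable[[], object], runs: int = 10) -> List[float]:
    """Invoke `call` `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Return min, median, max, and sample standard deviation of the durations."""
    ordered = sorted(durations)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
        # stdev needs at least two samples; report 0.0 for a single run.
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical stand-in for the real request. With the official `openai`
    # Python package this would be something like (assumption, not tested here):
    #   client.chat.completions.create(model="gpt-4o-mini", messages=[...])
    def fake_request() -> None:
        time.sleep(0.01)

    print(summarize(measure_latency(fake_request, runs=5)))
```

Running the same prompt against both models and comparing the `max` and `stdev` values should show whether the spread is really larger for one of them.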