I’m looking for ideas/feedback on how to improve the response time with GPT-Vision. Does anyone know how any of the following impact response times:
- System message length (e.g. 2 sentences vs 4 paragraphs)
- Image size
- [Low or high fidelity image understanding](https://platform.openai.com/docs/guides/vision/low-or-high-fidelity-image-understanding) via the `detail` parameter
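For concreteness, here's a minimal sketch of a request payload that exercises two of these levers (a short system message and `detail: "low"`). The model name, image URL, and prompt are placeholders, not anything from our actual setup:

```python
def build_vision_request(image_url: str, prompt: str) -> dict:
    """Build a chat-completions payload with low-fidelity image input."""
    return {
        "model": "gpt-4-vision-preview",  # placeholder model name
        "max_tokens": 200,  # capping output length also bounds latency
        "messages": [
            # Keep the system message short -- a couple of sentences
            # rather than several paragraphs.
            {"role": "system", "content": "You are a concise image analyst."},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        # "low" processes the image at reduced resolution,
                        # which consumes fewer input tokens than "high".
                        "image_url": {"url": image_url, "detail": "low"},
                    },
                ],
            },
        ],
    }

payload = build_vision_request("https://example.com/photo.jpg", "What is shown?")
```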
Are there any other considerations/learnings for faster response time?
For context, in our UI the response can take anywhere from 5-15+ seconds. We have a long system message (~4-5 paragraphs), a relatively small image size, and the `detail` parameter set to `low`. The response token length varies but is usually somewhere between 50-200.