GPT-Vision Performance Improvements

I’m looking for ideas/feedback on how to improve response time with GPT-Vision. Does anyone know how any of the following impact response times:

  1. System message length (e.g. 2 sentences vs 4 paragraphs)
  2. Image size
  3. Low- or high-fidelity image understanding (via the detail parameter)
Are there any other considerations/learnings for faster response time?

For context, in our UI the response can take anywhere from 5 to 15+ seconds. We have a long system message (~4-5 paragraphs), a relatively small image, and the detail parameter set to low. The response length varies but is usually somewhere between 50 and 200 tokens.
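One way to sanity-check the image's contribution is to estimate its token cost. The sketch below assumes the documented pricing rules (low detail is a flat 85 tokens; high detail scales the image to fit 2048×2048, scales the shortest side toward 768px, then charges 170 tokens per 512px tile plus an 85-token base):

```python
import math

def image_tokens(width: int, height: int, detail: str = "low") -> int:
    """Rough image token estimate, assuming the published image-cost formula."""
    if detail == "low":
        return 85  # flat cost regardless of image size
    # High detail: first scale to fit within a 2048 x 2048 square.
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    # Then scale so the shortest side is 768px (no upscaling assumed here).
    scale = min(1.0, 768 / min(width, height))
    width, height = width * scale, height * scale
    # 170 tokens per 512px tile, plus an 85-token base.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 170 * tiles + 85

print(image_tokens(1024, 1024, "low"))   # 85
print(image_tokens(1024, 1024, "high"))  # 765
```

If detail is already low, the image contributes a fixed ~85 tokens, so the 4-5 paragraph system message and the output length are more likely where the time is going.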


The bigger the prompt, the longer it will take in general.

You might test both, but I don’t see you shaving more than a couple of seconds. Latency also depends on network traffic on OpenAI’s end and other factors.

All that said, I imagine the tech will become faster as time goes on. We’re a long way from GPT-2 just a few years ago!