We don’t know how ChatGPT’s backend preprocesses images for computer vision.
However, we do know how the API works: an image over 512 pixels in any dimension is split into tiles, and the model then reads the main (overview) tile and processes each of the subtiles.
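As a rough illustration of that tile math (this is just my reading of the behaviour described above, not OpenAI’s published algorithm), here is how you’d count the subtiles an image would produce:

```python
import math

def tile_grid(width: int, height: int, tile: int = 512):
    # Sketch only: assumes the image is cut into a simple grid of 512-px tiles.
    cols = math.ceil(width / tile)
    rows = math.ceil(height / tile)
    return cols, rows, cols * rows  # columns, rows, total subtiles

# e.g. a 768 x 1024 page render -> 2 x 2 = 4 subtiles, plus the main overview tile
print(tile_grid(768, 1024))
```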
Here is an example: a high-quality PDF-to-image rendering produced with Adobe tools, at the maximum size the API will allow (only 768px wide), with the API tile size marked in red (although the actual tiles may be divided more evenly).
That tiling may add to the confusion, along with the ultimately low resolution. Using GPT-4-vision for OCR is a poor use of the AI on a nearly-solved problem.
Techniques:
- try at a maximum of 512 pixels in either dimension to avoid tiling (see the first sketch after this list)
- try with slices, cutting a page into smaller strips of text (see the second sketch after this list).
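A minimal Pillow sketch of the first technique, downscaling so neither dimension exceeds 512px (the file names here are just placeholders):

```python
from PIL import Image

def fit_under_512(path: str, out_path: str) -> None:
    # Shrink in place so neither side exceeds 512 px; per the tiling behaviour
    # described above, this should keep the API from splitting the image.
    img = Image.open(path)
    img.thumbnail((512, 512), Image.LANCZOS)  # preserves aspect ratio, only shrinks
    img.save(out_path)

fit_under_512("page.png", "page_512.png")
```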
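And a similar sketch of the slicing approach, cutting a tall page render into horizontal strips that are each sent as their own image (the 512px strip height is just a starting point to experiment with):

```python
from PIL import Image

def slice_page(path: str, slice_height: int = 512):
    # Cut the page into horizontal strips; the last strip may be shorter.
    img = Image.open(path)
    w, h = img.size
    return [img.crop((0, top, w, min(top + slice_height, h)))
            for top in range(0, h, slice_height)]

for i, strip in enumerate(slice_page("page.png")):
    strip.save(f"page_strip_{i}.png")
```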