Estimation of measurements in images or PDFs

_j · December 28, 2023, 3:53pm

Obtaining dimensions and bounding boxes from AI vision is a skill called grounding.

You can, for example, see how Azure can augment gpt-4-vision with their own vision products.

Other AI vision products, such as MiniGPT-v2 (a Hugging Face Space by Vision-CAIR), can demonstrate grounding and identification.

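Bounding boxes and pixel dimensions like these are needed as a basis for measurement: pixel coordinates only become real-world sizes once you also know the scale of the image, for example from an object of known size in the same picture.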
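To illustrate why grounding output matters, here is a minimal sketch of turning bounding boxes into a size estimate. It assumes you already have pixel bounding boxes from some grounding-capable model, that one box belongs to a reference object of known width, and that both objects lie in roughly the same plane facing the camera (no perspective correction). The `BoundingBox` class and the function names are hypothetical, made up for this example, and are not part of any vision API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixel coordinates (hypothetical structure)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def width(self) -> float:
        return self.x_max - self.x_min

    @property
    def height(self) -> float:
        return self.y_max - self.y_min


def units_per_pixel(reference: BoundingBox, known_width: float) -> float:
    """Scale factor derived from a reference object of known real width."""
    if reference.width <= 0:
        raise ValueError("reference box must have a positive width")
    return known_width / reference.width


def estimate_size(target: BoundingBox, scale: float) -> tuple[float, float]:
    """Estimated (width, height) of the target in real-world units."""
    return target.width * scale, target.height * scale


# Assumed example values: a reference object 50 mm wide spans 100 px.
reference_box = BoundingBox(10, 10, 110, 60)
target_box = BoundingBox(200, 40, 440, 160)

scale = units_per_pixel(reference_box, known_width=50.0)  # mm per pixel
width_mm, height_mm = estimate_size(target_box, scale)
print(f"Estimated size: {width_mm:.1f} mm x {height_mm:.1f} mm")
```

The result is only as good as the boxes and the reference: any error in the detected coordinates, or any tilt between the object and the camera, carries straight through to the estimate.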

gpt-4-vision alone might give you a description and be coaxed into extrapolating sizes from it, but the numbers are unlikely to be reliable.
