GPT-4o Graph Extraction Issue

I use the GPT-4o model to extract questions and handwritten answers from student assignments.
For the text part, the GPT-4o API performs very well, with high accuracy. However, some problems, especially in geometry and analytic geometry, include function graphs.
I haven’t found a good way to ask GPT-4o to extract these graphs. Does anyone have any suggestions?
The purpose is to separate the question from the answer, so the AI can answer the question again without interference from the original answer. But some questions cannot be answered without their diagrams. Can anyone help me achieve this?

Hi!

We don’t do geometry homework problems, but we extract a lot of geometry for other purposes. We use a lot of Sobel trickery to characterize lines, and then let the model determine which lines are which before running computations on them.

It’s not the easiest way, but it’s fairly robust.

Here’s a rough pipeline for lines:

  1. Run Sobel.
  2. Plot the phase (theta) histogram and annotate the peaks (we do a bit of fine-tuning here by hitting either the image or the histogram with a slight Gaussian, I forget which).
  3. Ask the model to pick the relevant phase peaks.
  4. Compute the lines from the pixels belonging to the selected phase peaks (we have a lot of parallel lines, so we rotate the image so the lines are horizontal and then ask the model whether all lines are pertinent, or whether stuff is missing).
  5. Do your calculations.
  6. Plot your calculation results in some easy-to-read plot and ask the model whether the output matches expectations.
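
For anyone who wants to try steps 1 and 2, here is a minimal NumPy-only sketch, not our production code. The function names (`sobel_gradients`, `phase_histogram`, `top_peaks`) are made up for illustration, the Gaussian is applied to the histogram (one of the two options mentioned above), and angles are folded into 0 to 180 degrees on the assumption that edge polarity doesn't matter for lines.

```python
import numpy as np

def sobel_gradients(img):
    """3x3 Sobel responses (gx, gy) over the valid interior of a 2-D float array."""
    kx = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    # Accumulate the nine shifted, weighted copies of the image.
    for dy in range(3):
        for dx in range(3):
            patch = img[dy:dy + h - 2, dx:dx + w - 2]
            gx += kx[dy, dx] * patch
            gy += ky[dy, dx] * patch
    return gx, gy

def phase_histogram(img, bins=180, sigma_bins=2.0):
    """Magnitude-weighted histogram of gradient angle, smoothed circularly."""
    gx, gy = sobel_gradients(np.asarray(img, dtype=float))
    mag = np.hypot(gx, gy)
    # Gradient angle folded to [0, 180); it is perpendicular to the line itself.
    theta = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist, edges = np.histogram(theta, bins=bins, range=(0.0, 180.0), weights=mag)
    # Slight Gaussian on the histogram, wrapping around at 0/180 degrees.
    radius = max(1, int(3 * sigma_bins))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_bins) ** 2)
    kernel /= kernel.sum()
    padded = np.concatenate([hist[-radius:], hist, hist[:radius]])
    smooth = np.convolve(padded, kernel, mode="valid")
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, smooth

def top_peaks(centers, smooth, k=3, min_frac=0.1):
    """Up to k local maxima as (angle_deg, weight), strongest first."""
    left, right = np.roll(smooth, 1), np.roll(smooth, -1)
    is_peak = (smooth > left) & (smooth >= right) & (smooth >= min_frac * smooth.max())
    idx = np.flatnonzero(is_peak)
    idx = idx[np.argsort(smooth[idx])[::-1][:k]]
    return [(float(centers[i]), float(smooth[i])) for i in idx]
```

The peak list (or a plot of `smooth` with the peaks annotated) is what would go to the model in step 3. A line's own orientation is the peak angle minus 90 degrees.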

We currently don’t need circles, but I’d go for Hough circles if we did, although there are probably challenges with that too if you have concentric stuff. But the general workflow is to use the model to sort through computer vision suggestions to determine what fits best and makes logical sense.
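
To give an idea of what the circle variant could look like, here is a toy Hough circle vote in plain NumPy. This is a sketch only, since we don't run circles ourselves: `best_circle` is a hypothetical helper, it assumes you already have a binary edge mask and a short list of positive candidate radii, and in practice OpenCV's `cv2.HoughCircles` would be the more usual choice.

```python
import numpy as np

def best_circle(edge_mask, radii, n_angles=180):
    """Return (cx, cy, r, score) for the strongest circle among candidate radii.

    Each edge pixel votes for every centre that would put it on a circle of
    radius r; the true centre collects votes from the whole perimeter.
    """
    ys, xs = np.nonzero(edge_mask)
    h, w = edge_mask.shape
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    cos_a, sin_a = np.cos(angles), np.sin(angles)
    best = None
    for r in radii:
        # Candidate centres for every (edge pixel, angle) pair.
        cx = np.rint(xs[:, None] - r * cos_a[None, :]).astype(int).ravel()
        cy = np.rint(ys[:, None] - r * sin_a[None, :]).astype(int).ravel()
        inside = (cx >= 0) & (cx < w) & (cy >= 0) & (cy < h)
        acc = np.zeros((h, w), dtype=int)
        # np.add.at accumulates correctly when the same cell is hit repeatedly.
        np.add.at(acc, (cy[inside], cx[inside]), 1)
        iy, ix = np.unravel_index(np.argmax(acc), acc.shape)
        # Divide by r so a longer perimeter does not win on size alone.
        score = acc[iy, ix] / float(r)
        if best is None or score > best[3]:
            best = (int(ix), int(iy), r, score)
    return best
```

This keeps only the single best candidate, so concentric circles would need the top few candidates per radius instead, which is where asking the model to sort through the suggestions comes in.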

It’s probably not the answer you were looking for, but it’s a solution.
