Can GPT accomplish this task? If not, please let me know to save my time

  1. The project involves grading exams.
  2. I have tried many times but could not complete the task in one go, so I broke it down into steps to complete it step by step.
  3. Breakdown of steps:
    3.1 Transmit the exam to GPT to extract the printed questions and complete the problem-solving. The answers are generated in a text file “2-2” saved locally.
    3.2 Transmit the exam to GPT to extract the handwritten answers. Request GPT to read the content of the “2-2” text file. Request to compare the handwritten answers with the content of the text file.
    3.3 If the answers match, mark them as correct; if they don’t, mark them as incorrect.
  4. GPT has not completed the above tasks. Each output always states: “Completely matched, the answer is correct.” There was even a time when I mistakenly specified the file path, and the text answers had nothing to do with the handwritten answers, yet GPT still marked them as “correct.”

My question: Is there an issue with my workflow or any specific step?

If GPT currently cannot perform this task, please let the developers inform me that GPT is temporarily unable to complete it. I can wait until it is capable of handling this task rather than wasting time on this project now.

You can let the terms of use inform you if AI is suitable for making educational evaluation decisions against individuals:

When you use our Services you understand and agree:

  • Output may not always be accurate. You should not rely on Output from our Services as a sole source of truth or factual information, or as a substitute for professional advice.

  • You must evaluate Output for accuracy and appropriateness for your use case, including using human review as appropriate, before using or sharing Output from the Services.

  • You must not use any Output relating to a person for any purpose that could have a legal or material impact on that person, such as making credit, educational, employment, housing, insurance, legal, medical, or other important decisions about them.

Thanks for the reply.

However, my current problem is that GPT is not performing the specified tasks correctly. It responds that the task is complete when it is not.

I didn’t expect it to do the entire job on its own. Instead, I’m asking whether GPT can complete the specific tasks as instructed.

that wyh "It is mentioned in the command that when OCR is not available, GPT-Vision is used to extract the handwritten answer part. If GPT-Vision is also unavailable, “Neither OCR nor GPT-Vision is available” is output. But the actual situation is that GPT-Vision failed to successfully extract the handwritten answer part, but it output the fabricated content.

The answer logic is as follows:

Error identification: In the case where OCR is not available, the program should call GPT-Vision, but the extraction status of GPT-Vision is not correctly determined.
Default behavior: When GPT-Vision extraction fails, the system does not correctly identify and trigger the expected error handling process.
Made-up content: Due to a failure to correctly extract handwritten content, the system may have generated default or made-up answers instead of following the logic and outputting preset error prompts.
Therefore, GPT-Vision cannot generate any content when it fails to extract handwritten content. Instead, it should be clearly marked as an extraction failure and output a corresponding error message.“

I’m just looking for the reason. Not criticizing

that is the reason why GPT give the wrong way!
@_j
Is it a bug ?

Picture 1 is the handwritten answer content. Picture 2 is the content of ‘2-2 Answers.txt’. Observation shows that GPT did not finish reading the document ‘2-2 Answer.txt’, but instead fabricated the recognized handwritten answer into, ‘2-2 Answer.txt’. Is this the case?

ChatGPT:

Yes, your observation is correct. GPT did not read the content from ‘2-2 Answers.txt’, but fabricated the recognized handwritten answers as the contents of ‘2-2 Answers.txt’. This indicates that when processing this task, GPT did not correctly complete the steps of extracting the standard answer from the document, but incorrectly used the handwritten answer as the standard answer.

The basic fault is your over-reliance on gpt-4-vision for extraction.

The first thing I would do is optimize the image, and create redundancy. That would be by manually resizing images in code to two resolutions, one that is maximum 512 pixels, and then one that is maximum 1024 pixels. That will constrain the amount of incongruity. Send the first at detail: low. Insert another message text component that says something like “if that wasn’t clear enough, here are zoomed-in views of the image” and then the detail:high image. In chat completions, you can intersperse text and images in one message.

Then have the AI only perform the task of text extraction and diagram description.

With extended effort, if it cannot accurately reproduce what’s given to it by vision, which will be unique to each case, then you’ll likely need to resort to text descriptions of the input, text descriptions of example problem solutions.

Then only use vision on processing solutions.

It is unlikely that the AI will be a time saver here in the overall scheme, when a person could use their eyes and have the experience of scoring dozens of the same question. If the task is not simply having the AI do your homework for you.

Thank you for your reply. After spending dozens of hours on this, I found that GPT tends to fabricate content. Because of this, I admit I am a bit anxious. Through meticulous troubleshooting, I discovered that when GPT-API cannot use OCR or accurately read the text, it generates fabricated output. I am currently addressing this issue. Your suggestions are very helpful to me. I am just a beginner. Thank you.