So far I've been using the OpenAI API only for basic testing, and I'm now planning to integrate it into a project of mine. Here's the situation: I have a large set of multiple-choice questions, each consisting of an image, the question text, and a set of possible answers.
My initial idea was to use the 'gpt-4-vision-preview' model to analyze the image and explain why a given answer, say 'X', is correct. But then I thought of a different approach: instead of giving the model the correct answer, I could let it identify the correct answer itself and then provide an explanation. I believe this could add credibility to its explanations, since the model would be identifying the correct answer independently.
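For context, here's roughly what I have in mind with the v1 Python SDK (the question text, options, and image URL are placeholders for my own data):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                # Placeholder question and options; the real ones come from my dataset
                {
                    "type": "text",
                    "text": (
                        "Question: <question text>\n"
                        "A) <option A>\nB) <option B>\nC) <option C>\n\n"
                        "Which option is correct? State the letter first, "
                        "then explain your reasoning based on the image."
                    ),
                },
                # The image can also be passed as a base64 data URL
                {"type": "image_url", "image_url": {"url": "https://example.com/question1.png"}},
            ],
        }
    ],
    max_tokens=500,
)

print(response.choices[0].message.content)
```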
In my initial tests yesterday, the model performed flawlessly, not missing a single question. In today's tests, however, I noticed some errors: the model was mixing up certain questions because the answer options used similar terms. In these cases the specific terminology is crucial, even though the terms might generally refer to the same thing.
To tackle this, I'm thinking of using 'gpt-4-vision-preview' to describe the images in detail and then fine-tuning a model with every guideline from a comprehensive document I have. This might lead to more accurate results; a rough sketch of the pipeline I have in mind is below.
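If I understand correctly, fine-tuning isn't available for 'gpt-4-vision-preview' itself, only for text models such as 'gpt-3.5-turbo', so the pipeline would be: describe the image with the vision model, then train a text model on description + question + guideline-grounded answer. Here's a placeholder sketch; all prompts, field contents, and file names below are assumptions on my part:

```python
import json
from openai import OpenAI

client = OpenAI()

def describe_image(image_url: str) -> str:
    """Step 1: ask the vision model for a detailed text description of the image."""
    response = client.chat.completions.create(
        model="gpt-4-vision-preview",
        messages=[{
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Describe this image in detail, including any labels, "
                            "terminology, and features relevant to answering questions about it.",
                },
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
        max_tokens=400,
    )
    return response.choices[0].message.content

# Step 2: turn each question into a chat-format fine-tuning example that pairs
# the image description with the question and the guideline-grounded answer.
# The question, options, and answer here are placeholders for my dataset.
example = {
    "messages": [
        {"role": "system", "content": "Answer using the exact terminology defined in the guidelines."},
        {"role": "user", "content": (
            "Image description: " + describe_image("https://example.com/q1.png") + "\n\n"
            "Question: <question text>\nA) <option A>\nB) <option B>"
        )},
        {"role": "assistant", "content": "B) <option B>, because <explanation from guidelines>."},
    ]
}

with open("training_data.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```

From there I'd upload the file with client.files.create(file=open("training_data.jsonl", "rb"), purpose="fine-tune") and start a job with client.fine_tuning.jobs.create(training_file=..., model="gpt-3.5-turbo"), but I haven't tried this end to end yet.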
Since I’m relatively new to the OpenAI API, I’m not entirely sure if this is the best solution. Does anyone have any suggestions or know of any articles that might help?