I am looking forward for a solution in which two images of the same equipment or location (before and after pictures) to be analyzed for whether the equipment (like a pump or generator) or location (like swimming pool) is cleaner in the after picture compared to the before picture after service is done.
Sufficient training images can be provided. Please let me know if there is a solution available.
Most use cases are not available for different type of analysis.
This can be used for cleaning type of works instead of a human checking the photos and compare.
Additional logic can be dust identification, glossiness etc
1 Like
Hi @JobTholath !
In the context of the OpenAI API ecosystem (without fine-tuning), the easiest is to pass the two images independently to Vision API, and tell it do describe each image in detail. If it’s a very specific domain you are restricting this to (e.g. certain types of equipment and environments), then that’s even better because you can guide the model in the prompt on what to focus on exactly.
Once you have the textual output of “pre” and “post” images, pass those to standard chat completions model to compare.
If you want to venture outside the OpenAI API domain, you may want to perform fine-tuning of one of the state-of-the-art vision models, like OpenAI’s CLIP model, or a YOLO model.
3 Likes
Hi there!
One way I would approach this is to create numerical or qualitative ratings for criteria that you are looking to evaluate (e.g. cleanliness or even more specific criteria). You can use the model to obtain the rating for each image and on the basis of that make a comparison. To enhance the accuracy of the ratings I would use vision fine-tuning.
Just an idea!
3 Likes
Ah yes apologies, as @jr.2509 mentioned you can also use image-based finetuning too, release a week ago! Thanks Jen!
2 Likes