Possible Misclassification of GPT 4 Vision Model

Hello all! I am a high school student working on a science fair project testing various image - classification model and the GPT Vision 4 Model preformed the best with an accuracy of 55.8%. This dataset was 1000 images (100 per invasive plant). When getting the confusion matrix for this model, I noted that 100% of the Autumn Olive images fed to the model were mis-classified as Japanese Knotweed. Also Japanese Knotweed had the largest false negatives. Any ideas as to what may be happening? If it is an artifact of the model itself what may be the reason?

This is my first attempt at writing code on GPT models. Thanks for all your help

Here’s the confusion matrix:

Let me know your thoughts!


This is an interesting use, since I can’t imagine that gpt-4-vision has been trained as a botanist.

Elaeagnus umbellata is known as Japanese silverberry, umbellata oleaster, autumn olive, autumn elaeagnus, spreading oleaster, autumnberry, or autumn berry.

There is a lot of varied nomenclature with which imagery could have been labeled within the training, even by experts…

One would think the important thing to do is the prompt the AI into being a “specialist”. Then see what it ultimately can answer about.

Baseline of an image of Autumn Olive image, from a Canada invasive species PDF.

The image features a plant with green leaves and small white flowers. The leaves appear to be oval-shaped with a slightly pointed tip, and the flowers have a simple petal structure, which suggests they could be from a shrub or small tree. The background is blurred, which helps to focus on the plant.

The image is of moderate quality. It appears to be somewhat pixelated and lacks sharpness, which could be due to a low-resolution capture or compression artifacts. The edges of the leaves and flowers are not crisp, and finer details are difficult to discern. The lighting seems to be natural and diffused, with no harsh shadows, which is good for viewing the plant’s details.

Possible "programming" messages for Plant Vision

system_msg = “”"
You are PlantVision, an AI assistant powered by GPT-4 with computer vision.
AI knowledge cutoff: April 2023

Built-in vision capabilities:

  • describe images
  • analyze image contents
  • logical problem-solving requiring machine vision

PlantVision is an AI model specilized on identifying plant species from their appearance.
The focus is on discriminating invasive plant species possibly found in North America, and providing taxonomy to user.

The user message

user_msg = “”"
Describe the contents of image.
Describe the quality of imagery.
Goal: Identify the plant shown in the image. List common names.
If uncertain, give a second choice of identification.
Give invasive weed status in Canada or the United States.

Results of an open-domain plant identifier:

The image shows a close-up of a plant with green leaves and small white flowers. The leaves appear to be simple, broadleaf, and opposite in arrangement. The flowers are small with a radial symmetry, typical of many flowering shrub species. The image quality is slightly pixelated and lacks sharpness, which may affect the accuracy of plant identification.

Based on the visible characteristics, the plant resembles a species of the genus Ligustrum, commonly known as privet. Privets are often used as ornamental plants but can become invasive. One common species is Ligustrum vulgare, known as common privet.

However, without a clearer image or additional information such as leaf texture, fruit presence, or specific flower structure, it is difficult to provide a definitive identification. If not a privet, another possibility could be a species of the genus Philadelphus, known as mock-orange, which also has white flowers and opposite leaves.

In terms of invasive status, some Ligustrum species are considered invasive in parts of the United States and Canada due to their ability to spread aggressively and outcompete native vegetation. If this is indeed a privet, it would be important to manage its growth in non-native regions to prevent ecological imbalance.

So from a picture provided to people to identify the plant, the AI cannot identify. It was not cross-identified to one from your set. I would not push the AI to choose from a closed set.

Another technique is that the vision model can accept several images at once for a single answer. This could improve the certainty, or show you where the AI simply can’t succeed.