4o & turbo models can't read images anymore

It sounds like this is the kind of session you want: the model understands an image passed in with any message, and it also remembers the images from earlier messages:

As a computer vision assistant, I have the capability to analyze and interpret attached images. By leveraging advanced algorithms and machine learning techniques, I can identify objects, recognize patterns, and extract valuable information from visual data. Whether it’s detecting faces, reading text, classifying scenes, or understanding complex visual content, my ability to process and understand images enables me to assist with a wide range of tasks and applications.
Prompt: what's dangerous in this picture?
Image URL? (enter=none):https://i.imgur.com/B2HDr41.jpeg
In the picture, the child is using a metal fork to retrieve toast from a toaster. This is dangerous for several reasons:

  1. Electrical Shock: Inserting a metal object into a toaster can cause an electrical shock, as metal conducts electricity and can come into contact with the electrical components inside the toaster.

  2. Burns: The toaster and the toast itself can be very hot, potentially causing burns to the child’s hands or face.

  3. Fire Hazard: The toast appears to be burnt, which could indicate that the toaster is overheating or malfunctioning, posing a fire risk.

It’s important to ensure that children are supervised in the kitchen and taught safe practices when using electrical appliances.
Prompt: what color is the toaster?
Image URL? (enter=none):
The toaster in the picture is red.
Prompt: is this picture more dangerous than the first?
Image URL? (enter=none):https://i.imgur.com/gLV8l2c.jpeg
Yes, this picture depicts a more dangerous situation than the first one. In this image, there are two children who appear to be drinking from bottles that are typically used for alcoholic beverages, and an adult is asleep at the table with several bottles of alcohol around. The dangers in this scenario include:

  1. Alcohol Consumption: Children consuming alcohol can lead to severe health issues, including alcohol poisoning, which can be life-threatening.
  2. Lack of Supervision: The adult is asleep and unable to supervise the children, which increases the risk of accidents or further dangerous behavior.
  3. Ingestion of Harmful Substances: Even if the bottles do not contain alcohol, children could be ingesting other harmful substances.

This situation poses significant immediate health risks to the children and indicates a lack of proper supervision and safety measures.
Prompt:

I wrote up a minimal chatbot to do this just for funsies (and capped the chat memory at two user/assistant pairs to limit cost). In the example, you can see how a single image is optionally added to a message, and how the entire user message, including the image URL, is preserved in the chat history. (“Upload” is not a thing in Chat Completions, but you can send local images as base64 data URLs instead.)

import openai as o

client = o.OpenAI(timeout=60)  # reads OPENAI_API_KEY from the environment
chat = []  # running history of user/assistant messages
system = [{"role": "system", "content": "You are a computer vision assistant."}]
user = [{"role": "user", "content": "Introduce your ability to see attached images."}]
while True:
    response = client.chat.completions.create(
        messages=system + chat[-4:] + user,  # only the last two user/assistant pairs
        model="gpt-4o", top_p=0.5, stream=True, max_tokens=200)
    reply = ""
    for chunk in response:
        # print each streamed text fragment as it arrives
        if chunk.choices and not chunk.choices[0].finish_reason:
            word = chunk.choices[0].delta.content or ""
            reply += word
            print(word, end="")
    # store the full user message (text plus any image URL) with the reply
    chat += user + [{"role": "assistant", "content": reply}]
    prompt = input("\nPrompt: ")
    if prompt == "exit":  # the content is a list from here on, so test the raw text
        break
    user = [{"role": "user", "content": [{"type": "text", "text": prompt}]}]
    new_url = input("Image URL? (enter=none):")
    if new_url:
        # add a scheme to bare hostnames, but leave base64 data URLs untouched
        if not new_url.startswith(("http", "data:")):
            new_url = "http://" + new_url
        user[0]["content"].append({"type": "image_url",
                                   "image_url": {"url": new_url, "detail": "low"}})
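
For the base64 route mentioned above, here is a rough sketch of turning a local file into a data URL that can go in the same "image_url" part. The helper names image_to_data_url and image_part are made up for illustration and are not part of the openai library; the sketch also assumes the file extension correctly identifies the image type.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path):
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    # Guess the MIME type from the file extension, e.g. .jpg -> image/jpeg
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image type: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_part(path, detail="low"):
    """Build an image_url content part shaped like the one the chatbot appends."""
    return {"type": "image_url",
            "image_url": {"url": image_to_data_url(path), "detail": detail}}
```

You would then append image_part("photo.jpg") to the user message content in place of the URL-based part. A data URL has to reach the API unchanged, so any logic that prepends "http://" to the typed input needs to leave strings starting with "data:" alone.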