Image Input with Create Chat Completion

I’m working on a project to analyze trading cards, so that I can just take a picture and have it populate an inventory database with the cards and all the information about them. I’m running into an issue right now.

I have the following code:

const response = await generateChatResponse([
  {
    role: "user",
    content: [
      {
        type: "text",
        text: "What's in this image?",
      },
      {
        type: "image_url",
        image_url: {
          url: imageUrl,
        },
      },
    ],
  },
]);

Here imageUrl is currently hardcoded to an image of the front of the card, hosted on UploadThing.

The response I get is the following.

Response from OpenAI: I’m sorry, but I can’t view images directly. If you can describe the image to me, I can help you analyze or understand it better.

The weird thing is, when I set imageUrl to the one used in the image input example in the documentation, I get the following output:

“Response from OpenAI: The image depicts a scenic view of the Nature Boardwalk at the University of Wisconsin–Madison’s Lakeshore Nature Preserve. It includes a raised wooden pathway that meanders through a natural setting, likely designed to facilitate the observation and enjoyment of the surrounding flora and fauna. The boardwalk is surrounded by lush greenery, with trees and shrubs on either side. The sky above is clear, indicating a bright and possibly sunny day. This environment provides a serene and inviting space for walking, relaxing, and connecting with nature.”

Can anyone point me in the right direction? I saw a video tutorial where the user passed their images in as base64, and I’m wondering if that would make a difference.
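
For reference, here is a minimal sketch of what the base64 approach might look like. It assumes the image bytes are already available locally and that the image_url field accepts a data URL in place of a hosted URL. toDataUrl and buildImageMessage are hypothetical helper names I made up for illustration, and the file path in the usage comment is a placeholder.

```javascript
// Hypothetical helper: encode raw image bytes as a data URL.
// The MIME type must match the actual image format.
function toDataUrl(imageBytes, mimeType = "image/jpeg") {
  const base64 = Buffer.from(imageBytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Hypothetical helper: build the same message shape as in my code above,
// with the image URL passed in instead of hardcoded.
function buildImageMessage(prompt, imageUrl) {
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      { type: "image_url", image_url: { url: imageUrl } },
    ],
  };
}

// Usage (placeholder path; generateChatResponse is my own wrapper):
// const fs = require("node:fs");
// const dataUrl = toDataUrl(fs.readFileSync("./card-front.jpg"));
// const response = await generateChatResponse([
//   buildImageMessage("What's in this image?", dataUrl),
// ]);
```

This would take the image host out of the picture entirely, which should at least show whether the problem is with how the hosted URL is being fetched.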

Thanks!

It’s odd, but it seems like newer images don’t work with it? I keep changing the URL to different images from Google Images, and those seem to work and produce output. I will say that the output is really bad compared to the Playground. When I give it the URL for a Blue-Eyes White Dragon card, it throws out random things like Dark Magician, almost like it’s guessing. Compare that to the Playground, which gets it right every single time and has no issue reading other text on the card, such as the condition and the set it’s from.
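
One thing I might try for the quality gap is the optional detail setting on image_url ("low", "high", or "auto"), together with a more specific prompt. This is only a sketch: I have not confirmed that it explains the difference from the Playground, and buildCardMessage and the prompt wording are hypothetical.

```javascript
// Hypothetical helper: build a user message that asks for specific card
// fields and requests a given image detail level.
function buildCardMessage(imageUrl, detail = "high") {
  const allowed = ["low", "high", "auto"];
  if (!allowed.includes(detail)) {
    throw new Error(`detail must be one of: ${allowed.join(", ")}`);
  }
  return {
    role: "user",
    content: [
      {
        type: "text",
        text: "Identify this trading card. Give the card name, the set, and any condition text printed on it.",
      },
      { type: "image_url", image_url: { url: imageUrl, detail } },
    ],
  };
}

// Usage (generateChatResponse is my own wrapper):
// const response = await generateChatResponse([buildCardMessage(imageUrl)]);
```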