GPT-4 API and image input

Hi there,

Is there a documented way to supply GPT-4 API with images?

I couldn’t find anything in OpenAI’s website.

2 Likes

Looks like receiving image inputs will come out at a later time. This is what it said on OpenAI’s document page:
" GPT-4 is a large multimodal model (accepting text inputs and emitting text outputs today, with image inputs coming in the future) that can solve difficult problems with greater accuracy than any of our previous models, thanks to its broader general knowledge and advanced reasoning capabilities. Like gpt-3.5-turbo , GPT-4 is optimized for chat but works well for traditional completions tasks."

1 Like

The GPT-4 is on the GPT 3.5 platform, where users who use the Plus version got access to the GPT-4 almost immediately after the launch. But what about those who have API access? I just got mine, and I’m thinking of creating an API for Discord to try the “same” test that was done in the presence of GPT-4 by OpenAI. Does anyone know anything about it?

1 Like

I swear that I used “Describe this image to me” and then pasted the URL of an image, and GPT-4 described the image perfectly to me earlier this morning. Way better than I thought it would have. Then I tried again for a long time and couldn’t get it to work again. Describing images back to me is the main thing I want to do.

1 Like

It was inferring the image contents from the URL.

1 Like

For now you can look at Visual GPT:

It hooks into a 3rd party image interpreter. It can work for some things, but I am assuming GPT-4’s image recognition will be far more in depth.

2 Likes

That is partly true. On that initial test, I uploaded a random image that I found on Google, and it happened to be a tree in front of a sunset. When I asked GPT to describe the image, it described a sunset and a tree, and I assumed it actually worked.

Then when I tried again later, I got a mixture of three responses. One - It would tell me that it can’t look at images. Two - It would guess what the image was based on the URL. Three - it described a sunset and a tree.

For whatever reason, no matter what image I linked to, it would describe it as a sunset and a tree. It was a random coincidence that I uploaded a picture of a sunset and a tree, which appeared to work.

1 Like