GPT-4 API and image input

have you figured it out yet? because i also need to input images using the API and i cannot find anything on the web

There’s no “figuring out” something that hasn’t been released and doesn’t have a solid release date either.

Me saying “I’m excited about my ‘come wash my car’ AI plugin” doesn’t mean I’m merely hiding the documentation on how to use it.

okay thank you, I thought that it was released. no need to be angry about it lol (with peace and love)

It’s natural to be excited, and hearing an ambiguous announcement could lead one to think it’s available. OpenAI has had the GPT-4 machine-vision-trained model (YouTube) internally for over a year, since before anybody had heard of ChatGPT.

We’ve seen many announced OpenAI products slowly trickle out to users in alpha, beta, insider, limited-release, and tier-1 partner forms, letting the ingenuity of a wider audience surface image-identification and performance concerns. Everyone saying “look at this” to the AI on day one, and the service crashing, would not be an ideal outcome if there were release problems.

Knowing that it is coming, at least, you can consider ways that computer vision and the ability to receive an image description could work in your own application, beyond mere novelty.

Does anyone have a way of inputting an image with the API?

No image input with the API yet. Whoever gets that superpower first will surely be bragging here :joy:

That superpower can be granted for your apps also… for $13 billion. API developers have to distinguish themselves from the market leader and its free-and-available pricing model, set by the very company they purchase their own services from.

The image you’ve shared is quite intriguing! Here’s a description:

This is a photo-realistic image set in a futuristic factory. The scene is dominated by a row of robotic arms affixed to the ceiling, each holding a human-like head. These heads are intriguingly covered in a rainbow-colored metallic foil and have various wires and cables attached to them. Workers, clad in green lab coats, are diligently working on these heads. The background reveals a well-equipped laboratory, adding to the overall futuristic theme.

As for a title, considering the elements in the image, how about “Synthetic Spectrum: The Birth of Tomorrow”? This title captures both the colorful aspect (Spectrum) and the futuristic theme (The Birth of Tomorrow) of your artwork. Let me know what you think! :blush:

I think the image represents all those dudes that chose to cryogenically freeze their head, in hopes of being reborn sometime in the future. Except in the future, the humans are 1/6th the size of today’s humans, because there was a big push to shrink humans to make things cheaper, including the mystery gnome army employed by cybernetic restoration corporations :man_shrugging:

But I’m not sure where you get the $13 billion dollars comment above, I’m lost.

BTW, not all API developers are trying to make a product to sell to the public. Most of us are making internal products that make our business more efficient.


Did you input the image through API or the web interface?

At the moment, it is only allowed via the web interface.


The image description was generated by Bing Chat, which is available now; the only thing required is to spoof the Edge browser’s user-agent string, and you get GPT-4 computer vision free from billion-dollar investor Microsoft. You can quickly see that having AI language generated about an image you can just look at yourself has limited long-term appeal or application. It’s not the machine vision that screens a thousand apples a minute.
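For illustration, spoofing a user-agent with Python’s requests library is just a matter of setting the header. This is only a sketch: the Edge UA string below is a stand-in copied from one Edge build, not guaranteed current, and the helper name is my own.

```python
import requests

# Illustrative Edge user-agent string; check a current Edge build for the real one.
EDGE_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
           "AppleWebKit/537.36 (KHTML, like Gecko) "
           "Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.62")

def edge_headers(extra=None):
    """Return request headers that identify the client as Microsoft Edge."""
    headers = {"User-Agent": EDGE_UA}
    if extra:
        headers.update(extra)
    return headers

# Usage: requests.get("https://www.bing.com", headers=edge_headers())
```

Server-side checks vary, so a spoofed UA alone may not be enough for every gated feature.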

Does anyone know what the $13 billion cash infusion from Microsoft has brought to OpenAI?

More servers, more people, bigger and prettier headquarters?

More money to pay their incredible Azure bill… to Microsoft?

hey, psst :wink: Don’t make it too obvious haha.

Hey, we urgently need the ‘Image to Text’ feature. Does anyone know when it will be available via the ChatGPT API, or the Azure ChatGPT API?

Probably. Are they telling? No.

I asked GPT-4 to guess what the API call would look like; it said:

As of my last training data in January 2022, OpenAI hadn’t publicly released a specific “ChatGPT-4” API. My training data includes information up to ChatGPT-3, and there was an API for that.

However, let’s assume the API would be somewhat similar to ChatGPT-3’s. In the case of ChatGPT-3, the API call would be something like:


import requests

def call_chatgpt4_to_transcribe(image_path):
    url = ""  # Assuming hypothetical endpoint

    headers = {
        'Authorization': 'Bearer YOUR_OPENAI_API_KEY',
        'Content-Type': 'application/json',
    }

    data = {
        "prompt": "Transcribe the following image:",  # This is just an example; the actual API might require direct image input
        "image": image_path,  # Hypothetical, since ChatGPT-3 doesn't support image input directly
    }

    response = requests.post(url, headers=headers, json=data)
    result = response.json()

    # Assuming the transcribed text is available under the 'choices' key
    return result['choices'][0]['text']
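One practical gap in that guess: a JSON payload can’t carry raw image bytes, so a real image-accepting endpoint would more plausibly expect the file base64-encoded rather than a local file path. A minimal sketch of that encoding step (the helper name is my own, and any payload field names remain assumptions):

```python
import base64

def encode_image(image_path):
    """Read an image file and return its contents as a base64 string,
    suitable for embedding in a JSON request body."""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")
```

The resulting string would then replace the bare `image_path` in the hypothetical request above.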