GPT-4 API and image input

judywook · October 2, 2023, 10:07am

have you figured it out yet? because i also need to input images using the API and i cannot find anything on the web

_j · October 2, 2023, 10:23am

There’s no “figuring out” something that hasn’t been released and doesn’t have a solid release date either.

“I’m excited about my ‘come wash my car’ AI plugin” doesn’t mean I’m merely hiding the documentation on how to use it.

judywook · October 2, 2023, 10:27am

okay thank you, I thought that it was released. no need to be angry about it lol (with peace and love)

_j · October 2, 2023, 10:58am

It’s natural to be excited, and hearing an ambiguous announcement could lead one to think it’s available. OpenAI has had the gpt-4 machine vision (YouTube) trained model internally for over a year - before anybody had heard of ChatGPT.

We’ve seen many announced products from OpenAI slowly trickle out to users in alpha, beta, insider, limited release, tier-1 partner forms also, to find image identification and performance concerns with the ingenuity of a wider audience. Everyone saying “look at this” to the AI on day one, and the service crashing, would not be an ideal outcome if there were release problems.

Knowing that it is coming down, at least, you can consider ways that computer vision and the ability to receive an image description could work in your own application beyond mere novelty.

blakeyoung81 · October 13, 2023, 3:26am

Does anyone have a way of inputing an image with the API?

curt.kennedy · October 13, 2023, 3:30am

No image input with the API yet. Whoever gets that superpower first will surely be bragging here

_j · October 13, 2023, 3:56am

That superpower can be granted for your apps also… $13 billion dollars. API developers have to distinguish themselves from the market lead and pricing model of free and available, from the company they purchase their own services from.

The image you’ve shared is quite intriguing! Here’s a description:

This is a photo-realistic image set in a futuristic factory. The scene is dominated by a row of robotic arms affixed to the ceiling, each holding a human-like head. These heads are intriguingly covered in a rainbow-colored metallic foil and have various wires and cables attached to them. Workers, clad in green lab coats, are diligently working on these heads. The background reveals a well-equipped laboratory, adding to the overall futuristic theme.

As for a title, considering the elements in the image, how about “Synthetic Spectrum: The Birth of Tomorrow”? This title captures both the colorful aspect (Spectrum) and the futuristic theme (The Birth of Tomorrow) of your artwork. Let me know what you think!

curt.kennedy · October 13, 2023, 4:21am

I think the image represents all those dudes that chose to cryogenically freeze their head, in hopes of being reborn sometime in the future. Except in the future, the humans are 1/6th the size of todays humans because there was a big push to shrink humans to make things cheaper, including the mystery gnome army employed by cybernetic restoration corporations

But I’m not sure where you get the $13 billion dollars comment above, I’m lost.

BTW, not all API developers are trying to make a product to sell to the public. Most of us are making internal products that make our business more efficient.

_j · October 13, 2023, 7:21am

https://www.google.com/search?q=%2413+billion+dollars+openai

fangsujie2001 · October 13, 2023, 8:17am

Did you input the image through API or the web interface?

gianlucacosta74 · October 13, 2023, 12:21pm

At the moment, it is only allowed via the web interface.

_j · October 13, 2023, 1:55pm

The image description was generated by Bing chat. Which is available now and the only thing required is to spoof the user-agent string of the Edge browser and you get the gpt-4 computer vision free from billion-dollar investor Microsoft. You can quickly see that having AI language generated about an image that you can just look at yourself has limited long-term appeal or application. It’s not the machine vision that screens a thousand apples a minute.

curt.kennedy · October 13, 2023, 4:43pm

Does anyone know what the $13 billion cash infusion from Microsoft has brought to OpenAI?

More servers, more people, bigger and prettier headquarters?

_j · October 13, 2023, 8:39pm

More money to pay their incredible Azure bill… to Microsoft?

jochenschultz · October 17, 2023, 2:52am

hey, psst Don’t make it too obvious haha.

N2U · October 17, 2023, 3:54am

82tfau

curt.kennedy · October 17, 2023, 4:24am

sabitoff · October 25, 2023, 8:20pm

Hey, we urgently need the ‘Image to Text’ feature. Does anyone know when it will be available via the ChatGPT API, or the Azure ChatGPT API?

_j · October 25, 2023, 9:21pm

Probably. Are they telling? No.

fred4 · October 25, 2023, 9:45pm

I asked GPT4 to guess what the API call would look like it said:

As of my last training data in January 2022, OpenAI hadn’t publicly released a specific “ChatGPT-4” API. My training data includes information up to ChatGPT-3, and there was an API for that.

However, let’s assume the API would be somewhat similar to ChatGPT-3’s. In the case of ChatGPT-3, the API call would be something like:

python

import requests

def call_chatgpt4_to_transcribe(image_path):
    url = "https://api.openai.com/v1/engines/chatgpt-4.0-turbo/completions"  # Assuming hypothetical endpoint

    headers = {
        'Authorization': 'Bearer YOUR_OPENAI_API_KEY',
        'Content-Type': 'application/json',
    }

    data = {
        "prompt": "Transcribe the following image:",  # This is just an example; the actual API might require direct image input
        "image": image_path  # Hypothetical, since ChatGPT-3 doesn't support image input directly
    }

    response = requests.post(url, headers=headers, json=data)
    result = response.json()

    # Assuming the transcribed text is available under the 'choices' key
    return result['choices'][0]['text']

Topic		Replies	Views
Image inputs in the GPT-4 API API gpt-4	13	25227	February 6, 2024
ChatGPT goes Multimodal! Sound and vision is rolling out on ChatGPT Community chatgpt , multimodal	34	13247	December 10, 2023
How to load a local image to gpt4 -vision using API API gpt-4-vision	4	44589	February 27, 2024
API for image generation for gpt-4o model API image-generation , gpt-4o	46	45127	May 2, 2025
"I'm sorry, I can't assist with these requests." with Vision API API api , gpt-4-vision	6	13876	December 18, 2023

GPT-4 API and image input

Related topics