Using an image as input gpt4 api

rl16 · June 3, 2024, 6:11am

Hi,

I am creating plots in Python and saving them as PNG files.
I then want to send the PNG files to the GPT-4o API so that the model can analyse each image and return text.

How do I go about using images as the input?
Thanks

anon22939549 · June 3, 2024, 7:19am

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    # image_url takes an object with a "url" key
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0])

Source: OpenAI API Reference - Create chat completion

or

import base64
import requests

# OpenAI API Key
api_key = "YOUR_OPENAI_API_KEY"

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "path_to_your_image.jpg"

# Getting the base64 string
base64_image = encode_image(image_path)

headers = {
  "Content-Type": "application/json",
  "Authorization": f"Bearer {api_key}"
}

payload = {
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What’s in this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": f"data:image/jpeg;base64,{base64_image}"
          }
        }
      ]
    }
  ],
  "max_tokens": 300
}

response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)

print(response.json())

Source: OpenAI Vision Guide
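
Since the question is about PNG plots while the example above hardcodes image/jpeg, here is a minimal sketch of building the data URL with a MIME type guessed from the file name. The helper name image_to_data_url is hypothetical and not part of the OpenAI SDK; it uses only the Python standard library.

```python
import base64
import mimetypes


def image_to_data_url(image_path):
    """Return a data URL for a local image file (hypothetical helper)."""
    # Guess the MIME type from the extension, e.g. "plot.png" -> "image/png".
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognised image file: {image_path}")
    # Read the raw bytes and base64-encode them as ASCII text.
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string would go in the "url" field of the image_url part, for example image_to_data_url("plot.png").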

rl16 · June 3, 2024, 7:33am

Thanks. I encoded the image, but apparently it comes to more than 200k tokens.
Does this sound right?

this is the image

InvalidRequestError: This model's maximum context length is 8192 tokens. However, your messages resulted in 243631 tokens. Please reduce the length of the messages.

anon22939549 · June 3, 2024, 7:40am

No, that is not right. You sent the base64-encoded image as text, so the whole encoded string was counted as prompt tokens.

You probably did something like this:

        {
          "type": "text",
          "text": f"data:image/jpeg;base64,{base64_image}"
        }

But you need to send it as an image_url part:

        {
          "type": "image_url",
          "image_url": {
            "url": f"data:image/jpeg;base64,{base64_image}"
          }
        }
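
One way to catch this mistake before sending a request is to scan the messages for base64 image data sitting in a text part. The sketch below uses a hypothetical helper, find_images_sent_as_text, that works on plain message dicts shaped like the payload above; it is only a heuristic and not part of the OpenAI SDK. Base64 produces 4 characters for every 3 bytes, so a 150 KB file turns into about 200,000 characters if it ends up in a text part.

```python
def find_images_sent_as_text(messages):
    """Return (message_index, part_index) pairs where a text part holds a base64 image data URL."""
    problems = []
    for m_idx, message in enumerate(messages):
        content = message.get("content")
        # Content may be a plain string or a list of typed parts.
        if isinstance(content, str):
            parts = [{"type": "text", "text": content}]
        else:
            parts = content or []
        for p_idx, part in enumerate(parts):
            # Only text parts are inspected; image_url parts are where image data belongs.
            text = part.get("text", "") if part.get("type") == "text" else ""
            if "data:image/" in text and ";base64," in text:
                problems.append((m_idx, p_idx))
    return problems


bad = [{"role": "user", "content": [
    {"type": "text", "text": "data:image/png;base64,iVBORw0KGgo="},
]}]
good = [{"role": "user", "content": [
    {"type": "text", "text": "What's in this image?"},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo="}},
]}]

print(find_images_sent_as_text(bad))   # [(0, 0)]
print(find_images_sent_as_text(good))  # []
```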
