Can GPT-vision models be accessed using the API?

Although you mentioned that base64 encoding is not necessary for cost reasons, there is no difference in the cost of API calls whether you use base64 encoding or pass a URL.

You can pass an image to the model via the API using a URL, but in that case you need to host the image at a publicly accessible URL.
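
If you go the URL route, the request body could look roughly like the sketch below. The helper name build_url_message and the example.com address are made up for illustration; "detail" is an optional field on image_url, and "auto" is used here as an assumed default.

```python
def build_url_message(image_url, prompt, detail="auto"):
    """Return one user message that points the model at a hosted image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            # The URL must be reachable from the public internet.
            {"type": "image_url", "image_url": {"url": image_url, "detail": detail}},
        ],
    }

# Hypothetical image location, used only for illustration.
message = build_url_message(
    "https://example.com/images/cat.jpg",
    "What is in this image?",
)
payload = {"model": "gpt-4o", "messages": [message], "max_tokens": 300}
```

The resulting payload can be sent to the Chat Completions endpoint in the same way as any other request.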

The costs associated with using the vision feature include:

  • Whether the image is processed in high-resolution mode,
  • In high-resolution mode, the dimensions of the image,
  • The total tokens for the system message, user message, and the model’s response (assistant’s output) describing the image.
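
To get a feel for the image part of the cost, here is a rough estimator. It assumes the tile-based rule OpenAI has documented for GPT-4o-class models (a flat 85 tokens in low-detail mode; in high-detail mode the image is fitted within 2048 x 2048, the shortest side is reduced to 768, and each 512-pixel tile costs 170 tokens on top of a base of 85). The function name is hypothetical and the constants may differ for other models, so treat the result as an estimate only.

```python
import math

def estimate_image_tokens(width, height, detail="high"):
    """Estimate image input tokens under the assumed tile-based rule."""
    if detail == "low":
        # Low-detail mode is a flat cost regardless of image size.
        return 85
    # Step 1: shrink (never enlarge) to fit within a 2048 x 2048 square.
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    # Step 2: shrink (never enlarge) so the shortest side is at most 768.
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    # Step 3: count the 512-pixel tiles needed to cover the image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles

square = estimate_image_tokens(1024, 1024)             # 4 tiles
tall = estimate_image_tokens(2048, 4096)               # 6 tiles
thumbnail = estimate_image_tokens(4096, 8192, "low")   # flat cost
```

The text tokens of the system message, user message, and response are billed on top of this figure.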

If hosting the image on a server just so the model can reference it, and exposing it on the public internet, is not a problem in terms of effort or security risk, then passing a URL is fine. If it is a problem, consider using base64 encoding instead:

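import base64
import requests

# OpenAI API Key
api_key = "YOUR_OPENAI_API_KEY"

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    # Read the raw bytes, base64-encode them, and decode to a str for the JSON payload
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "path_to_your_image.jpg"

# Getting the base64 string
base64_image = encode_image(image_path)

headers = {
  "Content-Type": "application/json",
  "Authorization": f"Bearer {api_key}"
}

payload = {
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What’s in this image?"
        },
        {
          # The image is embedded directly in the request as a data URL
          "type": "image_url",
          "image_url": {
            "url": f"data:image/jpeg;base64,{base64_image}"
          }
        }
      ]
    }
  ],
  "max_tokens": 300
}

# Chat Completions endpoint (assumed here; adjust if your base URL differs)
response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=payload)

print(response.json())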


This is how you can pass an image to the model using base64 encoding. It is the same method used when attaching an image in the Playground.
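
The example above hardcodes image/jpeg in the data URL. If your images are not all JPEGs, one option is a small helper along these lines, which guesses the MIME type from the file name. The name to_data_url is hypothetical, and guessing from the extension is an assumption that only holds when file names are trustworthy.

```python
import base64
import mimetypes

def to_data_url(image_bytes, filename):
    """Build a base64 data URL, guessing the MIME type from the file name."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image type for {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"

# Tiny stand-in payload; a real call would pass the bytes read from the image file.
data_url = to_data_url(b"abc", "photo.png")
```

The returned string can be used as the "url" value of the image_url entry in place of the hardcoded f-string.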