Having OpenAI fetch images from a URL itself is inherently problematic: their crawler's IP may be blocked, and it respects (and is overly cautious about) robots.txt. The URL may also serve an HTML landing page or wrapper instead of the image, or may require a login. Finally, the file itself may simply not be tolerated, such as WebP data served behind a .png extension.
Here’s a script that submits your image file directly and asks the AI to report any problems; I enhanced the system prompt for problem-solving. Once that works, do your own image-grabbing or file serving.
'''OpenAI gpt-4-vision example script from image file
uses pillow to resize and make png: pip install pillow'''
import base64
from io import BytesIO

from openai import OpenAI
from PIL import Image


def encode_image(image_path, max_image=512):
    """Open an image, downscale it so its longest side is at most
    max_image pixels, and return it as a base64-encoded PNG string."""
    with Image.open(image_path) as img:
        width, height = img.size
        max_dim = max(width, height)
        if max_dim > max_image:
            scale_factor = max_image / max_dim
            new_width = int(width * scale_factor)
            new_height = int(height * scale_factor)
            img = img.resize((new_width, new_height))
        buffered = BytesIO()
        img.save(buffered, format="PNG")
        return base64.b64encode(buffered.getvalue()).decode("utf-8")


client = OpenAI()
image_file = "myImage.jpg"
max_size = 512  # maximum dimension to allow (512 = 1 tile, 2048 = max)
encoded_string = encode_image(image_file, max_size)

system_prompt = (
    "You are an expert at analyzing images with computer vision. In case of error, "
    "make a full report of the cause of: any issues in receiving, understanding, "
    "or describing images"
)
user = "Describe the contents and layout of my image."

apiresponse = client.chat.completions.with_raw_response.create(
    model="gpt-4-vision-preview",
    messages=[
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user},
                {
                    "type": "image_url",
                    # the image was re-encoded as PNG above, so the
                    # data URL must declare image/png, not image/jpeg
                    "image_url": {"url": f"data:image/png;base64,{encoded_string}"},
                },
            ],
        },
    ],
    max_tokens=500,
)
debug_sent = apiresponse.http_request.content  # raw request body, for inspection
chat_completion = apiresponse.parse()
print(chat_completion.choices[0].message.content)
print(chat_completion.usage.model_dump())
print(
    "remaining-requests: "
    f"{apiresponse.headers.get('x-ratelimit-remaining-requests')}"
)
The script also includes a resizer to cut down on your costs. It’s set to 512 for one tile; 1024 is the next logical step up.
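To see why the resizer saves money: with high-detail vision input the image is downscaled and then tiled into 512-pixel squares, and you pay per tile. A sketch of the estimate, using the base/per-tile token figures and scaling rules from OpenAI's vision docs at the time of writing (treat the numbers as assumptions and check current pricing):

```python
import math


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Rough token estimate for one image sent to gpt-4-vision.
    Assumed figures from OpenAI's docs: low detail is a flat 85 tokens;
    high detail scales the image to fit 2048x2048, then caps the short
    side at 768, then charges 85 + 170 per 512px tile."""
    if detail == "low":
        return 85
    # scale down to fit within a 2048 x 2048 square
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    # then scale so the shortest side is at most 768
    scale = min(1.0, 768 / min(width, height))
    width, height = width * scale, height * scale
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles
```

By this estimate a 512-max image is a single tile, while leaving a 1024x1024 image unresized costs four tiles, so downscaling before sending directly cuts the bill.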