We can certainly do that. The AI might decide to use that information, or might be deceived by the suggestive image name despite its vision capabilities.
Let’s “wrap” this up with a short Python script that calls the API.
First, let’s load the image and create the system and user messages containing it.
import os, base64, requests

# Read the image from disk and base64-encode it for the JSON request body
image_path = "./Martinitoren-Groningen-Martini-Hotel.jpg"
with open(image_path, "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode('utf-8')
system = [{"role": "system", "content":"""
You are ChatPal, an AI assistant powered by GPT-4 with computer vision.
AI knowledge cutoff: April 2023
Built-in vision capabilities:
- extract text from image
- describe images
- analyze image contents
""".strip()
}]
user = [{"role": "user", "content": [f"""
Create a fulfilling description of where and what the image depicts.
(user uploaded {image_path})
""".strip(),
    {
        "image": base64_image,
        "resize": 768,
    }
  ]
}]
You can see I’ve told the AI about its vision capabilities so that it doesn’t deny having them. The user message contains an inserted line with the file name from earlier (in practice, either omit that line or insert exactly the name the user supplied), followed by the encoded image.
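The "image" and "resize" keys above are the shorthand used in this post. The documented Chat Completions format expresses the same message as typed content parts, which can be sketched as below. This is only an illustration: it assumes JPEG input, and `build_user_message` with its parameters is a hypothetical helper name, not something from the script above.

```python
import base64

def build_user_message(image_bytes, prompt, filename=None, detail="auto"):
    """Build a user message from typed content parts (text + image_url)."""
    text = prompt.strip()
    if filename:
        # Optional hint line; leave it out to make the model rely on vision alone.
        text += f"\n(user uploaded {filename})"
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {
                "type": "image_url",
                # Assumption: the bytes are a JPEG, hence the image/jpeg data URL.
                "image_url": {"url": f"data:image/jpeg;base64,{b64}", "detail": detail},
            },
        ],
    }

# Placeholder bytes stand in for a real image file here.
msg = build_user_message(b"\xff\xd8\xff", "Describe the image.", filename="tower.jpg")
print(msg["content"][0]["text"])
```

Passing `filename=None` gives the "no file name" variant of the request without touching the rest of the message.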
Let’s assemble the rest of the API request using the requests library, and send it off for a response.
params = {
    "model": "gpt-4-vision-preview",
    "max_tokens": 500, "top_p": 0.5,
    "messages": system + user
}
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY')}"
}
response = requests.post("https://api.openai.com/v1/chat/completions",
                         headers=headers, json=params)
if response.status_code != 200:
    print(f"HTTP error {response.status_code}: {response.text}")
else:
    # print(response.json())
    print(response.json()['choices'][0]['message']['content'])
    print(response.json()['usage'])
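The direct indexing into `choices` assumes a well-formed reply. A slightly more defensive sketch keeps the parsing separate from the HTTP call, so it can be tried against a plain dictionary. `extract_reply` is a hypothetical helper, and the sample dictionary only mimics the fields the script reads.

```python
def extract_reply(body):
    """Return (text, usage) from a chat completions response body."""
    choices = body.get("choices") or []
    if not choices:
        # An error payload or an empty result ends up here instead of a KeyError.
        raise ValueError(f"no choices in response: {body}")
    message = choices[0].get("message", {})
    return message.get("content", ""), body.get("usage", {})

# Hand-written sample body, not a captured API response.
sample = {
    "choices": [{"message": {"role": "assistant", "content": "A bell tower."}}],
    "usage": {"prompt_tokens": 169, "completion_tokens": 240, "total_tokens": 409},
}
text, usage = extract_reply(sample)
print(text)
print(usage["total_tokens"])
```

In the script this would be called as `extract_reply(response.json())` after the status check.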
So how did the AI do?
The image presents a vibrant urban scene, likely taken on a clear day given the bright blue sky adorned with streaks of cirrus clouds. Dominating the view is the Martinitoren, an iconic bell tower located in Groningen, Netherlands. This historic tower, known for its impressive height and architectural detail, is a significant landmark in the city.
To the right of the Martinitoren, there is a sizable, open square that appears to be bustling with activity. People can be seen walking across the square, enjoying the pleasant weather, and some are seated at outdoor tables, possibly at a café or restaurant, suggesting a lively social atmosphere.
Adjacent to the square are several buildings with distinct European architectural styles, featuring gabled roofs and a variety of façades. Prominent among them is the Martini Hotel, which is directly labeled in the image, providing accommodations likely capitalizing on the views of the square and the tower.
The overall composition of the image, with the Martinitoren rising above the square and the surrounding buildings, creates a sense of historical continuity amidst modern daily life. The scene is a testament to the blend of tradition and contemporary urban living that characterizes many European city centers.
{'prompt_tokens': 169, 'completion_tokens': 240, 'total_tokens': 409}
Identical request without the file name line: the tower goes unidentified
The image presents a lively urban square under a clear blue sky, crisscrossed by the contrails of airplanes, which add a dynamic texture to the heavens. Dominating the scene is a grand, historic bell tower, its architecture suggesting a European heritage, possibly Gothic or Renaissance in style, with its intricate detailing and robust stature. The tower is part of a larger building, likely a church, given its traditional design and the presence of what appears to be a nave extending from its base.
The square itself is spacious and paved, with patches of people scattered across it, some walking and others sitting, enjoying the day. The area is surrounded by a mix of architectural styles, from traditional to modern, with storefronts and signs indicating a commercial zone bustling with activity. Notably, there’s a sign that reads “Hooghoudt,” which could be a reference to a local business or landmark, suggesting this might be a square in a Dutch-speaking region, possibly in the Netherlands, given the style of the buildings and the language on the signs.
Outdoor seating is available in front of what appears to be a café or restaurant, inviting passersby to sit and enjoy the ambiance of the square. The presence of bicycles in the foreground hints at a bike-friendly city with an appreciation for sustainable transportation. Overall, the image captures a moment of everyday life in a European city, where history and modernity blend seamlessly, and the public space is cherished and well-utilized by its inhabitants.
Attempt to fool the AI with Big-ben-tower-london.jpg
The image you’ve provided does not depict the Big Ben tower in London. Instead, it shows a lively European city square under a clear blue sky, crisscrossed with the contrails of airplanes, which create a dynamic pattern overhead. Dominating the scene is a tall, historic bell tower and church with elaborate architecture, including a series of arched windows, decorative stonework, and a multi-tiered spire that culminates in a greenish roof, typical of the Gothic or Renaissance style prevalent in many European cities.
The square itself is spacious and paved, with a scattering of people that suggests a relaxed, leisurely atmosphere. There are several commercial buildings with storefronts around the square, and outdoor seating areas filled with patrons suggest the presence of cafes or restaurants. Notable signs on the buildings include “ACARD” and “Hooghoudt,” which might hint at local businesses or brands.
This place is likely a central gathering spot in the city, possibly for socializing, dining, and enjoying the open space. The presence of the church implies that this could also be a historical or cultural landmark area. The overall ambiance is one of a serene, well-maintained urban environment that balances historical architecture with modern life.
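The three runs above differ only in the file name line of the user prompt. If you want to repeat the comparison, the prompt variants could be generated along these lines. This is a sketch: `prompt_variants` is a hypothetical helper, `None` stands for "no file name line", and the names are written without the "./" prefix used in the script.

```python
BASE_PROMPT = "Create a fulfilling description of where and what the image depicts."

def prompt_variants(base, filenames):
    """Map each file name hint (or None for no hint) to the prompt text to send."""
    variants = {}
    for name in filenames:
        label = name or "(no file name)"
        variants[label] = base if name is None else f"{base}\n(user uploaded {name})"
    return variants

hints = ["Martinitoren-Groningen-Martini-Hotel.jpg", None, "Big-ben-tower-london.jpg"]
for label, text in prompt_variants(BASE_PROMPT, hints).items():
    print(f"--- {label}\n{text}\n")
```

Each text would then be sent with the same image and the same system message, so that the file name hint is the only thing that changes between runs.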