ogmios
1
I assume that docs and other things are not accurate or finished yet?
Anyone had any luck being able to send in the detail parameter referred to here?
Anyone figured out how to get token cost back too?
2 Likes
This part of the docs covers the cost calculation.
I'm able to send the detail parameter like below:
{
    "type": "image_url",
    "image_url": {
        "url": the image url,
        "detail": "high"
    }
}
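Turning that docs calculation into code, a rough estimator could look like the sketch below. The constants (85 base tokens, 170 per 512 px tile, the 2048/768 resize rules) are taken from the published pricing description and should be treated as assumptions; response.usage remains the authoritative count.
import math

def estimate_image_tokens(width, height, detail="high"):
    # Rough estimate of image tokens following the documented tiling rules.
    if detail == "low":
        return 85  # low detail is a flat per-image cost
    # Fit the image within a 2048 x 2048 square.
    if max(width, height) > 2048:
        scale = 2048 / max(width, height)
        width, height = width * scale, height * scale
    # Scale down so the shortest side is at most 768 px.
    if min(width, height) > 768:
        scale = 768 / min(width, height)
        width, height = width * scale, height * scale
    # 170 tokens per 512 px tile, plus a fixed 85.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 170 * tiles + 85

print(estimate_image_tokens(2560, 1669, detail="high"))  # e.g. the boardwalk image below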
ogmios
3
Thanks.
Using their sample code gives me an error:
from openai import OpenAI
import os
from dotenv import load_dotenv

load_dotenv()
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                    "detail": "high"
                },
            ],
        }
    ],
    max_tokens=300,
)
# Display the first choice from the response
print(response.choices[0])
Error code: 400 - {'error': {'message': 'Invalid chat format. Unexpected keys in a message content image dict.', 'type': 'invalid_request_error', 'param': None, 'code': None}}
On the cost calculation, I did look at it earlier and tried to get an estimate, but I was hoping for a way to get the exact cost back in the response.
1 Like
vsg
4
I also had success using detail within the image_url parameter.
Have you tried updating the Python library for OpenAI? It looks like it was updated 11 hours ago.
ogmios
5
Yup, updated the library, etc.
I was able to get the basic example working in Postman, using the API endpoint with this payload:
{
    "model": "gpt-4-vision-preview",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                        "detail": "low"
                    }
                }
            ]
        }
    ]
}
and get the token usage in my Python script by using
print(response.usage)
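For an approximate dollar figure you can multiply those usage counts by the listed prices yourself; the per-token prices below are assumptions, so check the pricing page:
PROMPT_PRICE_PER_1K = 0.01      # assumed input price per 1K tokens for gpt-4-vision-preview
COMPLETION_PRICE_PER_1K = 0.03  # assumed output price per 1K tokens

usage = response.usage
estimated_cost = (usage.prompt_tokens * PROMPT_PRICE_PER_1K
                  + usage.completion_tokens * COMPLETION_PRICE_PER_1K) / 1000
print(f"prompt={usage.prompt_tokens}, completion={usage.completion_tokens}, ~${estimated_cost:.4f}")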
But anything I've tried so far with detail while using the OpenAI Python library gives me errors.
1 Like
kevinv
6
You're getting the error "Unexpected keys in a message content image dict" because the structure expected by the API is a dictionary within a dictionary, while your structure is a flat dictionary.
You're placing "detail": "low" in the same dictionary as "type": "image_url", but the API expects the image_url key to map to another dictionary that contains both the url and detail keys. That is why the error said there were unexpected keys in the message content image dict: it wasn't expecting detail to sit at the same level as type.
The correct implementation nests url and detail within another dictionary assigned to the image_url key.
I fixed it for you and now it works. 
from openai import OpenAI
import os

client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                        "detail": "low"
                    }
                },
            ],
        }
    ],
    max_tokens=300,
)
# Display the first choice from the response
print(response.choices[0])
Don't forget to reimplement dotenv again, though! I left it out because I ran the above code to verify that it's working, and I don't use dotenv myself since I put my key into my system settings as per these instructions (see the "Setup your API key for all projects" section) and use os.getenv.
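If you do want dotenv back, a minimal sketch is just these lines before the create call:
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()      # reads OPENAI_API_KEY from a local .env file into the environment
client = OpenAI()  # the client then picks the key up from the environment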
1 Like
ogmios
7
Thanks, I actually figured it out earlier.
Here is some sample code if anyone is interested:
import base64
import cv2

def generate_description(frames, detail_level):
    print("Encoding frames...")
    base64_frames = [
        base64.b64encode(cv2.imencode(".jpg", frame)[1]).decode("utf-8")
        for frame in frames
    ]
    print("Preparing data URIs...")
    data_uris = [
        f"data:image/jpeg;base64,{frame}" for frame in base64_frames
    ]
    image_dicts = [
        {
            "type": "image_url",
            "image_url": {
                "url": data_uri,
                "detail": detail_level
            }
        }
        for data_uri in data_uris
    ]
    prompt_messages = [
        {
            "role": "user",
            "content": [
                "These are frames from a video that I want to upload. Generate a compelling description that I can upload along with the video.",
                *image_dicts,
            ],
        },
    ]
    params = {
        "model": "gpt-4-vision-preview",
        "messages": prompt_messages,
        "max_tokens": 500,
    }
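The snippet stops at building params; a hedged sketch of how the rest of the function might continue (assuming a client created as in the earlier examples, with illustrative variable names):
    result = client.chat.completions.create(**params)
    print(result.usage)  # token counts, the closest thing to cost the API returns
    return result.choices[0].message.content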