Has anyone managed to get gpt-4-vision-preview working in PowerShell?
Starting from the code here: OpenAI Platform
I’ve converted it to PowerShell like this:
$apiKey = "............."
$endpoint = "https://api.openai.com/v1/chat/completions"
$headers = @{
    "Content-Type" = "application/json"
    "Authorization" = "Bearer $apiKey"
}
$imageUrl = "https://www.google.com/images/branding/googlelogo/2x/googlelogo_light_color_272x92dp.png"
# Construct the body payload
$body = @{
    "messages" = @(
        @{
            "role" = "user"
            "content" = @(
                @{
                    "type" = "text"
                    "text" = "What’s in this image?"
                },
                @{
                    "type" = "image_url"
                    "image_url" = @{
                        "url" = $imageUrl
                    }
                }
            )
        }
    )
    "model" = "gpt-4-vision-preview"
    "max_tokens" = 3000
} | ConvertTo-Json
# Make the API request
$result = Invoke-RestMethod -Uri $endpoint -Headers $headers -Method Post -Body $body
# Access the response data
$result.choices[0].message.content
However, the response suggests that the API is treating the entire “type=image_url” content as plain text rather than as a structured request. The response I get is:
In .NET, a Hashtable is a collection of key-value pairs that are organized based on the hash code of the key [...]
I guess these might be early days and the feature’s not quite working yet, but I don’t see much discussion about it, considering the millions of people who are presumably jumping on board right now.
I think you should add “-Depth #DEPTHLEVEL#” to ConvertTo-Json when using nested arrays.
I would also consider adding -Compress to ConvertTo-Json.
I tried to test it here, but my chatgpt-vision access is not active yet.
Once it’s active, I will come back with results.
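To illustrate the -Depth suggestion above, here is a minimal sketch, not tested against the API itself. It assumes the same nested payload shape as the question; the image URL and the depth value of 10 are placeholders chosen for illustration. ConvertTo-Json serializes only 2 levels by default and turns anything deeper into its .NET type name, which may or may not be what happened in the original request.

```powershell
# Same nested payload shape as in the question; the URL is a placeholder.
$body = @{
    model      = "gpt-4-vision-preview"
    max_tokens = 300
    messages   = @(
        @{
            role    = "user"
            content = @(
                @{ type = "text"; text = "What is in this image?" },
                @{ type = "image_url"; image_url = @{ url = "https://example.com/image.png" } }
            )
        }
    )
}

# Default depth (2): deeper hashtables are replaced by their type name.
# PowerShell 7 warns about the truncation, so the warning is silenced here.
$shallow = $body | ConvertTo-Json -Compress -WarningAction SilentlyContinue

# Explicit depth (10 is an arbitrary value, larger than the nesting used here).
$deep = $body | ConvertTo-Json -Depth 10 -Compress

$shallow -match 'System\.Collections\.Hashtable'   # True: structure was lost
$deep    -match 'System\.Collections\.Hashtable'   # False: structure preserved
```

Printing $shallow and $deep before calling Invoke-RestMethod shows exactly what would be sent as the request body, which makes this kind of truncation easy to spot.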
I appreciate your response, but the issue occurred before I did any processing of the result. The message content extracted from the response was literally a paragraph explaining what a hashtable is.