I’m encountering an issue with the vision API regarding the handling of multiple images.
For example, when submitting two image URLs and requesting descriptions, I’m able to coax it into mostly returning a valid JSON list of descriptions. However, it’s unclear whether the descriptions are returned in the same order as the URLs provided. This ambiguity prevents me from confidently mapping returned_image_descriptions[0] to the first image URL and returned_image_descriptions[1] to the second. Has anyone else experienced this, and is there a way to ensure the responses correspond deterministically to the order of the submitted image URLs?
I’ve tried making it return a JSON list of objects, where each object follows a schema like:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "url": {
      "type": "string",
      "format": "uri"
    },
    "description": {
      "type": "string"
    }
  },
  "required": [
    "url",
    "description"
  ]
}
But the url fields end up with made-up URLs rather than the actual image URLs provided in the request.
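One workaround I'm considering, sketched below, is to stop asking the model to echo URLs at all: put a numbered text label before each image, ask for that number back, and join on it client-side. This is only a sketch under assumptions. The helper names `build_content` and `map_descriptions` are made up for illustration, the `image_url` content-part shape is a guess at the request format and may need adjusting, and nothing here guarantees the model numbers the images correctly. It only lets me detect missing, duplicated, or out-of-range entries.

```python
import json


def build_content(urls):
    """Build a content list with a numbered text label before each image."""
    parts = [{
        "type": "text",
        "text": (
            "Describe each image. Return a JSON list of objects with keys "
            "'index' (the number from the label preceding the image) "
            "and 'description'."
        ),
    }]
    for i, url in enumerate(urls):
        parts.append({"type": "text", "text": f"Image {i}:"})
        # Assumed content-part shape; adjust to what the API actually expects.
        parts.append({"type": "image_url", "image_url": {"url": url}})
    return parts


def map_descriptions(urls, raw_json):
    """Join the model's reply back to the submitted URLs by index."""
    items = json.loads(raw_json)
    by_index = {item["index"]: item["description"] for item in items}
    # Reject replies with duplicate, missing, or unexpected indices.
    if len(items) != len(urls) or set(by_index) != set(range(len(urls))):
        raise ValueError(f"unexpected indices in reply: {list(by_index)}")
    return {url: by_index[i] for i, url in enumerate(urls)}


urls = ["https://example.com/a.jpg", "https://example.com/b.jpg"]
# Hypothetical reply, deliberately out of order.
reply = '[{"index": 1, "description": "a dog"}, {"index": 0, "description": "a cat"}]'
print(map_descriptions(urls, reply))
```

The point of the design is that the URL never passes through the model, so it can't be hallucinated. The only thing I'd still be trusting is that the model associates each label with the image that follows it.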