Hello,
I was trying to submit a batch job for multimodal request with a text and an image. Since the documentation for batch jobs is only single modality (https://platform.openai.com/docs/guides/batch#1-prepare-your-batch-file ), I assumed I could use the format for Images and Vision (https://platform.openai.com/docs/guides/images?api-mode=responses&format=base64-encoded ) to generate something like:
{
“custom_id”: id,
“method”: “POST”,
“url”: “/v1/responses”,
“body”: {
“model”: model,
“messages”: [{
“role”: “user”, “content”: [
{“type”: “input_text”,
“text”: prompt},
{“type”: “input_image”,
“url”: f"data:image/png;base64,{base64_image}"}
]}
}
}
However, I got the following error:
{“error”: {“message”: "Invalid value: ‘input_image’. Supported values are: ‘text’, ‘image_url’, ‘input_audio’, ‘refusal’, ‘audio’, and ‘file’.}
However. upon changing the content type to image_url I got the following error which I could not resolve:
{“error”: {“message”: “Invalid type for ‘messages[0].content[1].image_url’: expected an object, but got a string instead.”}
Does anyone have experience with submitting multimodal (text + image) batched jobs to the API?
2 Likes
sps
April 17, 2025, 4:38pm
2
Welcome to the dev community @Aylin_Akkus
The vision docs for responses API mention this structure:
{
"type": "input_image",
"image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
"detail": "high",
}
I can see that in the sample request shared by you, ”url”
is being used instead of “image_url"
4 Likes
Вот минимальный и рабочий шаблон batch.jsonl для отправки мультимодального (текст + изображение) запроса через OpenAI Batch API:
batch.jsonl:
{
“custom_id”: “image_test_001”,
“method”: “POST”,
“url”: “/v1/chat/completions”,
“body”: {
“model”: “gpt-4-vision-preview”,
“messages”: [
{
“role”: “user”,
“content”: [
{
“type”: “text”,
“text”: “Что изображено на картинке?”
},
{
“type”: “image_url”,
“image_url”: {
“url”: “…”
}
}
]
}
],
“max_tokens”: 1000
}
}
Что тебе нужно сделать:
Заменить data:image/png;base64,… на фактическое base64-энкодированное изображение.
Убедись, что нет лишних пробелов или переносов строк.
Если большое изображение — желательно использовать .webp или уменьшенное .jpg для оптимизации.
1 Like