Which is the correct model for image analysis?

import OpenAI from "openai"
import axios from "axios" // For downloading the image
import fs from "fs/promises" // Optional, for debugging by saving the image locally

const openai = new OpenAI()

async function fetchImageAsBase64(url) {
	try {
		const response = await axios.get(url, { responseType: "arraybuffer" })
		const base64 = Buffer.from(response.data).toString("base64")
		const mimeType = response.headers["content-type"] // Get MIME type from headers
		return `data:${mimeType};base64,${base64}`
	} catch (error) {
		console.error("Failed to fetch the image:", error.message)
		throw error
	}
}

async function imageDescription() {
	try {
		// Example image URL
		const imageUrl = "https://pbs.twimg.com/profile_images/992464734606655488/DVvC0bK6_400x400.jpg"
		const imageBase64 = await fetchImageAsBase64(imageUrl)

		// Send the request to the OpenAI API
		const response = await openai.chat.completions.create({
			model: "gpt-4-turbo",
			max_tokens: 300,
			messages: [
				{ role: "user", content: "What is this picture of?" },
				{ role: "user", content: imageBase64 }
			]
		})

		// Output the response
		console.log("Response:", response.choices[0].message.content)
	} catch (error) {
		console.error("API call failed:", error.response?.data || error.message)
	}
}

imageDescription()

This does not work. I tried many variations and none of them work.

This “messages” format is a complete fabrication; nothing like it exists in the API.

You need to carefully review the API reference for chat completions and construct the multi-block user content properly, as in the sketch below.
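Something along these lines should work. This is a minimal sketch only; the model name, image URL, and function name are placeholders, so swap in your own:

import OpenAI from "openai"

const openai = new OpenAI()

async function describeImage() {
	const response = await openai.chat.completions.create({
		model: "gpt-4o", // any vision-capable model
		messages: [
			{
				role: "user",
				// One user message whose content is an array of blocks
				content: [
					{ type: "text", text: "What is this picture of?" },
					{ type: "image_url", image_url: { url: "https://example.com/image.jpg" } }
				]
			}
		]
	})
	console.log(response.choices[0].message.content)
}

describeImage()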

Models: https://platform.openai.com/docs/models

Thanks, this works completely fine now:

import OpenAI from "openai"

const openai = new OpenAI()

async function imageDescription() {
	const response = await openai.chat.completions.create({
		model: "gpt-4o",
		messages: [
			{
				role: "user",
				content: [
					{ type: "text", text: "What's in this image?" },
					{
						type: "image_url",
						image_url: {
							url: "https://www.hollywoodreporter.com/wp-content/uploads/2023/05/GettyImages-946730430-H-2023.jpg?w=1296"
						}
					}
				]
			}
		]
	})
	console.log(response.choices[0])
	const assistantMessage = response.choices[0].message.content
	console.log(`AI: ${assistantMessage}`)
	console.log(`Token Usage: ${response.usage.total_tokens} tokens used`)
}
imageDescription()
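One note on the original base64 approach: the image_url block also accepts a data URL, so the fetchImageAsBase64 helper from the first post can be reused when the image is not publicly reachable. A rough sketch of that variant, reusing that helper and the openai client defined above (not re-tested here):

async function describeLocalImage(imageUrl) {
	// Returns a string like "data:image/jpeg;base64,...."
	const imageBase64 = await fetchImageAsBase64(imageUrl)
	const response = await openai.chat.completions.create({
		model: "gpt-4o",
		messages: [
			{
				role: "user",
				content: [
					{ type: "text", text: "What's in this image?" },
					// Pass the data URL in place of an https URL
					{ type: "image_url", image_url: { url: imageBase64 } }
				]
			}
		]
	})
	return response.choices[0].message.content
}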

I was able to do this pretty easily with “vision” in the Assistants API. I have not done it with chat completions though. The Assistant was pretty cranky about the image format when trying to pass it in context.

My workaround was to give the image to GPT as a file, then attach that file to the thread and run the assistant query on it. I tried several different methods of attaching the file, but this seemed the most reliable. When uploading the file, make sure you give it “vision” and not “assistants” as the purpose.

Can you please provide your Node code?

Here are some snippets. Just make sure to use the file id of the uploaded file in the add-message API call. You can also make this more compact by creating the thread with the initial message, but I find it easier to break the steps out when debugging and learning. A rough sketch of what these helpers might look like follows after the steps.

1. Add the file to OpenAI:

	let file = await fileCreate(pathToImage, 'vision');

2. Create the thread:

	let c_thread = await createThread();

3. Put a message on the thread; just give the payload the file id:

	let messageId = await addImageFile(c_thread, file.id);

   Inside that call, the message payload's content is:

	content: [
		{
			"type": "image_file",
			"image_file": {
				"file_id": file.id,   // file id from the response object of the upload
				"detail": "auto"
			}
		}
	],

4. Run the assistant on the thread with whatever your query is:

	const run = await runAssistant(c_thread);
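For completeness, here is roughly how those helper wrappers could be written with the openai Node SDK. This is my own sketch of what fileCreate, createThread, addImageFile, and runAssistant might look like, not the original poster's code; the assistant id and the question text are placeholders:

import fs from "fs"
import OpenAI from "openai"

const openai = new OpenAI()

// 1. Upload the image; purpose must be "vision", not "assistants"
async function fileCreate(pathToImage, purpose) {
	return openai.files.create({ file: fs.createReadStream(pathToImage), purpose })
}

// 2. Create an empty thread
async function createThread() {
	return openai.beta.threads.create()
}

// 3. Add a user message that references the uploaded file by id
async function addImageFile(thread, fileId) {
	return openai.beta.threads.messages.create(thread.id, {
		role: "user",
		content: [
			{ type: "text", text: "What is in this image?" }, // placeholder query
			{ type: "image_file", image_file: { file_id: fileId, detail: "auto" } }
		]
	})
}

// 4. Run the assistant on the thread
async function runAssistant(thread) {
	return openai.beta.threads.runs.create(thread.id, { assistant_id: "asst_XXXX" }) // placeholder assistant id
}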