Unable to directly analyze or view the content of files like (local) images

Hello fellows,
I cannot get GPT-4o to read and process a local image through the API.

The console.log(responseContent.choices); returns

[
  {
    index: 0,
    message: {
      role: 'assistant',
      content: `I'm unable to directly analyze or view the content of files like images...

Here is only the relevant part of my script.js:

// Function to interact with OpenAI API and show upload progress
async function uploadWithProgress(filePath, mimeType) {
    const fileSize = (await fs.stat(filePath)).size;
    const bar = new ProgressBar(`Uploading [:bar] :percent :etas | ${path.basename(filePath)} `, {
        total: fileSize,
        width: 40,
        complete: '=',
        incomplete: ' ',
        clear: true,
    });

    const messages = [
        {
            role: 'system',
            content: `File Content: ${path.basename(filePath)}`
        },
        {
            role: 'user',
            content: 'please analyze the content of this file and provide the response in JSON format.'
        }
    ];

    const requestBody = JSON.stringify({
        model: API_MODEL,
        messages: messages
    });

    const options = {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json',
            'Authorization': `Bearer ${OPENAI_API_KEY}`,
        },
    };

Am I forced to use type: "image_url", or is there anything I am missing to use local images?

Thank you in advance for any help!


First, you need the vision format (image_url) when sending images. Then, for local images, encode the file as base64.
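As a minimal sketch of that advice, the snippet below wraps raw image bytes in a base64 data URL and places it in an image_url content part. The helper names (`mimeTypeFor`, `toDataUrl`, `imagePart`) and the extension table are hypothetical, introduced only for illustration. The table assumes the JPEG, PNG, GIF and WebP formats, so check the current documentation for what the model actually accepts.

```javascript
// Hypothetical lookup table (an assumption, not from the thread):
// file extension -> MIME type used in the data URL prefix.
const MIME_BY_EXTENSION = {
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  png: "image/png",
  gif: "image/gif",
  webp: "image/webp",
};

// Derive the MIME type from the file extension; throw on anything unknown.
function mimeTypeFor(filePath) {
  const ext = filePath.slice(filePath.lastIndexOf(".") + 1).toLowerCase();
  const mime = MIME_BY_EXTENSION[ext];
  if (!mime) throw new Error(`Unsupported image extension: ${ext}`);
  return mime;
}

// Encode the bytes as base64 and prepend the data URL header.
function toDataUrl(imageBytes, mimeType) {
  const base64 = Buffer.from(imageBytes).toString("base64");
  return `data:${mimeType};base64,${base64}`;
}

// Build the image_url content part that goes inside a user message.
function imagePart(imageBytes, filePath) {
  return {
    type: "image_url",
    image_url: { url: toDataUrl(imageBytes, mimeTypeFor(filePath)) },
  };
}
```

In a real script the bytes would come from something like `await readFile(filePath)`, and the returned object would be one element of the user message's `content` array.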


What you are attempting to send is significantly wrong.

You tag this “assistants”, but Assistants does not support uploading files in messages, even if encoded to base64.

Yet you build a requestBody with model and messages, which hints that you are using chat.completions and not Assistants.

Besides not encoding the image properly and sending it as part of the content of a message, you are trying to put an image into a system message, where it is not permitted. The system message should instead carry instructions, for example that the AI CAN look at images.
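To illustrate the split described above, here is a rough sketch of a messages array in which the system message holds only text instructions and the image travels in the user message. The function name `buildVisionMessages` and the example strings are hypothetical, and the data URL is a placeholder rather than a real image.

```javascript
// Hypothetical helper (not from the thread): text-only instructions go in the
// system message; the image goes in the user message as an image_url part.
function buildVisionMessages(systemInstructions, userText, imageDataUrl) {
  return [
    // System message: plain string, no image parts.
    { role: "system", content: systemInstructions },
    {
      role: "user",
      // User message: an array of parts, mixing text and image.
      content: [
        { type: "text", text: userText },
        { type: "image_url", image_url: { url: imageDataUrl } },
      ],
    },
  ];
}

// Example call; the base64 payload is elided on purpose.
const messages = buildVisionMessages(
  "You can look at images. Reply in JSON.",
  "Please analyze the content of this image.",
  "data:image/jpeg;base64,..."
);
```

The resulting array is what would be passed as `messages` in a chat.completions request.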

This previous forum post has simple application code showing how to make a system message for a task, how a user provides the images, and how they must be encoded.

Or how to create the user message alone

Or how those messages would individually appear as content, with the user providing instructions

So the fault is not “gpt-4o API unable to directly analyze”, but rather “user unable to directly analyze API reference documentation.” Hopefully the links I provided serve as better documentation.


Straight to the point, thanks.
So, the local file can be processed using the image_url argument (after being converted into base64 format).

For those wondering about how to have a small node script.js that can read a local image and return output in JSON format, here you are:

import OpenAI from "openai";
import { OPENAI_API_KEY } from './tokens.js';
import { readFile } from 'fs/promises';

const openai = new OpenAI({ apiKey: OPENAI_API_KEY });
const imagePath = './photo_input/TEST.jpeg';

async function encodeImageToBase64(imagePath) {
  try {
    const data = await readFile(imagePath);
    return data.toString('base64');
  } catch (err) {
    console.error('NOT ABLE TO READ THE FILE:', err);
  }
}

async function main() {
  const base64_image = await encodeImageToBase64(imagePath);
  if (!base64_image) return;

  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        content: `You are ...`
      },
      {
        role: "user",
        content: [
          { type: "text", text: "please analyze the content of this image and provide the response in JSON format following this scheme: ..." },
          {
            type: "image_url",
            image_url: {
              url: `data:image/jpeg;base64,${base64_image}`,
            },
          },
        ],
      },
    ],
  });
  console.log(response.choices[0]);
}

main();
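
Since the script asks for a json_object response, the reply text in `message.content` arrives as a JSON string and still has to be parsed. A minimal sketch of that step follows; `parseJsonReply` is a hypothetical helper, not part of the openai package, and the check on `finish_reason` assumes a reply cut off by the token limit may contain incomplete JSON.

```javascript
// Hypothetical helper (not part of the openai package): turn one entry of
// response.choices into a JavaScript object.
function parseJsonReply(choice) {
  // "length" means the model stopped at the token limit, so the JSON text
  // may be truncated; fail early with a clear message.
  if (choice.finish_reason === "length") {
    throw new Error("Reply was cut off by the token limit; JSON is likely incomplete");
  }
  try {
    return JSON.parse(choice.message.content);
  } catch (err) {
    throw new Error(`Reply is not valid JSON: ${err.message}`);
  }
}
```

In the script above it could replace the final console.log, for example `console.log(parseJsonReply(response.choices[0]));`.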
