GSEPE
November 6, 2024, 6:59pm
Hello fellows,
I cannot get GPT-4o to read and process a local image via the API.
The console.log(responseContent.choices);
returns
[
  {
    index: 0,
    message: {
      role: 'assistant',
      content: `I'm unable to directly analyze or view the content of files like images...
I'm sharing only part of my script.js here:
import { promises as fs } from 'fs';
import path from 'path';
import ProgressBar from 'progress'; // npm "progress" package
// OPENAI_API_KEY and API_MODEL are defined elsewhere in the script

// Function to interact with OpenAI API and show upload progress
async function uploadWithProgress(filePath, mimeType) {
  const fileSize = (await fs.stat(filePath)).size;
  const bar = new ProgressBar(`Uploading [:bar] :percent :etas | ${path.basename(filePath)} `, {
    total: fileSize,
    width: 40,
    complete: '=',
    incomplete: ' ',
    clear: true,
  });
  const messages = [
    {
      role: 'system',
      content: `File Content: ${path.basename(filePath)}`
    },
    {
      role: 'user',
      content: 'please analyze the content of this file and provide the response in JSON format.'
    }
  ];
  const requestBody = JSON.stringify({
    model: API_MODEL,
    messages: messages
  });
  const options = {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${OPENAI_API_KEY}`,
    },
  };
  // ... rest of the function omitted ...
Am I forced to use type: "image_url",
or am I missing something that would let me use local images?
Thank you in advance for any help!
First, you need the vision format (image_url) when sending images. Then, for local images, use base64 encoding.
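In Node, the encoding step can look like this (a minimal sketch, with a hypothetical local path):

import fs from 'fs';

// Read the local image and encode its raw bytes as base64.
const base64Image = fs.readFileSync('./photo_input/TEST.jpeg').toString('base64');

// The vision format expects a data URL inside an image_url content part.
const imageUrl = `data:image/jpeg;base64,${base64Image}`;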
_j
November 7, 2024, 2:24am
What you are attempting to send is wrong in several ways.
You tagged this “assistants”, but Assistants does not support uploading files within messages, even if encoded to base64.
Yet you build a requestBody with model and messages, which hints that you are using chat.completions and not Assistants.
Besides not encoding the image properly and sending it as part of a message's contents, you are trying to put an image into a system message, where it is not permitted; the system message is instead where you should put instructions that the AI CAN look at images.
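For illustration, a minimal sketch (the instruction text is hypothetical, and base64Image is assumed to already hold the encoded file):

const messages = [
  {
    role: 'system',
    // Instructions belong here, including telling the model it CAN use its vision
    content: 'You are an assistant with computer vision. Analyze any image the user attaches.'
  },
  {
    role: 'user',
    // The image itself goes in the user message, as an image_url content part
    content: [
      { type: 'text', text: 'Please analyze this image.' },
      { type: 'image_url', image_url: { url: `data:image/jpeg;base64,${base64Image}` } }
    ]
  }
];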
This previous forum post has simple application code showing how to make a system message for a task, how a user provides the images, and how they must be encoded.
Hi all of you.
The above doesn’t overcome any issues.
If you get refusals, it may be because your “user” is deliberately asking about identifying people, or that your “system” message hasn’t overcome pretraining against this use.
The AI also must be encouraged to have and use its own image computer vision. It is given to you in a state of “not knowing”.
Having a structured output and a singular task makes it less likely to receive a refusal API return, which you should handle, along with oth…
Or how to create the user message alone:
There are multiple methods to pass images, and also methods exclusive to Assistants.
Take for example this Python code, which supplies alternating strings or base64 images (with size and capability under your control) as user message contents, without the JSON-like “type”:
import base64
import os

image_paths = ["./img1.png", "./img2.png"]
file_names = [os.path.basename(path) for path in image_paths]
base64_images = []
for path in image_paths:
    with open(path, "rb") as image_file:
        base64_image = base64.b64encode(image_file.read()).decode("utf-8")
        base64_images.append(base64_image)
Or how those messages would individually appear as content, with the user providing instructions:
It would be essentially the same as sending other few-shot examples: you give an input, and you demonstrate the way the AI responds, so that it can begin following a pattern.
These latest models, such as the 1106 version of gpt-4-turbo that vision is based on, are highly-trained on chat responses, so previous input will show far less impact on behavior.
After the system message (that still needs some more demonstration to the AI), you then pass example messages as if they were chat that occurr…
So the fault is not “gpt-4o API unable to directly analyze”, but rather “user unable to directly analyze API reference documentation.” Hopefully the links I provided serve as better documentation.
GSEPE
November 7, 2024, 11:54am
Straight to the point, thanks.
So, the local file can be processed using the image_url argument (after being converted to base64).
For those wondering how to write a small Node script.js that reads a local image and returns output in JSON format, here you are:
import OpenAI from "openai";
import { OPENAI_API_KEY } from './tokens.js';
import { readFile } from 'fs/promises';

const openai = new OpenAI({ apiKey: OPENAI_API_KEY });
const imagePath = './photo_input/TEST.jpeg';

// Read the local image and encode it as base64
async function encodeImageToBase64(imagePath) {
  try {
    const data = await readFile(imagePath);
    return data.toString('base64');
  } catch (err) {
    console.error('NOT ABLE TO READ THE FILE:', err);
  }
}

async function main() {
  const base64_image = await encodeImageToBase64(imagePath);
  if (!base64_image) return;

  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      {
        role: 'system',
        content: `You are ...`
      },
      {
        role: "user",
        content: [
          { type: "text", text: "please analyze the content of this image and provide the response in JSON format following this scheme: ..." },
          {
            type: "image_url",
            image_url: {
              url: `data:image/jpeg;base64,${base64_image}`,
            },
          },
        ],
      },
    ],
  });

  console.log(response.choices[0]);
}

main();
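One caveat: the script hardcodes image/jpeg in the data URL. If you feed it other formats, the prefix should match the actual file; for example (a small sketch, where the mimeTypes map is a hypothetical helper):

import path from 'path';

// Map common image extensions to MIME types so the data URL prefix
// matches the file being sent (hypothetical helper, not part of the SDK).
const mimeTypes = {
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.png': 'image/png',
  '.webp': 'image/webp',
  '.gif': 'image/gif',
};
const mimeType = mimeTypes[path.extname(imagePath).toLowerCase()] ?? 'image/jpeg';
// ...then build the image part as: `data:${mimeType};base64,${base64_image}`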