Issues when calling the vision API to get information on an image

mironconsulting88 · July 25, 2024, 6:19pm

I keep getting this response instead of an analysis of the image:

"result": "I'm unable to view images directly from URLs. However, if you can provide specific details about the card, such as any text, numbers, symbols, or descriptions visible on it, I can help you identify it. Information such as the card name, edition, year, or any unique characteristics would be very useful."

This is my index.js file:

const express = require('express');
const OpenAI = require('openai');
const dotenv = require('dotenv');

// Load environment variables from .env file
dotenv.config();

const app = express();

// Initialize OpenAI API
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

async function identifyCardFromImage(imageUrl) {
  try {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        { role: 'system', content: 'You are an expert in identifying trading cards.' },
        { role: 'user', content: `Identify the details of this card from the image URL: ${imageUrl}` },
        { role: 'user', content: `Image URL: ${imageUrl}` },
      ],
    });
    return response.choices[0].message.content;
  } catch (error) {
    console.error('Error during OpenAI API request:', error);
    throw error;
  }
}

app.get('/identify-card', async (req, res) => {
  const imageUrl = req.query.imageUrl; // Expecting imageUrl to be passed as query parameter

  if (!imageUrl) {
    return res.status(400).json({ error: 'imageUrl query parameter is required' });
  }

  try {
    const result = await identifyCardFromImage(imageUrl);
    res.json({ result });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`);
});

darcschnider · July 25, 2024, 6:27pm

Here’s a modified version of your index.js file that includes a step for extracting text from the image before passing it to the OpenAI model:

const express = require('express');
const { OpenAI } = require('openai');
const dotenv = require('dotenv');
const axios = require('axios');

// Load environment variables from .env file
dotenv.config();

const app = express();

// Initialize OpenAI API
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

// Function to extract text from image using an image recognition API
async function extractTextFromImage(imageUrl) {
  try {
    // Example using an OCR API (replace with actual API call and handle accordingly)
    const response = await axios.post('YOUR_OCR_API_URL', { imageUrl });
    return response.data.text; // Extracted text from the image
  } catch (error) {
    console.error('Error during OCR API request:', error);
    throw error;
  }
}

async function identifyCardFromText(cardDetails) {
  try {
    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      messages: [
        { role: 'system', content: 'You are an expert in identifying trading cards.' },
        { role: 'user', content: `Identify the details of this card: ${cardDetails}` },
      ],
    });
    return response.choices[0].message.content;
  } catch (error) {
    console.error('Error during OpenAI API request:', error);
    throw error;
  }
}

app.get('/identify-card', async (req, res) => {
  const imageUrl = req.query.imageUrl; // Expecting imageUrl to be passed as query parameter

  if (!imageUrl) {
    return res.status(400).json({ error: 'imageUrl query parameter is required' });
  }

  try {
    const cardDetails = await extractTextFromImage(imageUrl);
    const result = await identifyCardFromText(cardDetails);
    res.json({ result });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`);
});

Key Points:

OCR API Integration: Replace the placeholder YOUR_OCR_API_URL with an actual OCR API endpoint.
Error Handling: Ensure that both the OCR and OpenAI API calls have proper error handling.
API Keys: Securely manage and store API keys, and ensure they are loaded correctly from your environment.

This approach should allow you to analyze the image and extract the necessary textual information for further processing.

I have not tested this, but it should point you in the right direction.

mironconsulting88 · July 25, 2024, 6:38pm

Thanks for your input, much appreciated, but I'm looking for a way for it to analyze the image as an image, not via text or by matching it using OCR. When I ask 4o directly, I upload the screenshot and it sends back a response.

supershaneski · July 25, 2024, 11:33pm

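Just update the format of your message to the one used for vision. In your code the URL is sent as plain text inside the prompt, so the model only sees a string and never receives the image itself.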

async function identifyCardFromImage(imageUrl) {
  try {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        { role: 'system', content: 'You are an expert in identifying trading cards.' },
        {
          role: 'user',
          content: [
            { type: 'text', text: 'Identify the details of this card from the image' },
            { type: 'image_url', image_url: { url: imageUrl } },
          ],
        },
      ],
    });
    return response.choices[0].message.content;
  } catch (error) {
    console.error('Error during OpenAI API request:', error);
    throw error;
  }
}
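
If the image is a local file or a screenshot rather than a publicly reachable URL, the same `image_url` field should also accept a base64 data URL. Below is a minimal sketch of that idea, not a tested integration. The helper names (`toDataUrl`, `fileToDataUrl`, `buildVisionMessage`) and the extension-to-MIME table are assumptions made for illustration; only the message shape comes from the function above.

```javascript
const fs = require('fs');
const path = require('path');

// Assumed mapping of file extensions to MIME types; extend as needed.
const MIME_TYPES = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.webp': 'image/webp',
  '.gif': 'image/gif',
};

// Hypothetical helper: encode raw image bytes as a data URL.
function toDataUrl(buffer, mimeType) {
  return `data:${mimeType};base64,${buffer.toString('base64')}`;
}

// Hypothetical helper: read a local image file and return it as a data URL.
function fileToDataUrl(filePath) {
  const mimeType = MIME_TYPES[path.extname(filePath).toLowerCase()];
  if (!mimeType) {
    throw new Error(`Unsupported image type: ${filePath}`);
  }
  return toDataUrl(fs.readFileSync(filePath), mimeType);
}

// Hypothetical helper: build the user message as a text part plus an image part.
// imageUrl can be an https URL or a data URL produced by the helpers above.
function buildVisionMessage(prompt, imageUrl) {
  return {
    role: 'user',
    content: [
      { type: 'text', text: prompt },
      { type: 'image_url', image_url: { url: imageUrl } },
    ],
  };
}
```

To use it, pass the result of `buildVisionMessage('Identify the details of this card from the image', fileToDataUrl('./card.png'))` as the user entry in `messages`, where `./card.png` is a placeholder path.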
