Issues when calling the Vision API to get information about an image

I keep getting this error message:

"result": "I'm unable to view images directly from URLs. However, if you can provide specific details about the card, such as any text, numbers, symbols, or descriptions visible on it, I can help you identify it. Information such as the card name, edition, year, or any unique characteristics would be very useful."

This is my index.js file:

const express = require('express');
const OpenAI = require('openai');
const dotenv = require('dotenv');

// Load environment variables from .env file
dotenv.config();

const app = express();

// Initialize OpenAI API
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

async function identifyCardFromImage(imageUrl) {
  try {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        { role: 'system', content: 'You are an expert in identifying trading cards.' },
        { role: 'user', content: `Identify the details of this card from the image URL: ${imageUrl}` },
        { role: 'user', content: `Image URL: ${imageUrl}` },
      ],
    });
    return response.choices[0].message.content;
  } catch (error) {
    console.error('Error during OpenAI API request:', error);
    throw error;
  }
}

app.get('/identify-card', async (req, res) => {
  const imageUrl = req.query.imageUrl; // Expecting imageUrl to be passed as query parameter

  if (!imageUrl) {
    return res.status(400).json({ error: 'imageUrl query parameter is required' });
  }

  try {
    const result = await identifyCardFromImage(imageUrl);
    res.json({ result });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`);
});

Here’s a modified version of your index.js file that includes a step for extracting text from the image before passing it to the OpenAI model:

const express = require('express');
const { OpenAI } = require('openai');
const dotenv = require('dotenv');
const axios = require('axios');

// Load environment variables from .env file
dotenv.config();

const app = express();

// Initialize OpenAI API
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

// Function to extract text from image using an image recognition API
async function extractTextFromImage(imageUrl) {
  try {
    // Example using an OCR API (replace with actual API call and handle accordingly)
    const response = await axios.post('YOUR_OCR_API_URL', { imageUrl });
    return response.data.text; // Extracted text from the image
  } catch (error) {
    console.error('Error during OCR API request:', error);
    throw error;
  }
}

async function identifyCardFromText(cardDetails) {
  try {
    const response = await openai.chat.completions.create({
      model: 'gpt-4',
      messages: [
        { role: 'system', content: 'You are an expert in identifying trading cards.' },
        { role: 'user', content: `Identify the details of this card: ${cardDetails}` },
      ],
    });
    return response.choices[0].message.content;
  } catch (error) {
    console.error('Error during OpenAI API request:', error);
    throw error;
  }
}

app.get('/identify-card', async (req, res) => {
  const imageUrl = req.query.imageUrl; // Expecting imageUrl to be passed as query parameter

  if (!imageUrl) {
    return res.status(400).json({ error: 'imageUrl query parameter is required' });
  }

  try {
    const cardDetails = await extractTextFromImage(imageUrl);
    const result = await identifyCardFromText(cardDetails);
    res.json({ result });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server is running on port ${PORT}`);
});

Key Points:

  1. OCR API Integration: Replace the placeholder YOUR_OCR_API_URL with an actual OCR API endpoint.
  2. Error Handling: Ensure that both the OCR and OpenAI API calls have proper error handling.
  3. API Keys: Securely manage and store API keys, and ensure they are loaded correctly from your environment.

This approach should allow you to analyze the image and extract the necessary textual information for further processing.
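On the API keys point, here is a minimal sketch of failing fast when a key did not load. The helper name `requireEnv` is hypothetical (it is not part of dotenv or the OpenAI SDK); it only reads from `process.env`, or from an object passed in for testing.

```javascript
// Hypothetical helper: return the named environment variable, or throw
// a descriptive error if it is missing or blank.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Called after `dotenv.config()`, something like `new OpenAI({ apiKey: requireEnv('OPENAI_API_KEY') })` would then stop the server at startup instead of failing on the first request.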

I have not tested this, but it should point you in the right direction.


Thanks for your input, much appreciated, but I'm looking for a way to have it analyze the image as an image, not via text or by matching it using OCR. When I ask GPT-4o directly, I upload the screenshot and it sends back a response.

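Just update the format of your message to the one used for vision: the user message's content becomes an array with a text part and an image_url part, instead of a string with the URL pasted into it.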

async function identifyCardFromImage(imageUrl) {
  try {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        { role: 'system', content: 'You are an expert in identifying trading cards.' },
        {
          role: 'user',
          content: [
            // Text part: the instruction for the model
            { type: 'text', text: 'Identify the details of this card from the image' },
            // Image part: the model fetches and inspects the image itself
            { type: 'image_url', image_url: { url: imageUrl } },
          ],
        },
      ],
    });
    return response.choices[0].message.content;
  } catch (error) {
    console.error('Error during OpenAI API request:', error);
    throw error;
  }
}
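
If the screenshot is a local file rather than a publicly reachable URL, the same `image_url` field can also carry a base64 data URL. Below is a rough sketch of that, assuming the image bytes are already in a Node `Buffer`. The helper names `toDataUrl` and `buildVisionMessages` are made up for illustration and are not part of the OpenAI SDK.

```javascript
// Hypothetical helper: encode raw image bytes as a base64 data URL.
function toDataUrl(imageBuffer, mimeType = 'image/png') {
  return `data:${mimeType};base64,${imageBuffer.toString('base64')}`;
}

// Hypothetical helper: build the messages array in the vision format,
// with one text part and one image_url part in the user message.
function buildVisionMessages(imageUrl, prompt = 'Identify the details of this card from the image') {
  return [
    { role: 'system', content: 'You are an expert in identifying trading cards.' },
    {
      role: 'user',
      content: [
        { type: 'text', text: prompt },
        { type: 'image_url', image_url: { url: imageUrl } },
      ],
    },
  ];
}
```

For example, `buildVisionMessages(toDataUrl(fs.readFileSync('card.png')))` could be passed as `messages` to `openai.chat.completions.create`, where `card.png` is a placeholder path and `fs` is Node's built-in file system module.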