How do I give a system prompt with image input in openai.chat.completions.create?

How can I add a prompt instructing the model not to deviate from my use case and not to answer unrelated questions?

The API docs only give an example for the default use case:

https://platform.openai.com/docs/api-reference/chat/create
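
That example is text-only. A minimal sketch of that pattern, assuming the same client setup and model as in my function below:

import OpenAI from "openai";

const openAIClient = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function answerTextOnly(userQuestion: string) {
  const response = await openAIClient.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "Don't answer questions that are not related to football!" },
      { role: "user", content: userQuestion },
    ],
    max_tokens: 300,
  });
  return response.choices[0].message.content;
}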

However, I use image input, and in testing the model seems to follow general instructions, but when I tell it not to answer about a specific topic in the question or the image, it fails to follow that instruction. It looks like the system prompt does not have access to the question or the image.

This is my function:

// New utility function to handle a user question with multiple images.
// The message and content-part types come from the openai v4 SDK:
import type { ChatCompletionContentPart, ChatCompletionMessageParam } from "openai/resources/chat/completions";

async function answerUserQuestionWithImages(imageUrls: string[], userQuestion: string, roomId: string) {
  const io = getIO();

  try {
    const content: ChatCompletionContentPart[] = [
      { type: "text", text: userQuestion },
      ...imageUrls.map(url => ({ 
        type: "image_url", 
        image_url: { url: url, detail: "auto" } 
      } as ChatCompletionContentPart))
    ];

    const messages: ChatCompletionMessageParam[] = [
        {
          role: "system",
          content: "You are a multilingual helpful and friendly assistant. Don't answer question that are not related to football!"
        },
        {
          role: "user",
          content: content,
        }
      ];

    const response = await openAIClient.chat.completions.create({
      model: "gpt-4o",
      messages: messages,
      max_tokens: 300,
      stream: true, // Enable streaming
    });

    let answer = '';
    for await (const token of response) {
      if (token.choices[0]?.delta?.content) {
        answer += token.choices[0].delta.content;
        io.to(roomId).emit("newToken", token.choices[0].delta.content);
      }
    }

    return answer;
  } catch (error) {
    console.error('Error getting answer with images:', error);
    io.to(roomId).emit("newToken", "Error getting answer with images.");
    return "Error getting answer with images.";
  }
}

When I give it an image of a panda and ask something like “Explain what this panda is doing in the image,” it answers anyway. However, if the system prompt is “Answer each question with OK,” it does answer each question with OK, so the system message itself is being read.

It is probably the negative phrasing of the prompt:

Don’t answer question that are not related to football!

Try this instead:

You will only reply to inquiries related to football.
When the user inquires about something unrelated to football, respond with a polite apology.
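
In the messages array, that would look something like this (a minimal sketch; only the system content changes, the rest of your function stays the same):

    const messages: ChatCompletionMessageParam[] = [
        {
          role: "system",
          content:
            "You will only reply to inquiries related to football. " +
            "When the user inquires about something unrelated to football, respond with a polite apology."
        },
        {
          role: "user",
          content: content, // the same text + image_url parts as before
        }
      ];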

Unfortunately, this did not work.

The more I investigate, the more it seems the system prompt does not have access to the question.

For example, I simplified it for testing:

  try {
    const response = await openAIClient.chat.completions.create({
      model: "gpt-4o",
      messages: [
        {
          role: "system",
          content: "You will only reply to inquiries related to football. When the user inquires about something unrelated to football, respond with a polite apology."
        },
        {
          role: "user",
          content: [
            { type: "text", text: userQuestion },
            {
              type: "image_url",
              image_url: {
                url: "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
              },
            },
          ],
        },
      ],
      max_tokens: 300,
      stream: true, // Enable streaming
    });
    // ...the rest (streaming loop, return and error handling) is the same as in the full function above

Then I asked:

How tall you think is the green grass in the image?

and it answered:

“The green grass appears to be about knee to waist height, which typically ranges from 1 to 3 feet (30 to 90 centimeters) tall. However, without a direct reference for scale, this is only an estimation.”

This seems to work, though, at least on the first couple of tests:

async function answerUserQuestionWithImages(imageUrls: string[], userQuestion: string, roomId: string) {
    const io = getIO();
  
    try {
      const instructions = "You are a multilingual helpful and friendly assistant. Don't answer question that are not related to football! The user question is: ";
      const modifiedQuestion = instructions + userQuestion;
  
      const content: ChatCompletionContentPart[] = [
        { type: "text", text: modifiedQuestion },
        ...imageUrls.map(url => ({ 
          type: "image_url", 
          image_url: { url: url, detail: "auto" } 
        } as ChatCompletionContentPart))
      ];
  
      const messages: ChatCompletionMessageParam[] = [
        {
          role: "user",
          content: content,
        }
      ];
  
      console.log('Constructed messages:', JSON.stringify(messages, null, 2));
  
      const response = await openAIClient.chat.completions.create({
        model: "gpt-4o",
        messages: messages,
        max_tokens: 300,
        stream: true, // Enable streaming
      });
  
      let answer = '';
      for await (const token of response) {
        if (token.choices[0]?.delta?.content) {
          answer += token.choices[0].delta.content;
          io.to(roomId).emit("newToken", token.choices[0].delta.content);
        }
      }
  
      return answer;
    } catch (error) {
      console.error('Error getting answer with images:', error);
      io.to(roomId).emit("newToken", "Error getting answer with images.");
      return "Error getting answer with images.";
    }
  }
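
A variation I have not tested as much, and which I am assuming the content-parts format allows (a user message can carry several text parts alongside the images), is to send the instruction and the question as separate text parts instead of concatenating strings:

      // Untested variation: instruction as its own text part, ahead of the question and images
      const content: ChatCompletionContentPart[] = [
        { type: "text", text: "You are a multilingual helpful and friendly assistant. Don't answer question that are not related to football!" },
        { type: "text", text: userQuestion },
        ...imageUrls.map(url => ({ 
          type: "image_url", 
          image_url: { url, detail: "auto" } 
        } as ChatCompletionContentPart))
      ];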

This also works and is neater:

async function answerUserQuestionWithImages(imageUrls: string[], userQuestion: string, roomId: string) {
  const io = getIO();
  console.log('URLs:', imageUrls);

  try {
    const content: ChatCompletionContentPart[] = [
      { type: "text", text: userQuestion },
      ...imageUrls.map(url => ({ 
        type: "image_url", 
        image_url: { url: url, detail: "auto" } 
      } as ChatCompletionContentPart))
    ];

    const messages: ChatCompletionMessageParam[] = [
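        // The instruction goes in a plain "user" message instead of "system";
        // in my tests the vision input seems to respect it this way.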
        {
          role: "user",
          content: "You are a multilingual helpful and friendly assistant. Don't answer question that are not related to football!"
        },
        {
          role: "user",
          content: content,
        }
      ];

    console.log('Constructed messages:', JSON.stringify(messages, null, 2));

    const response = await openAIClient.chat.completions.create({
      model: "gpt-4o",
      messages: messages,
      max_tokens: 300,
      stream: true, // Enable streaming
    });

    let answer = '';
    for await (const token of response) {
      if (token.choices[0]?.delta?.content) {
        answer += token.choices[0].delta.content;
        io.to(roomId).emit("newToken", token.choices[0].delta.content);
      }
    }

    return answer;
  } catch (error) {
    console.error('Error getting answer with images:', error);
    io.to(roomId).emit("newToken", "Error getting answer with images.");
    return "Error getting answer with images.";
  }
}
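
For completeness, this is roughly how I call it from an async handler; the URL, question and room id below are just placeholders for illustration:

// Hypothetical call site, just to show the argument shapes
const answer = await answerUserQuestionWithImages(
  ["https://example.com/panda.jpg"],        // placeholder image URL
  "What is this panda doing in the image?", // the user's question
  "room-123"                                // socket.io room that receives the streamed tokens
);
console.log(answer);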