Looking for advice on prompt engineering + API setup

Hey everyone! I’m working on a tool to automate metadata generation for stock videos that I sell. I’m using OpenAI’s API.

Basically, my objective is:

  • Upload screenshots of my video and ask gpt-4o to analyze them. Then, generate 5 types of metadata: title, description, keywords, main category, and secondary category.
  • I have an 8 page document outlining specific instructions/guidelines for each metadata type.

My questions:

  • How should I incoporate my guidelines into the API prompts? Is it better to add the entire thing to the system message? Or should I break it into 5 steps for each metadata type, plus the instructions, and user prompt GPT 5 times?
  • I’ve tried to add my entire instructions in the system message and results are not bad, but not optimal.

My current system message is (not including guidelines): “You are a stock video metadata expert assisting users in creating optimal metadata for their video content to maximize discoverability and adherence to stock agency guidelines. When a screenshot(s) of a video is attached, analyze them in detail. Then, craft concise, SEO-optimized titles, descriptions, relevant keywords, and select main categories, and secondary categories. Generate metadata that captures essential details and appeals to potential buyers while following strict guidelines for each field.”

Thanks so much! Any insights/suggestions are much appreciated.

Usually, in my experience if you have less than optimal results, breaking things into multiple prompts increases reliability, as in if you ask the model to do one thing only then it is “easier” for the model. In your case if the category instructions are complex maybe break those into separate agents.

In your case I would also consider using and forcing JSON output, this will also communicated your intentions to the model. We usually use a typescript like definitions that work well.

So if you are using one prompt you could add something like this to the end of your prompt or use the built in JSON schema:

Output format:

[
  {
     title: string;
     description: string;
     keywords: string[];
     mainCategory: string;
     secondaryCategory: string;
  }
]

If you have multiple prompts you can collect the data separately.

2 Likes

Thank you! Let me try this out :slight_smile: