Unexpected output from a basic prompt

justlife · October 11, 2024, 7:42pm

Hi everyone,

I’m currently working on a project that involves using GPT-4 to analyze search terms and assign them to relevant ad groups and campaigns. I’ve provided a very clear knowledge base of available campaign and ad group names to the model, and in my prompt, I explicitly instruct it to:

Only use campaign and ad group names from the provided knowledge base.
Not create, infer, or modify any campaign or ad group names under any circumstances.

However, despite these instructions, GPT-4 seems to be generating its own campaign names that are not part of the knowledge base. These generated names don’t exist in the JSON input files or any other data that I’ve fed into the model.

I’ve gone through multiple iterations to make sure the prompt explicitly restricts the model to only use the data provided in the knowledge base. Here’s the relevant part of the prompt:

“The chosen campaign and ad group should be from the knowledge base. Do not create, infer, or modify any campaign or ad group names.”

The campaign name in the output does not exist anywhere in the knowledge base.

Has anyone else experienced this issue with GPT-4 generating names or entities outside of the provided context? Is there any known solution or adjustment I could make to ensure the model only uses the entities it is given?

Any advice or suggestions would be greatly appreciated!

PaulBellow · October 11, 2024, 7:44pm

Heya.

Sorry to hear you’re having problems.

What model are you using specifically? Just GPT-4?

How long is your system prompt? Can you share the entire prompt or give us an idea of its complexity?

Are you giving it one or two examples of what you want?

My “gut” tells me your system prompt is likely long and a bit confusing maybe.

Yes, it can hallucinate, especially if the instructions aren’t clear enough for it or are overwhelming with too many minor details.

justlife · October 11, 2024, 7:51pm

system_prompt_template = """Analyze the search term and evaluate whether it is assigned to the most relevant ad group and campaign as a cluster among knowledge base(don’t generate any adgroup of campaign name, use from knowledge base), based on the theme. The goal is to ensure that each search term is placed in an ad group and campaign where its thematic relevance is maximized, regardless of performance. Check and use only knowledge base to find the best adgroup and campaign for the search term.

Adgroups and Campaigns are a cluster, when an adgroup is under a campaign, you can’t break this relationship.

Guidelines:

Theme Priority: Thematic relevance is the most critical factor. If a more thematically suitable ad group exists, choose that ad group even if the search term’s current performance in the original ad group is acceptable. Always prioritize thematic consistency over performance.

Location Consideration: If the search term contains a location-based keyword (e.g., city or region), check whether the current ad group or any other ad group includes that location in its name. If the current ad group’s location doesn’t match, suggest moving the search term to a more thematically and location-relevant ad group.

Performance Consideration: If the search term performs poorly (e.g., low CTR, high cost per conversion) in its current ad group, but a more thematically and/or location-relevant ad group exists, suggest moving it to the new ad group while providing a warning that performance may need monitoring.

Exclude Irrelevant Terms: If the search term does not fit the theme or location of any ad group in the account, it should be excluded as a negative keyword. If their performance is significant, warn user. Keep their campaign and adgroup name as before.

Make sure that if the search term’s current ad group is the same as the chosen ad group, the action must be "add_to_ad_group" instead of "add_to_new_ad_group." The action "add_to_new_ad_group" should only be used if the chosen ad group is different from the current one.

When an adgroup is chosen, its campaign should be chosen as well as all adgroups are tied to their campaigns. Don’t pick adgroup without checking the campaign.

IMPORTANT: chosen campaign and adgroup should be from knowledge base. (do not generate any adgroup of campaign name, use from knowledge base).
Do not create, infer, or modify any campaign or ad group names.

Provide the decision in JSON format as:
{{
"search_term": "{search_term}",
"action": "add_to_ad_group" | "add_to_new_ad_group" | "exclude",
"campaign": "chosen_campaign",
"ad_group_name": "chosen_ad_group",
"reason": "Provide a concise explanation of why this action, campaign and ad group is recommended",
"previous_ad_group": "{current_ad_group}",
"previous_campaign": "{current_campaign}"
}}

Only provide the JSON object above without any additional text or code blocks."""

Here is how I gather all the input for the system and user prompts:

import json
import logging

import pandas as pd

# Note: estimate_tokens() and safe_str() are helpers defined elsewhere in the script (not shown here).

def read_ad_group_mapping_json(file_path):
    """Read the AdGroupMapping JSON file and return a dictionary of mappings."""
    try:
        with open(file_path, 'r', encoding='utf-8') as file:
            data = json.load(file)

        # data should be a list of dictionaries with 'Campaign' and 'Ad group' keys
        all_campaigns_and_adgroups = {}
        for entry in data:
            campaign = entry.get('Campaign', '').strip()
            ad_group = entry.get('Ad group', '').strip()
            if campaign and ad_group:
                if campaign not in all_campaigns_and_adgroups:
                    all_campaigns_and_adgroups[campaign] = set()
                all_campaigns_and_adgroups[campaign].add(ad_group)

        # Convert sets to lists for JSON serialization
        for campaign in all_campaigns_and_adgroups:
            all_campaigns_and_adgroups[campaign] = list(all_campaigns_and_adgroups[campaign])

        # Estimate token count
        df = pd.DataFrame([(k, v) for k, vs in all_campaigns_and_adgroups.items() for v in vs],
                          columns=['Campaign', 'Ad group'])
        token_count = estimate_tokens(df.to_json())
        logging.info(f"Estimated token count for Ad Group Mapping file: {token_count}")

        return all_campaigns_and_adgroups

    except Exception as e:
        logging.error(f"Error reading Ad Group Mapping JSON file: {str(e)}")
        return None

def prepare_tasks(df, system_prompt, all_campaigns_and_adgroups):
    required_columns = ['Search term', 'Campaign', 'Ad group']
    optional_columns = ['Impressions', 'Clicks', 'Cost', 'Conversions', 'Conv. value']

    # Check if all required columns are present
    if not all(col in df.columns for col in required_columns):
        missing = [col for col in required_columns if col not in df.columns]
        raise ValueError(f"Missing required columns: {', '.join(missing)}")

    # Prepare the knowledge base string
    knowledge_base = "Available Campaigns and Ad Groups:\n"
    for campaign, ad_groups in all_campaigns_and_adgroups.items():
        knowledge_base += f"Campaign: {campaign}\n"
        knowledge_base += f"Ad Groups: {', '.join(ad_groups)}\n\n"

    # Combine system prompt with knowledge base
    full_system_prompt = f"{system_prompt}\n\nKnowledge Base:\n{knowledge_base}"

    tasks = []
    total_input_tokens = 0
    for index, row in df.iterrows():
        current_campaign = safe_str(row['Campaign'])
        current_ad_group = safe_str(row['Ad group'])

        user_content = f"Search Term: {row['Search term']}\n"
        user_content += f"Current Campaign: {current_campaign}\n"
        user_content += f"Current Ad Group: {current_ad_group}\n"

        # Include optional metrics if they exist
        metrics = ['Impressions', 'Clicks', 'Cost', 'Conversions', 'Conversion Rate', 'CTR', 'Avg. CPC', 'Cost/Conversion']
        for metric in metrics:
            if metric in row:
                user_content += f"{metric}: {row.get(metric, 0)}\n"

        # Estimate input tokens
        input_tokens = estimate_tokens(full_system_prompt) + estimate_tokens(user_content)
        total_input_tokens += input_tokens

        task = {
            "custom_id": f"task-{index}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "gpt-4o-mini",  # Specify the model here
                "messages": [
                    {"role": "system", "content": full_system_prompt},
                    {"role": "user", "content": user_content.strip()}
                ]
            }
        }
        tasks.append((json.dumps(task), input_tokens))

    avg_input_tokens = total_input_tokens / len(tasks) if tasks else 0
    logging.info(f"Average input tokens per search term: {avg_input_tokens:.2f}")

    return tasks

PaulBellow · October 11, 2024, 7:57pm

Gave it to 4o as I’m a bit pressed for time, but I think I was on the right track…

It seems like the system prompt might be overly complex, leading to misunderstandings by the LLM. Specifically, the language used, the multiple layers of logic (theme, location, performance, exclusions), and the structure of the JSON output might confuse the model. Here are a few potential issues and ways to improve clarity:

Issues with the Current System Prompt:

Too Many Rules: The system prompt includes multiple instructions for how the LLM should handle theme, location, and performance considerations. This can overwhelm the model and cause it to mix up the priorities.

Complex Instructions: The wording, such as “adgroup and campaign as a cluster” and “can’t break this relationship,” may not be sufficiently clear to the model.

JSON Output Structure: Ensuring the correct JSON format while also processing all the logic might be making the LLM struggle with the reasoning and output generation.

Suggested Simplifications:

Clarify Priorities:

Make the priority system clearer to avoid confusion. Instead of mixing theme, location, and performance together, break these down into simple checks. For example:

First: Check thematic relevance.

Second: If the search term has a location, match with the location-based ad group.

Third: Only if performance is mentioned, address it with a warning.

Remove Unnecessary Complexity:

Simplify the concept of ad group and campaign as a “cluster.” The LLM needs to understand that they are tied together, but it may not understand “cluster” in this context. Replace the term with something simpler like “linked” or “tied together.”

Simplify Output Instructions:

Ensure the JSON structure is clearly highlighted with very simple instructions, ensuring the LLM outputs exactly what you need. You can further emphasize that the JSON object should have no extra text.

Revised System Prompt Suggestion:

Here’s a simplified version of your system prompt:

Analyze the search term and evaluate whether it is assigned to the most thematically relevant ad group and campaign based on the knowledge base provided. Thematic relevance is the top priority.

Guidelines:
- **Theme Priority**: Prioritize thematic relevance. If a more suitable ad group exists based on the theme, suggest moving the search term to that ad group and its campaign, even if the current ad group performs well.
  
- **Location Relevance**: If the search term includes a location-based keyword (city, region), match it to an ad group that also includes that location. If the current ad group does not match the location, suggest a move to the most thematically and location-relevant ad group.

- **Performance Warning**: If the search term performs poorly in its current ad group but has a better thematic or location match elsewhere, suggest moving it, but provide a warning that performance may need to be monitored.

- **Exclusion**: If no suitable ad group exists for the search term, mark it as an exclusion. If performance is significant, warn the user but keep it in the same campaign and ad group as before.

Actions:
- If the search term's current ad group matches the best choice, use the action "add_to_ad_group."
- If the search term belongs to a different ad group, use "add_to_new_ad_group."
- If the search term is irrelevant, use "exclude."

Ensure that the chosen ad group and campaign come from the knowledge base. Do not generate new names for ad groups or campaigns. Ad groups and campaigns are linked, so always check the campaign when assigning an ad group.

Provide the decision in the following JSON format:

{
  "search_term": "{search_term}",
  "action": "add_to_ad_group" | "add_to_new_ad_group" | "exclude",
  "campaign": "chosen_campaign",
  "ad_group_name": "chosen_ad_group",
  "reason": "A brief explanation for why this action, campaign, and ad group were chosen",
  "previous_ad_group": "{current_ad_group}",
  "previous_campaign": "{current_campaign}"
}

Only provide the JSON object above without any additional text or explanation.

Knowledge Base:
[Insert the knowledge base of campaigns and ad groups here.]

Benefits of This Revision:

Clearer Priorities: The LLM should now understand the focus is on thematic relevance, followed by location, and then performance as a secondary concern.

Simplified Instructions: By removing complex phrasing like “cluster” and focusing on simple actions (add_to_ad_group, add_to_new_ad_group, exclude), the LLM should have an easier time generating the correct output.

Clean JSON Output: The revised prompt makes it explicit that the LLM should output only the JSON without any additional commentary or explanation.

Can you test this simplified prompt to see if it improves results?

If not, you might want to try at least a one-shot prompt, i.e., give it one example of what you want. It costs more tokens, BUT the reliability is usually a lot higher in my experience…
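A minimal sketch of what the one-shot idea could look like with the message list from the code posted earlier in the thread. The example row, campaign names, and the helper name `build_one_shot_messages` are all made up for illustration; the example answer should be replaced with a real pair from the knowledge base.

```python
import json

# Hypothetical worked example; swap in a real row and a real
# campaign/ad group pair taken from your own knowledge base.
EXAMPLE_USER = (
    "Search Term: plumber berlin\n"
    "Current Campaign: Plumbing - Generic\n"
    "Current Ad Group: Plumber"
)
EXAMPLE_ANSWER = {
    "search_term": "plumber berlin",
    "action": "add_to_new_ad_group",
    "campaign": "Plumbing - Berlin",
    "ad_group_name": "Plumber Berlin",
    "reason": "The term names Berlin and a Berlin ad group exists.",
    "previous_ad_group": "Plumber",
    "previous_campaign": "Plumbing - Generic",
}


def build_one_shot_messages(full_system_prompt, user_content):
    """Place one solved example between the system prompt and the real query."""
    return [
        {"role": "system", "content": full_system_prompt},
        {"role": "user", "content": EXAMPLE_USER},
        {"role": "assistant", "content": json.dumps(EXAMPLE_ANSWER)},
        {"role": "user", "content": user_content.strip()},
    ]
```

The returned list would take the place of the two-element `messages` list in each task body. The example answer is serialized with `json.dumps` so the model sees exactly the output shape it is expected to produce.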

Sorry for AI generated, but I hope it helps!

thinktank · October 11, 2024, 9:02pm

Hiya,

Firstly, I agree. I’ve been having challenges with GPT-4o modifying things I don’t want it to modify, even after explicitly asking it (several times) not to… Kinda reminds me of when “ChatGPT got lazy” earlier this year, then became too chatty. AI growing pains.

Prompt

Otherwise I agree with Paul’s instinct, I think your prompt is long and confusing, with several opportunities to truncate.

I think this step may be a good place for function calling, where you can define each ad_group specifically.
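One way this could look, as a rough sketch rather than a tested recipe: build a tool definition for the Chat Completions `tools` parameter whose argument is an enum of every existing campaign/ad group pair, so the allowed names are spelled out in the schema instead of only in prose. The function name `assign_search_term`, the `target` field, and the " > " separator are assumptions made for this example.

```python
def build_assignment_tool(all_campaigns_and_adgroups):
    """Build a tool definition whose 'target' enum lists every valid pair."""
    # Encode each pair as one string so a campaign cannot be combined
    # with an ad group that belongs to a different campaign.
    pairs = sorted(
        f"{campaign} > {ad_group}"
        for campaign, ad_groups in all_campaigns_and_adgroups.items()
        for ad_group in ad_groups
    )
    return {
        "type": "function",
        "function": {
            "name": "assign_search_term",  # hypothetical name
            "description": "Assign a search term to one existing campaign/ad group pair.",
            "parameters": {
                "type": "object",
                "properties": {
                    "action": {
                        "type": "string",
                        "enum": ["add_to_ad_group", "add_to_new_ad_group", "exclude"],
                    },
                    "target": {"type": "string", "enum": pairs},
                    "reason": {"type": "string"},
                },
                "required": ["action", "target", "reason"],
            },
        },
    }


# Made-up knowledge base in the same shape read_ad_group_mapping_json() returns.
kb = {"Plumbing - Berlin": ["Plumber Berlin", "Emergency Plumber Berlin"]}
tool = build_assignment_tool(kb)
```

In the request body this would be passed as `"tools": [tool]`, optionally with `"tool_choice": {"type": "function", "function": {"name": "assign_search_term"}}` to force the call. As far as I know, without strict mode the enum is a strong hint rather than a hard guarantee, so checking the returned value is still worthwhile.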

Down here with your JSON output, it’s a cut-and-dried case for response_format.
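A sketch of how `response_format` might be used here, assuming a model that supports Structured Outputs (`"type": "json_schema"` with `"strict": True`). The schema name `search_term_decision` and the helper name are invented for the example; the field names come from the JSON in the original prompt.

```python
def build_response_format(all_campaigns_and_adgroups):
    """Build a strict JSON-schema response_format from the knowledge base."""
    campaigns = sorted(all_campaigns_and_adgroups)
    ad_groups = sorted(
        {g for groups in all_campaigns_and_adgroups.values() for g in groups}
    )
    properties = {
        "search_term": {"type": "string"},
        "action": {
            "type": "string",
            "enum": ["add_to_ad_group", "add_to_new_ad_group", "exclude"],
        },
        "campaign": {"type": "string", "enum": campaigns},
        "ad_group_name": {"type": "string", "enum": ad_groups},
        "reason": {"type": "string"},
        "previous_ad_group": {"type": "string"},
        "previous_campaign": {"type": "string"},
    }
    schema = {
        "type": "object",
        "properties": properties,
        # Strict mode expects every property to be listed as required
        # and additionalProperties to be false.
        "required": list(properties),
        "additionalProperties": False,
    }
    return {
        "type": "json_schema",
        "json_schema": {
            "name": "search_term_decision",  # hypothetical name
            "strict": True,
            "schema": schema,
        },
    }
```

The result would go into the task body as `"response_format": build_response_format(all_campaigns_and_adgroups)`. Two caveats: separate enums for campaign and ad group do not by themselves keep the pair linked, and very large knowledge bases may run into the documented schema size limits, so it is worth checking those before relying on this.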

You might also look into turning down the Temperature to reduce creativity.
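For the batch tasks posted earlier in the thread, that would probably just mean adding a `temperature` field to each request body. A small sketch, where `make_task` is a made-up helper that mirrors the task dictionary from `prepare_tasks`:

```python
import json


def make_task(index, full_system_prompt, user_content, temperature=0):
    """Build one batch request line with an explicit sampling temperature."""
    task = {
        "custom_id": f"task-{index}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            # Lower values make sampling less random; the API accepts 0 to 2.
            "temperature": temperature,
            "messages": [
                {"role": "system", "content": full_system_prompt},
                {"role": "user", "content": user_content.strip()},
            ],
        },
    }
    return json.dumps(task)
```

A low temperature should make the output more repeatable, but it does not by itself stop the model from inventing a name, so it is best combined with the other suggestions.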

If you add these along with Paul’s suggested clearer prompt, you’ll probably see some vast differences.

Check out the new evaluations and distillation methods as well to stabilize better results. I just started using the new metadata options myself. It’s pretty spiffy.

justlife · October 12, 2024, 6:27am

I really couldn’t imagine how to use function calling, can you elaborate?

I thought I could fine-tune it with some examples, but I think there is a simple error on my side that I’m missing.

thinktank · October 12, 2024, 5:10pm

Sure, I haven’t used Function Calling yet, so all I have for you is theory, but I’ll give it a go.

Function Calling defines keywords and concepts for the AI to identify, then it can call other functions in your program once they’re activated.

So if you’re having your users use that language, it could call a function leading to the specific data and take away the guesswork.
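To make that a bit more concrete: with the Chat Completions API the model does not run anything itself. It returns the function name plus a JSON string of arguments, and your own program decides what to do with them. That is a natural place to reject any name that is not in the knowledge base. A sketch, where `check_decision` is a hypothetical helper and the field names follow the JSON format from the original prompt:

```python
import json


def check_decision(arguments_json, all_campaigns_and_adgroups):
    """Return (ok, message) for the JSON arguments returned by the model."""
    try:
        decision = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON: {exc}"
    if not isinstance(decision, dict):
        return False, "expected a JSON object"

    campaign = decision.get("campaign")
    ad_group = decision.get("ad_group_name")
    # The campaign must exist in the knowledge base ...
    if not isinstance(campaign, str) or campaign not in all_campaigns_and_adgroups:
        return False, f"unknown campaign: {campaign!r}"
    # ... and the ad group must sit under that exact campaign.
    if ad_group not in all_campaigns_and_adgroups[campaign]:
        return False, f"ad group {ad_group!r} is not under campaign {campaign!r}"
    return True, "ok"


# Made-up knowledge base and model output for illustration.
kb = {"Plumbing - Berlin": ["Plumber Berlin"]}
good = '{"campaign": "Plumbing - Berlin", "ad_group_name": "Plumber Berlin"}'
bad = '{"campaign": "Plumbing - Munich", "ad_group_name": "Plumber Munich"}'
print(check_decision(good, kb))
print(check_decision(bad, kb))
```

Rows that fail the check could be retried or flagged for manual review instead of being written to the output.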

Fine Tuning should be your last step after all else fails.
