Why can't the chat API understand the query and return the right number of Boolean responses?

def select_relevant_news_df2(news_articles, topics, n):
    # Note: n (the number of articles) is accepted but never used in the prompt.
    instructions = (
        'Your task is to examine a list of News and return a list of boolean values that indicate which of the News are in scope of a list of topics. '
        'Return a list of True or False values that indicate the relevance of the News. Make sure you match the total number of news passed.'
    )
    # Note: "/n" is sent as literal text; a real newline would be "\n".
    task = f"{news_articles} /n {topics}?"

    sample = [
        {"role": "user", "content": f"[QCI announces new accreditation standards, Notification on upcoming accreditation seminars, Government circular on accreditation procedures, New study reveals impact of accreditations on industry, QCI role in improving quality standards] /n {topics}?"},
        {"role": "assistant", "content": "[True, True, True, True, True]"},
        {"role": "user", "content": f"[Accreditation-related workshop organized by private organization, Government notification on quality standards in healthcare sector, News article on global accreditation failures, Celebrity endorsements for an apparel brand, Latest movie releases] /n {topics}?"},
        {"role": "assistant", "content": "[True, True, True, False, False]"},
        {"role": "user", "content": f"[Notification from QCI on revised accreditation criteria, Accreditation success stories in manufacturing sector, Governmental policies boosting accreditation, Annual report on sports, Technology gadgets review] /n {topics}?"},
        {"role": "assistant", "content": "[True, True, True, False, False]"},
        {"role": "user", "content": f"[Upcoming QCI event on accreditation assessment techniques, Government notification on accrediting educational institutions, News coverage on global accreditation standards, Fashion trends for the season, Travel destination guide] /n {topics}?"},
        {"role": "assistant", "content": "[True, True, True, False, False]"}]

    return instructions, task, sample

# Example usage
relevant_topics = "[Quality Council of India, Accreditation, Ministry Notifications, Consultation Papers, Indian Government policies, Indian Ministry, Quality Council, Press Information Bureau, Government Notifications, Health, Education, Indian Sectors, Conferences, Summit, Industry, Environment and Social Governance, Sustainability, Coal, Swachh Bharat, Monuments, Assessment, Market surveillance]"

import ast

titles = list(Gdeltdf['title'])
instructions, task, sample = select_relevant_news_df2(titles, relevant_topics, len(titles))
relevance = openai_request(instructions, task, sample)
relevance_list = ast.literal_eval(relevance)  # safer than eval() on model output
print("Number of news articles:", len(titles))
print("Number of boolean checks done:", len(relevance_list))

Output:
Number of news articles: 75
Number of boolean checks done: 95

I wrote the function above to check news relevance. Whenever I run it, the number of Boolean values returned by the gpt-4 API is either higher or lower than the number of news articles I passed in.
Can someone please help me with this? :slight_smile:

Your description of the task is as likely to confuse a human as it is to confuse an AI.

I can at least get you the right answers where the responses match for a single query.

Add this to your instructions:

// output format

  • the sole AI output is a Python programming language list format.
  • the Python list contains dictionaries with a single entry.
  • the key of each dictionary is the next news article description to be answered about by AI from an input list.
  • the value of the dictionary entry is boolean, answering True or False whether the news article description text can be described and categorized by ANY of a list of provided topics.

// example output
{"Upcoming QCI event on accreditation assessment techniques": False},
{"Government notification on accrediting educational institutions": True},

Then write code to parse the dictionaries.
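
A minimal sketch of that parsing step, assuming the model replies with straight-quoted dictionaries in the format described above. The name `parse_relevance` is hypothetical, not part of any library. Keying each answer by its title means a skipped or extra entry does not shift every answer after it.

```python
import ast

def parse_relevance(raw_output, titles):
    """Return one value per title: True/False from the model, or None if the title is missing."""
    text = raw_output.strip()
    # Accept either a full list "[{...}, {...}]" or bare "{...}, {...}," lines.
    if not text.startswith("["):
        text = "[" + text.rstrip(",") + "]"
    entries = ast.literal_eval(text)  # parses Python literals only, never executes code
    verdicts = {}
    for entry in entries:
        for title, flag in entry.items():
            verdicts[title] = bool(flag)
    # Look answers up by title, so the result always has len(titles) items.
    return [verdicts.get(title) for title in titles]
```

A `None` in the result marks an article the model did not answer (or whose title it reworded), so those can be re-sent instead of being guessed.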

If the output that I document is not what you wanted, then your task surely can’t be understood by AI as described.

Clear presentation will ensure the AI doesn’t present nonsense output based on incomprehensible input.


Thanks for the help. Though it did not make much of a difference, I understood where I am going wrong.

Will be trying out more ways to get this done. :+1:

The issue keeps recurring, so the only solution I could find is to match the lengths by either truncating or padding.

relevance_list = relevance_list[:len(list(Gdeltdf['title']))] + [False] * (len(list(Gdeltdf['title'])) - len(relevance_list))
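
The same truncate-or-pad step can be written as a small helper, sketched here under the name `align_length` (a hypothetical name). Note that this only fixes the length: if the model skipped or repeated an item, the remaining values may no longer line up with the articles they were meant for.

```python
def align_length(flags, expected_len, fill=False):
    """Truncate or pad a list of flags so it has exactly expected_len items."""
    trimmed = list(flags)[:expected_len]
    # Pad with the fill value when the model returned too few answers.
    return trimmed + [fill] * (expected_len - len(trimmed))
```

For example, `align_length(relevance_list, len(Gdeltdf['title']))` gives the same result as the one-liner above.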