GPT4-Advanced Data analysis for comparing 2 datasets

I need a prompt that compares two sheets, “LOVs” and “Attributes Data”, and lists the values in “LOVs” that are mentioned in the “Attributes Data” sheet, either in the exact same words or with the same context/meaning.

I created this prompt:
Analyse the following Excel file. It has 2 sheets, Attributes Data and LOVs. Analyse the text in cell A2 of “Attributes Data” sheet. Also analyse the text in cell B2 of the “LOVs” sheet. There are several LOVs in cell B2 that come up in the Attributes Data. I need you to find the LOVs that are mentioned in cell “A2” of the Attributes data sheet either in the exact same words or in the same context/meaning. Then list those LOVs in cell “C2” of “Attributes Data”. Save the updated Excel file for me once you’ve added the required LOVs.

This prompt works for picking up LOVs that are repeated word for word in “Attributes Data”, but it does not detect LOVs that are expressed with the same context/meaning. E.g., if “Attributes Data” says “Use this scrub daily”, the AI should pick up the LOV called Daily Maintenance/Daily Use, because it means the same thing and is used in the same context. This is where the GPT-4 Advanced Data Analysis plugin failed for me.

Can someone explain why GPT-4 isn’t picking up LOVs with a similar meaning in this case, and suggest a prompt that will detect contextual meanings too?

Hi and welcome to the Developer Forum!

Looking at the prompt, you are telling the model to look for the LOVs (list of values?) in B2, but at no point do you tell it to look for anything similar or close to them. Also, how similar? How would you define the amount of sameness these values are permitted to have? Can they be 0.000001 different, or is a difference of 10000000 OK?
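To make the "how similar?" question concrete, here is a rough Python sketch of how a threshold changes what counts as a match. It uses plain word overlap, which is only a stand-in and not a real measure of meaning; the function names, the threshold values and the second LOV ("Weekly Treatment") are made up for illustration.

```python
import re


def tokens(text):
    """Lowercase a string and return its alphabetic words as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap_score(lov, text):
    """Fraction of the LOV's words that also appear in the text (0.0 to 1.0)."""
    lov_words = tokens(lov)
    if not lov_words:
        return 0.0
    return len(lov_words & tokens(text)) / len(lov_words)


def matching_lovs(lovs, text, threshold):
    """Keep the LOVs whose overlap score reaches the chosen threshold."""
    return [lov for lov in lovs if overlap_score(lov, text) >= threshold]


text = "Use this scrub daily"
# "Weekly Treatment" is a hypothetical LOV, added only for contrast.
lovs = ["Daily Maintenance/Daily Use", "Weekly Treatment"]

loose = matching_lovs(lovs, text, threshold=0.5)   # partial overlap is enough
strict = matching_lovs(lovs, text, threshold=1.0)  # every LOV word must appear
print(loose, strict)
```

With the loose threshold the "Daily Maintenance/Daily Use" LOV is kept, and with the strict one nothing is. The model faces the same choice when it is not told how close a match has to be.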

You need to be very exact about what you want the model to do and what restrictions there are on its ability to make decisions for itself.
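As a sketch of what "very exact" could look like, the snippet below assembles a prompt that spells out the matching rule, gives one worked example from the question, and states what to return. The function name build_prompt and the exact wording are assumptions for illustration, and there is no guarantee the model will then catch every contextual match.

```python
def build_prompt(attribute_text, lovs):
    """Assemble a prompt that states the matching rule explicitly.

    The wording is illustrative only; adjust it to your own data.
    """
    lov_lines = "\n".join(f"- {lov}" for lov in lovs)
    return (
        "You are given an attribute text and a list of values (LOVs).\n"
        "Return every LOV that the text mentions, either word for word or "
        "with the same meaning (synonyms, paraphrases, implied usage).\n"
        "Example: the text 'Use this scrub daily' matches the LOV "
        "'Daily Maintenance/Daily Use'.\n"
        "Do not return a LOV that the text does not support.\n"
        "Return only the matching LOVs, one per line, spelled exactly as "
        "in the list.\n\n"
        f"Attribute text:\n{attribute_text}\n\n"
        f"LOVs:\n{lov_lines}\n"
    )


prompt = build_prompt("Use this scrub daily", ["Daily Maintenance/Daily Use"])
print(prompt)
```

Passing the cell contents in as text like this, rather than asking the model to read cells A2 and B2 itself, also makes it easier to see exactly what the model was given.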


I was working on a system using GPT-3.5 that is capable of finding out what kind of techniques or programming languages are used in a dataset.

And it even found stuff like this:

<?php $skills = ['C#', ....

The task was to find the skills used in it, and since there was a variable called skills holding an array of "skills", it found those as well. I second that: you really need to be precise!