Prompt Engineering Help for Fuzzy Matching Reasoning

Hey all, I’m struggling to get this prompt right; any tips would be greatly appreciated:

My ultimate objective: for the LLM to reason over whether a close match vs. no match makes sense. It does a great job in most cases, but the current problem is that it never chooses N/A. I’m using GPT-3.5-Turbo; GPT-4 gets this right.
```
PROMPT:
You are an excellent matching expert. You can look at data and find the closest match to that data from a list. Your goal is to find the closest match to the data from the list of potential matches, if there is no close match write N/A or provide the correct closest match with no explanation. Take a deep breath, and you can do this.

For example:


## Data

The Group 5710 Meyerfield Court 2023-07-24

## Potential Matches

['The Group 9431 Turnberry Drive, Potomac, MD 20854 2023-04-17',

'The Group 9213 Potomac School Drive, Potomac, MD 20854 2023-07-25',

'Margie Halem Group 2807 Balliett Court, Vienna, VA 22180 2023-07-11',

'The Group 277 Gundry Drive, Falls Church, VA 22046 2023-07-10']

## Response

N/A

## Data

Andrew Addy 124 Bucktown Crossing Road Apt 31C, Pottstown, PA 19465 2023-04-07

## Potential Matches

['Andrew Addy 124 Bucktown Xing Rd, Pottstown, PA 19465 2023-04-07','Andrew Addy 104 Foster Ave, Upper Darby, PA 19082 2023-03-29','Andrew Addy 312 Long Ridge Ln, Exton, PA 19341 2023-03-02','Andrew Addy 3801 Davis Court, Chester Springs, PA 19425 2023-08-07','Andrew Addy 1206 Worthington Dr, Exton, PA 19341 2023-06-01']

## Response

Andrew Addy 124 Bucktown Xing Rd, Pottstown, PA 19465 2023-04-07
```


Try telling the AI to rank order things it needs to match so it never has to say “no match”. Or make it give a “match” score ranging from 0 to 100.

LLMs probably aren’t good at saying “no match” because it’s inherently a pattern matching system from the ground up. But if you tell it to order things or generate matching scores that’s an offer it cannot refuse. :slight_smile:
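A minimal sketch of that idea (function name and threshold are made up, not from this thread): ask the model to score every candidate from 0 to 100, then decide "N/A" yourself in code instead of asking the model to refuse.

```python
def pick_match(scored, threshold=70):
    """Return the best-scoring candidate, or "N/A" if nothing clears the bar.

    scored: list of (candidate_text, score) pairs parsed from the model's
    reply. The threshold is arbitrary; tune it on your own data.
    """
    best_text, best_score = max(scored, key=lambda pair: pair[1])
    return best_text if best_score >= threshold else "N/A"

# Made-up scores a model might return for the "Meyerfield Court" example:
scores = [
    ("The Group 9213 Potomac School Drive, Potomac, MD 20854 2023-07-25", 55),
    ("The Group 277 Gundry Drive, Falls Church, VA 22046 2023-07-10", 40),
]
print(pick_match(scores))  # prints "N/A": no candidate clears 70
```

The point is that the refusal decision moves out of the pattern-matcher and into deterministic code, where you control the cutoff.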

Thanks @wclayf. I’ll give that a shot. Would I be better off fine-tuning a model?

You don’t need to fine-tune for this; I would just describe in what way they are supposed to match. Also, you’re telling it to be an excellent matching expert, which, if it abides and is a true expert, would almost never find a “no match” case, because it’s supposed to be an expert matcher ;).
Also, this sounds a lot like fuzzy set theory in linguistics. I feel like there was another name for it I learned, but I can’t think of it right now.
You can also see relevant similarities between model outputs using cosine similarity functions.
Once you understand those, it’s easier to guide the model to perform those tests. I would start there and see if that achieves what you want.
Article here:
https://www.researchgate.net/publication/234784106_A_fuzzy_sets_based_linguistic_approach_Theory_and_applications
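For reference, cosine similarity between two embedding vectors is just the normalized dot product; a plain-Python sketch, independent of any particular embedding model:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b: 1.0 means same
    direction, 0.0 means orthogonal (unrelated), -1.0 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Note that it measures direction only, so a vector and a scaled copy of it score 1.0.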

Remember, there are different ways to “match” data and text strings: some linguistic, some computational, syllabic, word count, token count/token complexity, etc. There’s no single obvious way for the model to know how it’s supposed to match the data. You have to describe that yourself in the prompt.

Hope this helps!

@Macha thank you so much for the additional perspective.

1 Like

Just FYI, I agree with @Macha 100%, and especially what he said about using a VectorDb with cosine similarity.

1 Like

@wclayf @Macha I’m using cosine similarity to get the top 5 potential matches, and then I want the LLM to reason about which is the best match.
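That two-stage pipeline can be sketched roughly like this (the embedding step is assumed to happen elsewhere; only the shortlist-and-prompt glue is shown, and the function names are hypothetical):

```python
def shortlist(query_vec, candidates, similarity, k=5):
    """candidates: (text, vector) pairs already embedded elsewhere.
    similarity: any scoring function, e.g. cosine similarity on embeddings."""
    ranked = sorted(candidates, key=lambda c: similarity(query_vec, c[1]),
                    reverse=True)
    return [text for text, _vec in ranked[:k]]

def build_prompt(data, matches):
    """Assemble the ## Data / ## Potential Matches body for the LLM step."""
    listed = ",\n".join(f"'{m}'" for m in matches)
    return (f"## Data\n\n{data}\n\n"
            f"## Potential Matches\n\n[{listed}]\n\n"
            f"## Response\n")
```

The embedding retrieval narrows the field cheaply, and the LLM only has to reason over five candidates instead of the whole corpus.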

1 Like

I think that will be an excellent use of AI and embeddings, it adds an extra layer but I think you will get good results.

1 Like

I’m glad I could help! I appreciate the feedback.

After this conversation, I went back and now I actually notice there are at least 3 ways “closest match” can be done: Temporally (when), Spatially (where), and Semantically (word meanings).

Are you telling the AI which of these three matching criteria determine the matching score?

1 Like

I’m not. Any ideas how to work that feedback into my prompt?

1 Like

Welcome to the complexity of language and why I love linguistics!
The answer to that question depends precisely on what you’re trying to look for and why you’re comparing the data.
What you’re trying to do is something that looks very easy on the surface, but is much, much more complex once you start taking a more intricate look at the problem. I just finished my BA in Applied Linguistics (before ChatGPT got widely released, of course), but this is why I wanted to study linguistics and how language works. It’s not easy.
This is also why cosine similarity can still be difficult to interpret and is a function typically performed by NLP researchers.
I can make my best educated guess that you are looking for semantic similarity. That’s typically what most people are looking for. Thankfully, semantic similarity search is actually a thing!
You can also pick one of the categories and express that it must strictly match based on that category (like matching them semantically), and ask it to explain its reasoning. By asking for its reasoning, you can see how it decided to match them based on your selected criteria, and adjust as necessary. Or keep asking on the forum!
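One way to work that in (purely illustrative wording, not a tested prompt) is to name the matching criterion explicitly and demand the reasoning before the answer:

```python
# Hypothetical instruction block naming the criterion (semantic) and the
# tolerance rules, and requiring reasoning before the verdict.
MATCH_INSTRUCTION = (
    "Match the data to the candidates SEMANTICALLY: treat abbreviations "
    "('Xing' vs 'Crossing', 'Rd' vs 'Road') as equivalent, and require the "
    "name, street, city, state, and zip to refer to the same real-world "
    "entity. Ignore date differences of a few days.\n"
    "First explain your reasoning step by step, then output either the "
    "single best candidate verbatim, or N/A if none refers to the same "
    "entity."
)
```

Making the "same real-world entity" test explicit gives the model a concrete reason to output N/A instead of defaulting to the nearest string.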

I’m quite literally in the process of providing and posting some of my own prompt techniques on this forum as well. So, if you’re still struggling, I’m hoping to start posting new prompts and guides for people to try out that -may- help you out.

The perfect prompt engineer will soon be the AI itself, but before that happens the best person for the task would be a combination of an English Language expert and an Interrogator. Language understanding with Neurolinguistic Programming elements.

1 Like

Totally agree with you :+1:

Test using this model: gpt-3.5-turbo-16k-0613

Topics: Potential Matches for Data

Message From ChatGPT:

You are an excellent matching expert. You can look at data and find the closest match to that data from a list. Your goal is to find the closest match to the data from the list of potential matches, if there is no close match write N/A or provide the correct closest match with no explanation. Take a deep breath, and you can do this.

Message From ChatGPT:

You are an assistant that Potential Matches for Data

Message From You:

## Data

The Group 5710 Meyerfield Court 2023-07-24

## Potential Matches

['The Group 9431 Turnberry Drive, Potomac, MD 20854 2023-04-17',

'The Group 9213 Potomac School Drive, Potomac, MD 20854 2023-07-25',

'Margie Halem Group 2807 Balliett Court, Vienna, VA 22180 2023-07-11',

'The Group 277 Gundry Drive, Falls Church, VA 22046 2023-07-10']

## Response

N/A

## Data

Andrew Addy 124 Bucktown Crossing Road Apt 31C, Pottstown, PA 19465 2023-04-07

## Potential Matches

['Andrew Addy 124 Bucktown Xing Rd, Pottstown, PA 19465 2023-04-07','Andrew Addy 104 Foster Ave, Upper Darby, PA 19082 2023-03-29','Andrew Addy 312 Long Ridge Ln, Exton, PA 19341 2023-03-02','Andrew Addy 3801 Davis Court, Chester Springs, PA 19425 2023-08-07','Andrew Addy 1206 Worthington Dr, Exton, PA 19341 2023-06-01']

## Response

Andrew Addy 124 Bucktown Xing Rd, Pottstown, PA 19465 2023-04-07

Message From ChatGPT:

The closest match for the first data is:

‘The Group 9213 Potomac School Drive, Potomac, MD 20854 2023-07-25’

The closest match for the second data is:

‘Andrew Addy 124 Bucktown Xing Rd, Pottstown, PA 19465 2023-04-07’

@Foxabilo I’d argue even further that it’s a good combo of a language expert and an investigator. I feel like that explains my natural knack for prompting better, but interrogator also works. Maybe inquisitor?

1 Like

Inquisitor is the perfect blend, I think, plus it has a 40K ring to it, which is a bonus.

2 Likes

Totally agree. Stephen Wolfram has commented in a number of recent-ish interviews and lectures that expository writing is the key skill to employ. That jibes completely with my own experience. If I get lazy, my interactions become less useful. Maintaining a high level of intent and consistency with prompt flows goes a long way.

Unpacking interrogation a bit here – I approach this using reflexion, and then “adversarial hypotheticals”. E.g., “Thank you, this legal document looks good. Now, I’d like you to take the hypothetical position as the counter-party’s attorney…”

1 Like

This is one of the reasons I agree with many at OpenAI that prompt engineering will be a short term occupation, anything that requires constant effort to maintain a high level of competency in will tend to become quite a niche job.

The AIs are already very good at prompting themselves, and we are working with the absolute worst AI will ever be. I think it will be less than 12-18 months before prompting is mostly done by the models themselves with very little effort required; the AIs will understand the context and what would be expected for a given situation and will do so automatically unless given corrective instruction, much the same as you let a competent colleague just get on with a job when you know they grasp the task at hand.

2 Likes

It seems to be almost necessary that the AIs prompt themselves, especially given the trend in mixture models and then the ever increasing need for multi-model orchestration dressed up as internal dialogue.

Once they set these models on a permanent run loop that has them either focusing on their own input/output internal dialogue, sensory input, or user input, the need for the models to self-prompt will grow.

1 Like