What is your approach for identifying categories when you do Text to SQL? A super simple example, you have a table of musical records, and one of the fields is TypeId, which references a Category table that has all the types (Rock, Pop, Rap…)
When a user asks: give me a list of all rock&pop records, the LLM will join the category table and will say where Title =‘rock&pop’ where ideally it should be Title in (‘rock’,‘pop’) + of course a typos in user request. In our app, types are much longer and have different aberrations. Any suggestions on how to tackle this problem?
Not an expert on Prompting but this is how I will go about it. I will first, with the Chat Completion request send additional System Role Prompt where I will tell the Model that these are the only Type of Music we have in the table. Like
Rap
Rock
Pop
Then, I will add in instruction that, Do not construct a SQL that queries outside these types. Also, add that whenever there is a type which is not mentioned in the instruction, see if it is combination of two different types and if is, divide that type into two acceptable ‘types’ and use them in “IN” operator in SQL.
You’ll have to use better prompting with correct roles and you’ll be able to achieve it.
That is very interesting. Thank you for giving me an idea.
What I’m going to try is to send the request to pre-prompt, pass on all possible categories and ask the LLM to re-write the prompt to so it uses the most matching category, plus what you said re splitting ,and only then to pass it to Text to SQL