How do i create a custom model for text to sql queries?

I am planning to build a chatbot that works on specific types of data sets. The data set contains prices, promotions, and other details from a few websites, and the data is updated periodically in a Postgres database. How do I train a model where the customer’s natural language questions can be converted into SQL queries and then visualize it? Ideally, it generates an SQL query that I can run on something like metabase or redash and get charts and graphs. The questions can be like “How many products are out of stock today for each competitor on Amazon”, “How many products had a more than 10% price drop” etc.

I’d really appreciate any help

1 Like

Sounds like a pretty cool project.

You could try to make an instruction-based prompt for 100% LLM solution like:

You are an assistant that has access to a SQL database. The database has 3 columns: name, price, quantity. Please create a SQL script that fulfills the following request:

Request: {{input}}

That’s one way. The other way is to make a bunch of common functions and use something like langchain to chain the AI instructions together. That would be the more stable version that stays stable throughout AI upgrades. Only downside is you will have to make a few functions here and there. But you could add the most common first, then add more over time.

1 Like

Thanks for the suggestion. Let me try the first solution and see how it works. I already have a set of natural language questions and their corresponding SQL queries. Let me train and see how it works. I think I also need to build a feedback system. It means once I pass open ai-generated SQL to metabase API, and it causes an error ( because the query was wrong). Then it should regenerate a query and try again. Also, if there is a way to explain the query in natural language, that would be better, too, as it will help the user improve his prompting and get better results.

1 Like

Related thread A sanity check for future plugins to access private SQL databases

1 Like

You could put the code in a try catch statement:

def do_sql_stuff(sql_string_from_gpt):
    try:
        response = cursor.execute(sql_string_from_gpt)
        return response, True
    except Exception e:
        return e, False

def summarize_sql(prompt, command=None, error=None)
    full_prompt = prompt
    if error  != None:
        full_prompt = f"This ERROR occurred while fulfilling this REQUEST with this COMMAND. [REQUEST]{full_prompt}[END REQUEST]\n[COMMAND]{command}[END COMMAND]\n[ERROR]{error}[END ERROR]\nPlease correct the error in the SQL command."
    
    sql_command= ask_ai(full_prompt)
    response, success = do_sql_stuff(sql_command)
    return response, sql_command, success

Where the results from the summarize_sql will go through some sort of logic to determine if you want to retry again. Definitely make sure there is an attempt counter.

Seems like such an obvious problem that you’d think someone would have created this already. Oh yeah, I did :slight_smile: .

This is a command line tool to get you data from postgres.
The prompt is quite simple: ria/pagila2 at main · drorm/ria · GitHub
and the relevant code is in: ria/db.ts at main · drorm/ria · GitHub
The only bit where I can claim any cleverness is the way I generate the schema description for GPT. A real full size schema can get huge really quickly. I did some hacks to reduce the size.
Note that this was a quick and dirty project, so take everything with a grain of salt.

This is cool. I’ll try this and see how to make it work

Thanks @codie . I’ll try this and let you know what the progress is

Hey Tony,

I was doing some research for a project I am building and I accidently found out langchain has a SQL Chain already built out. It came up randomly and is not related to what I am doing but I was reminded of this post. So, I just wanted to hit you up and let you know.

1 Like

Sounds interesting from an Amazon Vendor’s perspective. If you are down to chat and dig deeper into this idea/product, feel free to reach out on Tgram at @exascaleESG.

hey @codie
I have tried using this langchain and I have described my project descriptions also
Can you please have a look at the query I posted : Creating a Chatbot using the data stored in my huge database

I guess you can help me in this

2 Likes

Hey @dror
I have tried to make a similar project
I have raise a query here : Creating a Chatbot using the data stored in my huge database

can you please have a look and guide me in some way
and may I know what is the accuracy level of the your English to SQL terminal converter you have made
Because I dont see much any lengthy prompts or so making the open ai understand to give a sql query which many of the online resources have suggested and which I have also tried.

Hey @abhi3hack,

As I mentioned above,

The prompt is quite simple: ria/pagila2 at main · drorm/ria · GitHub
and the relevant code is in: ria/db.ts at main · drorm/ria · GitHub

It worked on a variety of requests I tried, but I haven’t really tried to make this into a product. it was more of an experiment.