I am planning to build a chatbot that works on specific types of data sets. The data set contains prices, promotions, and other details from a few websites, and the data is updated periodically in a Postgres database. How do I build a model that converts a customer's natural-language questions into SQL queries and then visualizes the results? Ideally, it generates an SQL query that I can run on something like Metabase or Redash to get charts and graphs. The questions can be like "How many products are out of stock today for each competitor on Amazon", "How many products had a more than 10% price drop", etc.
You could try an instruction-based prompt for a 100% LLM solution, like:
You are an assistant that has access to a SQL database. The database has 3 columns: name, price, quantity. Please create a SQL script that fulfills the following request:
Request: {{input}}
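As a minimal sketch of that prompt-template approach: the table and column names below are placeholders (substitute your real schema), and `build_prompt` is just a hypothetical helper whose output you would pass to your LLM client of choice.

```python
# Placeholder schema -- replace with the actual tables/columns
# from your Postgres database.
PROMPT_TEMPLATE = (
    "You are an assistant that has access to a SQL database. "
    "The database has a table `products` with 3 columns: name, price, quantity. "
    "Please create a SQL script that fulfills the following request:\n"
    "Request: {question}\n"
    "Return only the SQL, with no explanation."
)

def build_prompt(question: str) -> str:
    """Fill the user's natural-language question into the template."""
    return PROMPT_TEMPLATE.format(question=question)
```

The resulting string is what you send as the LLM prompt; asking for "only the SQL" makes the response easier to pipe straight into Metabase or Redash.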
That’s one way. The other way is to build a library of common functions and use something like LangChain to chain the AI instructions together. That version is more stable and stays that way across AI upgrades. The only downside is that you will have to write a few functions here and there, but you could add the most common ones first and then add more over time.
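A hedged sketch of that "library of common functions" idea: each canned question maps to a parameterized SQL template, so the LLM (or even simple keyword matching) only has to pick an intent and fill in parameters instead of writing raw SQL from scratch. The intent names, table, and columns here are invented for illustration.

```python
# Hypothetical library of pre-written, parameterized queries.
# Column/table names are assumptions -- adapt to your schema.
COMMON_QUERIES = {
    "out_of_stock_by_competitor": (
        "SELECT competitor, COUNT(*) FROM products "
        "WHERE quantity = 0 AND scraped_on = %(day)s GROUP BY competitor"
    ),
    "price_drop_over_pct": (
        "SELECT COUNT(*) FROM products "
        "WHERE (old_price - price) / old_price > %(pct)s"
    ),
}

def lookup_query(intent: str, params: dict):
    """Return a (sql, params) pair for a known intent, or None."""
    sql = COMMON_QUERIES.get(intent)
    return (sql, params) if sql else None
```

Because the SQL is fixed and only parameters vary, this path is much less sensitive to model upgrades than free-form SQL generation.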
Thanks for the suggestion. Let me try the first solution and see how it works. I already have a set of natural-language questions and their corresponding SQL queries, so let me train on those and see how it goes. I think I also need to build a feedback system: once I pass the OpenAI-generated SQL to the Metabase API and it causes an error (because the query was wrong), it should regenerate the query and try again. Also, if there is a way to explain the query in natural language, that would be better too, as it will help users improve their prompting and get better results.
```python
def do_sql_stuff(sql_string_from_gpt):
    # `cursor` is assumed to be an already-open database cursor
    # (e.g. from psycopg2), and `ask_ai` below a wrapper around the LLM API.
    try:
        response = cursor.execute(sql_string_from_gpt)
        return response, True
    except Exception as e:
        return e, False

def summarize_sql(prompt, command=None, error=None):
    full_prompt = prompt
    if error is not None:
        full_prompt = (
            f"This ERROR occurred while fulfilling this REQUEST with this COMMAND. "
            f"[REQUEST]{full_prompt}[END REQUEST]\n"
            f"[COMMAND]{command}[END COMMAND]\n"
            f"[ERROR]{error}[END ERROR]\n"
            f"Please correct the error in the SQL command."
        )
    sql_command = ask_ai(full_prompt)
    response, success = do_sql_stuff(sql_command)
    return response, sql_command, success
```
The results from summarize_sql then go through some sort of logic to determine whether you want to retry. Definitely make sure there is an attempt counter.
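The retry-with-attempt-counter logic above can be sketched like this. `ask_ai` and `run_sql` are injected as parameters (both are assumptions standing in for your LLM call and database executor) so the loop is easy to stub out and test; `run_sql` is expected to raise on a bad query.

```python
# Cap on regeneration attempts -- tune to taste.
MAX_ATTEMPTS = 3

def query_with_retries(prompt, ask_ai, run_sql):
    """Retry loop: feed the failed command and its error back to the LLM."""
    command, error = None, None
    for _attempt in range(MAX_ATTEMPTS):
        full_prompt = prompt
        if error is not None:
            full_prompt = (
                f"This ERROR occurred while fulfilling this REQUEST with this COMMAND. "
                f"[REQUEST]{prompt}[END REQUEST]\n"
                f"[COMMAND]{command}[END COMMAND]\n"
                f"[ERROR]{error}[END ERROR]\n"
                f"Please correct the error in the SQL command."
            )
        command = ask_ai(full_prompt)
        try:
            return run_sql(command), command, True
        except Exception as e:
            error = e
    return error, command, False
```

On success it returns the result set; after MAX_ATTEMPTS failures it gives up and returns the last error, which you could surface to the user along with the last attempted SQL.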
Seems like such an obvious problem that you’d think someone would have created this already. Oh yeah, I did.
This is a command line tool to get you data from postgres.
The prompt is quite simple: ria/pagila2 at main · drorm/ria · GitHub
and the relevant code is in: ria/db.ts at main · drorm/ria · GitHub
The only bit where I can claim any cleverness is the way I generate the schema description for GPT. A real full size schema can get huge really quickly. I did some hacks to reduce the size.
Note that this was a quick and dirty project, so take everything with a grain of salt.
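To illustrate the schema-shrinking idea in general terms (this is a guess at the approach, not the format ria actually uses): one simple trick is to emit one line per table with column names only, dropping types, constraints, and indexes that eat tokens without helping the model.

```python
# Hypothetical compact schema serializer. The input shape
# (table name -> list of (column, type) pairs) is an assumption.
def compact_schema(tables: dict) -> str:
    """Render a token-cheap, one-line-per-table schema description."""
    lines = []
    for table, cols in tables.items():
        names = ", ".join(col for col, _type in cols)
        lines.append(f"{table}({names})")
    return "\n".join(lines)
```

A full `pg_dump`-style schema for a real database can easily blow past the context window, so this kind of lossy summary is often a necessary trade-off.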
I was doing some research for a project I am building and accidentally found out that LangChain already has a SQL chain built out. It came up randomly and is not related to what I am doing, but it reminded me of this post, so I just wanted to hit you up and let you know.
Sounds interesting from an Amazon Vendor’s perspective. If you are down to chat and dig deeper into this idea/product, feel free to reach out on Tgram at @exascaleESG.
Can you please have a look and guide me in some way?
And may I know the accuracy level of the English-to-SQL terminal converter you have made?
I ask because I don't see any lengthy prompts for making OpenAI understand that it should output a SQL query, which many online resources have suggested and which I have also tried.