Openai text to sql not working

ysomaiya003 · June 27, 2023, 6:22am

Team,

Trying to build Natural Language to sql generator , however both text-davinci-003 and gpt-3.5-turbo not able to generate accurate queries, i am passing proper datamodel in the input prompt,also customizing the input but no luck.

Can anyone suggest what could be next step gpt4? or any option to custom train gpt model?

Foxalabs · June 27, 2023, 6:26am

Welcome to the Forum!

Can you post some examples of your prompts and the replies generated by the model?

ysomaiya003 · June 27, 2023, 6:37am

Below is the input & output, first it was not creating proper query so i added in prompt “use CTE if query is complex” and bingo 90% work done, it only fails because not able to add proper table alias ,but when i added in input prompt do proper aliasing it terribly failed

example input

consider below tables and its properties
time_dimension(order_id integer primary key,order_date string,year integer,quarter integer,month integer)
employee_dimension(emp_id interger primary key,emp_name varchar2(20),title varchar2(20),department varchar2(20),region varchar(10))
sales_fact(order_id integer foreign key,emp_id integer foreign key,total integer,quantity integer,discount integer)
Generate SQL query to display monthwise employee who recorded the highest sales for last 3 months including current month as per Apache Presto syntax and type cast order_date column to date, if query is complex use CTE

example output

WITH highest_sales AS ( SELECT sd.month, sd.year, ed.emp_name, SUM(sf.total) AS total_sales FROM sales_fact sf JOIN time_dimension td ON sf.order_id = td.order_id JOIN employee_dimension ed ON sf.emp_id = ed.emp_id WHERE td.order_date >= DATE_FORMAT(DATE_TRUNC(‘month’, CURRENT_DATE - INTERVAL ‘3’ month), ‘%Y-%m-%d’) AND td.order_date < DATE_FORMAT(DATE_TRUNC(‘month’, CURRENT_DATE), ‘%Y-%m-%d’) GROUP BY sd.month, sd.year, ed.emp_name ), ranked_sales AS ( SELECT month, year, emp_name, total_sales, ROW_NUMBER() OVER (PARTITION BY month, year ORDER BY total_sales DESC) AS rank FROM highest_sales ) SELECT month, year, emp_name, total_sales FROM ranked_sales WHERE rank = 1 ORDER BY year, month

Foxalabs · June 27, 2023, 7:39am

Is this with GPT-4 or 3.5? GPT-4 is more capable of following longer sets of instructions.

Additionally, you may be asking too many steps in one prompt. There are inference depth limits with models.

ysomaiya003 · June 27, 2023, 4:25pm

Ohh…its gpt-3.5-turbo-16k model, will surely try with gpt-4.

since gpt-4 is a paid one was trying to exercise caution and also wanted to check if fine tuning will work

But Definitely this information helps

Topic		Replies	Views
Seeking Advice on Fine-Tuning NLP Model for Response Generation Community gpt-35-turbo	3	97	February 24, 2025
Text to SQL generation API	14	12164	December 24, 2023
Converting Natural language to SQL query - Which model is best GPT-3.5 or GPT-4 API	3	3100	October 10, 2023
How do i create a custom model for text to sql queries? API	13	11800	December 18, 2023
How to get the constant answer for the same prompt while converting natural language to SQL query API	1	705	October 12, 2023

Openai text to sql not working

Related topics