I have been using LangChain + the OpenAI API in Python to convert natural language text into SQL queries and results. However, it always throws an error when the SQL table is large.
I have been following the "SQL Chain example — 🦜🔗 LangChain 0.0.150" article for this purpose, but I always get token limit (4096) errors. Even when I just ask it to count the total customers, it still throws an error. There are around 5k customers in my table; it should not be loading all of them in the background, so I am not sure why it hits the token limit.
db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)
db_chain.run("How many customers are there?")
Any idea what can be done to avoid this? Would Pinecone be helpful in this scenario? If so, can someone share an article on it? I have multiple large databases and I make the connection dynamically based on user login.
I think it tries to load your entire database schema, which exceeds the token limit. Here are a couple of ways to work around it:
1. Specify only the relevant tables you want to load in your chain, i.e. the ones most of your queries would come from.
2. Use the SQLDatabaseSequentialChain, which determines which tables to use based on the query. You may still get token limit errors if the selected table schemas exceed the token limit.
3. Finally, you can reduce the default number of sample rows loaded during the database connection (the default is 3). This should reduce the tokens as well; see the sketch after this list.
These workarounds helped me stay within the token limits.
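For what it's worth, here is a minimal sketch of the first and third points (the connection URI and table name are placeholders for your own setup, and the imports follow the 0.0.150 docs):

from langchain import OpenAI, SQLDatabase, SQLDatabaseChain

# Connect with only the tables you actually need and skip the sample rows
# (default is 3), so the schema section of the prompt stays small.
db = SQLDatabase.from_uri(
    "sqlite:///./mydb.db",            # placeholder URI; use your own connection string
    include_tables=["Customer"],      # only the relevant table(s)
    sample_rows_in_table_info=0,      # 0 drops the example rows from the prompt
)

llm = OpenAI(temperature=0)
db_chain = SQLDatabaseChain(llm=llm, database=db, verbose=True)
db_chain.run("How many customers are there?")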
Sorry, I am just seeing your replies. I have around 150+ tables and each table has around 100+ columns. When I tell it to query only the customer table, it seems to work fine, but when I ask how many invoices were generated for customer xyz and tell it to use the invoice table as well, it runs into the token limit issue. I am not even passing a sample row with the request. Maybe it is just a limitation and nothing more can be done about it. The GPT-4 API, with its 32K token limit, might be useful for these kinds of scenarios.
Okay. Have you tried using an agent for this task? I find that agents are able to query the relevant tables on their own with some good prompt engineering.
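Roughly something like this, using the SQL agent toolkit that ships with LangChain (a rough sketch reusing the db and llm objects from above; I haven't run it against your schema):

from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit

# The agent inspects the schema on its own and decides which tables to query,
# so you don't have to list every join target up front.
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent_executor = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)
agent_executor.run("How many invoices were generated for customer xyz?")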
Can you share how you're loading the database/tables into the chain? Perhaps I can guide you from there. I suspect that doing the database load properly may at least get some of your queries working.
db = SQLDatabase.from_uri("mssql+pymssql://user1:******@.\SQLEXPRESS/mytestdb", include_tables=['Contact', 'Customer'], sample_rows_in_table_info=0)
llm = OpenAI(temperature=0.0, max_tokens=350)
chain = SQLDatabaseSequentialChain.from_llm(llm, db, verbose=True, top_k=1)
chain.run("How many contact records created for customer kevin")
This model’s maximum context length is 4097 tokens, however you requested 5368 tokens (5018 in your prompt; 350 for the completion). Please reduce your prompt; or completion length.
Hi. Could you please provide any reference links on how to use the SQLDatabaseSequentialChain? I am unable to find a guide on how to use it.
Moreover, if I am using a relational database, will I need to specify all the tables that are expected to be involved in the SQL query using INNER JOINs?
I have an issue: when calling the database it throws an auth error and returns an empty message. How are you taking the data for input? And how are you providing the API key? Are you setting any environment variable?
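In case it helps with the auth error: LangChain's OpenAI wrapper reads the key from the OPENAI_API_KEY environment variable, so setting it before creating the LLM is usually enough (the key string below is a placeholder):

import os

# The OpenAI client used by LangChain picks the key up from this variable.
os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder; substitute your real key

from langchain import OpenAI
llm = OpenAI(temperature=0)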
Were you able to get a solution to your problem?
I have a similar problem that I'm trying to solve.
How can I chunk the database (or the response) so it fits within the token limits?
I have tried many ways but am not getting results.
Did you find a solution or not?
Are you also stuck on this issue?