Hi! I am currently trying to transform my schema into a format that works well with davinci-002, so I can ask the model to translate natural-language requests into SQL queries. This works well for small databases, but when I try to fit in my corporate DB structure, even the densest format produces over 20k tokens (!), and there is no way I can reduce it much. Over time it will only grow. This is the general format of my completion prompt:
### POSTGRES SQL tables, with their properties:
# table1 (column1.1, column1.2)
# table2 (column2.1, column2.2)
...
INPUT: give me all users
OUTPUT: SELECT * from users
...
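For reference, this is roughly how I assemble that prompt — just a minimal sketch with hypothetical table/column names, where the schema is a plain dict mapping table names to column lists:

```python
# Sketch of my prompt builder: schema is a dict of table -> list of columns.
# Table/column names here are hypothetical placeholders.
def build_prompt(schema, request):
    lines = ["### POSTGRES SQL tables, with their properties:"]
    for table, columns in schema.items():
        # one "# table (col1, col2, ...)" line per table
        lines.append(f"# {table} ({', '.join(columns)})")
    lines.append(f"INPUT: {request}")
    lines.append("OUTPUT:")  # the model completes from here
    return "\n".join(lines)

schema = {"users": ["id", "name"], "orders": ["id", "user_id"]}
print(build_prompt(schema, "give me all users"))
```

With hundreds of tables, that `for` loop is exactly what blows the prompt past 20k tokens.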
I have searched this forum for a solution but have not found anything yet. Somewhere, someone stated that fine-tuning is not the way to go for teaching the model a schema. Has anyone else tried the same thing and succeeded?
Ideally, I would tell OpenAI once what my DB looks like, and then on every completion request send only the minimum context needed: the actual user request.
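In the meantime, the closest workaround I can think of is to filter the schema per request and only include tables that look relevant. Here is a minimal, purely illustrative sketch using naive word overlap between the request and table/column names (function and variable names are my own; a real setup would probably use embeddings instead):

```python
# Naive relevance filter: keep only tables whose name or columns share
# words with the request. Substring matching is crude and will produce
# false positives; this is a sketch, not a robust retriever.
def relevant_tables(schema, request, max_tables=5):
    words = set(request.lower().split())
    scored = []
    for table, columns in schema.items():
        names = {table.lower(), *(c.lower() for c in columns)}
        score = sum(1 for w in words if any(w in n or n in w for n in names))
        if score:
            scored.append((score, table))
    scored.sort(reverse=True)  # highest-overlap tables first
    return [t for _, t in scored[:max_tables]]

schema = {"users": ["id", "name"], "invoices": ["id", "total"]}
print(relevant_tables(schema, "give me all users"))  # → ['users']
```

That way the prompt only ever carries a handful of tables instead of the whole corporate schema, at the cost of occasionally missing a table the query actually needs.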