Hi! I am currently trying to transform my schema into a format that works well with davinci-002, so I can ask the model to translate natural-language requests into SQL queries. This works well for small databases, but when I try to fit in my corporate DB structure, even the densest format produces over 20k tokens (!), and there is no way I can reduce it much. Over time it will only grow. This is the general format of my completion prompt:
### POSTGRES SQL tables, with their properties:
# table1 (column1.1, column1.2)
# table2 (column2.1, column2.2)
...
INPUT: give me all users
OUTPUT: SELECT * from users
...
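For reference, this is roughly how I assemble that prompt — just a minimal sketch with hypothetical table/column names, where the schema is a plain dict mapping table names to column lists:

```python
# Sketch of my prompt builder: schema is a dict of table -> list of columns.
# Table/column names here are hypothetical placeholders.
def build_prompt(schema, request):
    lines = ["### POSTGRES SQL tables, with their properties:"]
    for table, columns in schema.items():
        # one "# table (col1, col2, ...)" line per table
        lines.append(f"# {table} ({', '.join(columns)})")
    lines.append(f"INPUT: {request}")
    lines.append("OUTPUT:")  # the model completes from here
    return "\n".join(lines)

schema = {"users": ["id", "name"], "orders": ["id", "user_id"]}
print(build_prompt(schema, "give me all users"))
```

With hundreds of tables, that `for` loop is exactly what blows the prompt past 20k tokens.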
I have searched this forum for a solution but have not found anything yet. Somewhere, someone stated that fine-tuning is not the way to go for teaching the model a schema. Has anyone else tried the same thing and succeeded?
Ideally, I would tell OpenAI once what my DB looks like, and then on every completion request send only the minimum context needed: the actual user request.
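In the meantime, the closest workaround I can think of is to filter the schema per request and only include tables that look relevant. Here is a minimal, purely illustrative sketch using naive word overlap between the request and table/column names (function and variable names are my own; a real setup would probably use embeddings instead):

```python
# Naive relevance filter: keep only tables whose name or columns share
# words with the request. Substring matching is crude and will produce
# false positives; this is a sketch, not a robust retriever.
def relevant_tables(schema, request, max_tables=5):
    words = set(request.lower().split())
    scored = []
    for table, columns in schema.items():
        names = {table.lower(), *(c.lower() for c in columns)}
        score = sum(1 for w in words if any(w in n or n in w for n in names))
        if score:
            scored.append((score, table))
    scored.sort(reverse=True)  # highest-overlap tables first
    return [t for _, t in scored[:max_tables]]

schema = {"users": ["id", "name"], "invoices": ["id", "total"]}
print(relevant_tables(schema, "give me all users"))  # → ['users']
```

That way the prompt only ever carries a handful of tables instead of the whole corporate schema, at the cost of occasionally missing a table the query actually needs.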