I am developing an AI chatbot application that generates SQL queries from natural-language inputs. The database schema I’m working with consists of 670 tables, which comes to nearly 1,724,789 tokens in total. Because of the model’s context-size limit, I can’t pass the entire schema with every query. I’m looking for guidance on the best approach to handle this effectively.
Hi @vinchurkarp77 and welcome to the community! Text-to-SQL is a very active area of research. Glancing at your problem, I’d say you are dealing with too many tables at once for this to work well. I would try to cluster the problem, so the model only ever sees one cluster of related tables at a time; joins across clusters become tricky, of course. Another common pattern is to retrieve only the tables relevant to each question and pass just those schemas (see the sketch below). My advice is to start small with 2-3 tables and test that first. And normally you have to do a lot more than just feed the schemas, but as I said, it’s a very active area of research.
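Here is a minimal sketch of the retrieval idea, assuming the sentence-transformers library and an off-the-shelf embedding model: embed a short schema snippet per table once, then for each question keep only the top-k most similar tables in the prompt instead of all 670. The table definitions, model name, and helper functions below are placeholders for illustration, not your actual schema or a specific recommendation.

```python
# Sketch: schema retrieval for text-to-SQL over a large database.
# Assumes: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

# Hypothetical schema snippets: one DDL-like string per table.
table_schemas = {
    "orders": "CREATE TABLE orders (order_id INT, customer_id INT, order_date DATE, total DECIMAL)",
    "customers": "CREATE TABLE customers (customer_id INT, name TEXT, city TEXT)",
    "products": "CREATE TABLE products (product_id INT, name TEXT, price DECIMAL)",
    # ... remaining tables
}

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose embedder
names = list(table_schemas)
# Precompute table embeddings once; reuse them for every incoming question.
schema_embeddings = model.encode([table_schemas[n] for n in names])

def relevant_tables(question: str, top_k: int = 10) -> list[str]:
    """Return the top_k table names whose schemas are most similar to the question."""
    q_emb = model.encode([question])
    scores = util.cos_sim(q_emb, schema_embeddings)[0]  # cosine similarity per table
    ranked = sorted(zip(names, scores.tolist()), key=lambda x: -x[1])
    return [name for name, _ in ranked[:top_k]]

def build_prompt(question: str) -> str:
    """Assemble a prompt containing only the retrieved schemas, not the full database."""
    tables = relevant_tables(question)
    schema_block = "\n".join(table_schemas[t] for t in tables)
    return f"Given these tables:\n{schema_block}\n\nWrite a SQL query for: {question}"

print(build_prompt("Total revenue per customer city last month"))
```

In practice you would tune top_k, enrich the snippets with column descriptions and sample values, and validate that the retrieved tables actually cover the joins the question needs, but the core idea is just to shrink the schema before it ever reaches the model.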
Thank you for your interest. I’ve successfully tested this approach with smaller schemas of 4-6 tables, and it worked well. However, for a schema this large, I’m struggling to find an approach that handles it effectively.