Is there a way to feed semi-structured data into one bot that provides more accurate results, similar to a RAG model? Alternatively, is there a better approach to retrieving answers from tables in the database?
I want to create a chatbot that can provide answers by retrieving data from my resources. I have two types of data: structured data (in tables on the Databricks platform) and unstructured data (PDF, DOC, CSV files, etc.).
I am considering using a Retrieval-Augmented Generation (RAG) model for unstructured data and creating a SQL agent for structured data (e.g., using a tool like Databricks Genie).
However, I would like to develop a single model that can answer my queries by retrieving data from both structured and unstructured sources. After testing Genie, I found that SQL-based agents do not perform well because they require detailed information about each table, such as the meaning of columns and the values they contain. Is there any better way to deal with tables?
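To make the idea concrete, here is a minimal sketch of the kind of setup I have in mind, not something I have running. Each table is described by a plain-text "card" (column meanings plus a few sample values) and indexed next to the document text, so that one retrieval step decides whether a question should go to the SQL side or the document side. Everything in it is a made-up placeholder: the table `sales_orders`, the file `refund_policy.pdf`, and the helpers `table_card` and `route` are hypothetical, and the word-overlap scoring is only a stand-in for a real embedding search.

```python
from dataclasses import dataclass
import re


@dataclass
class Source:
    name: str
    kind: str          # "table" or "document"
    description: str   # text that would be indexed for retrieval


def table_card(name, columns):
    """Render a table's column meanings and sample values as plain text."""
    lines = [f"Table {name}"]
    for col, (meaning, samples) in columns.items():
        lines.append(f"{col}: {meaning} (e.g. {', '.join(samples)})")
    return Source(name=name, kind="table", description="\n".join(lines))


def tokens(text):
    """Lowercase alphanumeric word set; underscores split column names."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(question, sources):
    """Pick the source whose description shares the most words with the question.

    Word overlap is an assumption standing in for vector similarity.
    """
    q = tokens(question)
    return max(sources, key=lambda s: len(q & tokens(s.description)))


# Hypothetical sources: one table card and one document chunk.
sources = [
    table_card("sales_orders", {
        "order_total": ("order value in USD", ["19.99", "250.00"]),
        "region": ("sales region code", ["EMEA", "APAC"]),
    }),
    Source("refund_policy.pdf", "document",
           "Customers may request a refund within 30 days of purchase."),
]

picked = route("What is the total order value for the EMEA region?", sources)
print(picked.kind, picked.name)
```

If the routed source is a table, its card would be passed to the SQL-generating step as context; if it is a document, the matching text would go to the usual RAG prompt. The part I am unsure about is whether maintaining such cards for every table is avoidable.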
Thanks in advance!