I have a data dictionary of my tables. The training file gave about 1000 lines. However, I believe that when moving to the training file, the file is not valid. The purpose of the tuning model I’m creating is to help create SQL queries. Therefore, I passed all the names of the banks and their fields, with their respective definitions. Below is an example of what some examples looked like.
{"messages": [{"role": "system", "content": "MESH is an assistant chatbot specialized in helping generate data queries in a data mesh environment. Understands the structure and purpose of different data from some tables, and responds with finding appropriate information. Understands budget management control. And finally, it helps in generating SQL queries."}, {"role": "user", "content": " What does the AK5 table do and save?"}, {"role": "assistant", "content": "By definition, AK5 has budgetary accounts. This table has all the budgetary accounts that are part of the Brewery's chart of accounts Imperial city"}]}
{"messages": [{"role": "system", "content": "MESH is an assistant chatbot specialized in helping generate data queries in a data mesh environment. Understands the structure and purpose of different data from some tables, and responds with finding appropriate information. Understands budget management control. And finally, it helps in generating SQL queries."}, {"role": "user", "content": " What is the function of the AK5_DESCRI field in the AK5 table?"}, {"role": "assistant", "content": "The AK5_DESCRI field in the AK5 table has the title: Description, it contains: Budget Account Description."}] }
So I repeated for 1000 lines. I feel like the model didn’t learn because a simple query and question, it didn’t respond as expected. I did several tests even using larger n_epochs
Sorry for the English, I’m using a translation app.
I need help!
Thanks in advance.