I feel like my fine-tuned model isn't "learning"

I have a data dictionary of my tables, and the training file I built from it came to about 1000 lines. However, I suspect that something went wrong when I converted it into the training file and that the file may not be valid. The purpose of the fine-tuned model I'm creating is to help generate SQL queries, so I passed in all the table names and their fields, with their respective definitions. Below is an example of what some of the training examples looked like.

{"messages": [{"role": "system", "content": "MESH is an assistant chatbot specialized in helping generate data queries in a data mesh environment. Understands the structure and purpose of different data from some tables, and responds with finding appropriate information. Understands budget management control. And finally, it helps in generating SQL queries."}, {"role": "user", "content": " What does the AK5 table do and save?"}, {"role": "assistant", "content": "By definition, AK5 has budgetary accounts. This table has all the budgetary accounts that are part of the Brewery's chart of accounts Imperial city"}]}
{"messages": [{"role": "system", "content": "MESH is an assistant chatbot specialized in helping generate data queries in a data mesh environment. Understands the structure and purpose of different data from some tables, and responds with finding appropriate information. Understands budget management control. And finally, it helps in generating SQL queries."}, {"role": "user", "content": " What is the function of the AK5_DESCRI field in the AK5 table?"}, {"role": "assistant", "content": "The AK5_DESCRI field in the AK5 table has the title: Description, it contains: Budget Account Description."}] }

So I repeated this for about 1000 lines. I feel like the model didn't learn, because even with a simple question or query it didn't respond as expected. I ran several tests, even with a larger n_epochs.
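
In case the file really is invalid, this is the kind of quick check I had in mind to run over the JSONL before uploading it (the file name here is just a placeholder):

import json

# Check that every line of the training file is valid JSON with the
# expected system/user/assistant message structure.
with open("training.jsonl", encoding="utf-8") as f:
    for i, line in enumerate(f, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            example = json.loads(line)
        except json.JSONDecodeError as e:
            print(f"Line {i}: invalid JSON ({e})")
            continue
        roles = [m.get("role") for m in example.get("messages", [])]
        if roles != ["system", "user", "assistant"]:
            print(f"Line {i}: unexpected message roles {roles}")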

Sorry for the English, I’m using a translation app.

I need help!
Thanks in advance.

  1. Use the same long system prompt in your software (a minimal example call is sketched after this list).
  2. Try user questions identical to the ones in the file, and then similar questions, to see how closely the style is emulated.
  3. Fine-tuning is not for knowledge retrieval. If you ask a question that is only similar to one of the table questions in your file, you are going to get back only the same general idea as an answer.
  4. OpenAI's documentation does not warn you away from applications that are unsuitable for fine-tuning.
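
For point 1, roughly like this with the openai Python library; the fine-tuned model id is a placeholder, and the system prompt is the (truncated) one from your training file:

from openai import OpenAI

client = OpenAI()

# The exact same long system prompt that appears in every training example.
SYSTEM_PROMPT = "MESH is an assistant chatbot specialized in helping generate data queries in a data mesh environment. ..."

response = client.chat.completions.create(
    model="ft:gpt-3.5-turbo-0125:my-org::abc123",  # placeholder fine-tuned model id
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "What does the AK5 table do and store?"},
    ],
)
print(response.choices[0].message.content)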

Do you have any tips on how to create a prompt so that the model knows and understands my SQL tables?

Thanks for the answer

The general advice for OpenAI (or Azure OpenAI) fine-tuning is to use it only for style and format, not for new knowledge. To fine-tune with new knowledge you may have better results fine-tuning other models on your own or cloud hardware. The key hyperparameter there is a high rank (see the sketch below for what I mean by rank), which you cannot control when training OpenAI models. That means it will not be easy to make those models learn new knowledge.
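
Here is a minimal sketch using Hugging Face peft with an open-weights model; the base model name and the rank value are only examples, not recommendations:

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Example open-weights base model; substitute whatever you actually train.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# A comparatively high LoRA rank (r) gives the adapter more capacity,
# which is what you would raise when trying to inject new knowledge.
config = LoraConfig(
    r=64,                 # the rank hyperparameter discussed above
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()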

As _j said, it is also important to use the exact same prompt when using the fine-tuned model. Alternatively, because the model should learn the style and structure of the output from the examples themselves, you could leave the long system prompt out of your training data, and then also run the model faster and cheaper at inference time without it.

In this case, because fine-tuning is not for learning knowledge, you may need to give the table structure as part of the system or user prompt. The upside of this approach is that your fine-tuned model should then work with all kinds of database schemas, or at least keep working after minor updates to your own schema. A rough sketch of what such a prompt could look like is below.
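
Something like this, where the schema text is just the AK5 example from this thread cut down for illustration, and the model id is a placeholder:

from openai import OpenAI

client = OpenAI()

# Table definitions taken from your data dictionary (shortened here).
SCHEMA = (
    "Table AK5: budgetary accounts that form the chart of accounts.\n"
    "  AK5_DESCRI: Description - Budget Account Description\n"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # or your fine-tuned model id
    messages=[
        {"role": "system", "content": "You generate SQL queries for the schema below.\n" + SCHEMA},
        {"role": "user", "content": "List all budget accounts with their descriptions."},
    ],
)
print(response.choices[0].message.content)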
