Training data not getting desired fine tune output

I am fine tuning the curie model to produce mysql queries from natural language questions. i created a training data set of roughly 100 examples. they have the the following format:
{“prompt”: “write me a mysql query to to answer the following question: ?\n”, “completion”: " ;\n"}
I feedthe db schema and table info in the prompt parameter when i call my model.

Im not getting the desired results in some cases. Am i going about training incorrectly? Is there a better strategy to get the results I want?

That is far, far, far too few training samples. You’ll need at least 500–1000 training examples to begin to get good results out of fine-tuning.

got it. how necessary is fine tuning for my use case? can i just train my model on the database schema?

Do you reccomend some other way to go about this?

Honestly, using a fine-tuned curie model is about 8 times as expensive as using gpt-3.5-turbo, so I’d probably just use that with a good system prompt and a few-shot learning example if needed.


The model is pretty good at SQL, because SQL was well documented on the internet before 2021. I think providing the schema of the appropriate tables, and the intended result of the query, might get gpt-4 (and maybe even gpt-3.5) to largely generate good queries.

Note that you won’t always get correct queries. The model may either misunderstand the intent, or it may generate subtly (or not so subtly) broken SQL. I highly recommend running some sanity check on the returned code, and if it doesn’t quite work right, re-prompt the model with a continuation, something like “here’s my schema, here’s what I want to do, here’s a broken query: (SQL); please fix the query.”

Beware that, the bigger your schema becomes, the more likely the model is to get it wrong, because it can only “pay attention” to so many details at once.

Or far more epochs, or running that tuning data again to double the weight.

Do you really think more training on 100 exemplars is going to help? My suspicion is it would quickly overfit (if it isn’t already).

I was also thinking though that the other thing the OP could do would be to give it access to an actual MySQL instance and build, essentially, a rudimentary “SQL Interpreter.”

Given the schema and prompt,

  1. Create the DB
  2. Fill it with dummy data
  3. Create the query from the prompt
  4. Run the query
  5. Interpret the results
  6. If the results are incorrect update the query and GOTO 4, else continue
  7. Respond to the user with the query