How to fine-tune on the Grade School Math contents?

ashoktcr · March 10, 2023, 9:33am

Hello,

I am a newbie with OpenAI, and I would like to learn fine-tuning.

While going through the fine-tuning documentation, I saw a good example of fine-tuning on math problems: Grade School Math. On GitHub I found the train.jsonl file, and I tried to fine-tune with this command:
openai api fine_tunes.create -t grade_school_math/data/train.jsonl -m ada

Then I got
Error: Expected file to have JSONL format with prompt/completion keys. Missing `prompt` key on line 1. (HTTP status code: 400)

When I look at the JSONL file, each record has a question and an answer. The answer contains the worked formulas, followed by the correct final answer:

{"question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?", "answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72"}

I would like to know how to fine-tune with this JSONL file.

The official fine-tuning documentation only talks about "prompt" and "completion".

Please help me

ruby_coder · March 10, 2023, 9:44am

Hi @ashoktcr

You should post the contents of the file above, using Markdown triple back ticks, as follows:

```
# contents of your training file here
```

Then, we can start to assist you.

linus · March 10, 2023, 9:44am

Hi ashoktcr,

In my opinion, it is because you are using "answer" instead of "completion" in your training data. If you change this in your training data, it should work.

Please refer to the example provided on Fine-tuning - OpenAI API:

{"prompt":"<Product Name>\n<Wikipedia description>\n\n###\n\n", "completion":" <engaging ad> END"}

The term "completion" is used there as well.
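
To see which keys are missing before uploading, a quick local check can help. This is only a minimal sketch: `find_missing_keys` and the two sample records are made up for illustration, and the only thing taken from the error message is that each line needs `prompt` and `completion` keys.

```python
import json

# Keys named in the API error message ("prompt/completion keys").
REQUIRED_KEYS = ("prompt", "completion")


def find_missing_keys(lines):
    """Return (line_number, missing_keys) for every record lacking a required key."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = json.loads(line)
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            problems.append((number, missing))
    return problems


# Hypothetical sample: the first record uses the GitHub file's key names.
sample = [
    '{"question": "2 + 2?", "answer": "#### 4"}',
    '{"prompt": "2 + 2?", "completion": " 4"}',
]
print(find_missing_keys(sample))  # [(1, ['prompt', 'completion'])]
```

For a real file you could pass the open file object instead of `sample`, since it also yields one line at a time.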

Best regards,
Linus

ashoktcr · March 10, 2023, 9:46am

The first few lines of the file are as follows:

{"question": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?", "answer": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72"}
{"question": "Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?", "answer": "Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute.\nWorking 50 minutes, she earned 0.2 x 50 = $<<0.2*50=10>>10.\n#### 10"}
{"question": "Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?", "answer": "In the beginning, Betty has only 100 / 2 = $<<100/2=50>>50.\nBetty's grandparents gave her 15 * 2 = $<<15*2=30>>30.\nThis means, Betty needs 100 - 50 - 30 - 15 = $<<100-50-30-15=5>>5 more.\n#### 5"}
{"question": "Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?", "answer": "Maila read 12 x 2 = <<12*2=24>>24 pages today.\nSo she was able to read a total of 12 + 24 = <<12+24=36>>36 pages since yesterday.\nThere are 120 - 36 = <<120-36=84>>84 pages left to be read.\nSince she wants to read half of the remaining pages tomorrow, then she should read 84/2 = <<84/2=42>>42 pages.\n#### 42"}

ashoktcr · March 10, 2023, 9:48am

Yes, I have seen the example that uses "prompt" and "completion".
But the file on GitHub uses "question" and "answer".

ruby_coder · March 10, 2023, 9:48am

ashoktcr:

{"question": "Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?", "answer": "Maila read 12 x 2 = <<12*2=24>>24 pages today.\nSo she was able to read a total of 12 + 24 = <<12+24=36>>36 pages since yesterday.\nThere are 120 - 36 = <<120-36=84>>84 pages left to be read.\nSince she wants to read half of the remaining pages tomorrow, then she should read 84/2 = <<84/2=42>>42 pages.\n#### 42"}

Hi @ashoktcr

As @linus mentioned, your training file is not formatted correctly.

HTH

linus · March 10, 2023, 9:49am

Can you try the following code:

{"prompt": "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?", "completion": "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\nNatalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n#### 72"}
{"prompt": "Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?", "completion": "Weng earns 12/60 = $<<12/60=0.2>>0.2 per minute.\nWorking 50 minutes, she earned 0.2 x 50 = $<<0.2*50=10>>10.\n#### 10"}
{"prompt": "Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?", "completion": "In the beginning, Betty has only 100 / 2 = $<<100/2=50>>50.\nBetty's grandparents gave her 15 * 2 = $<<15*2=30>>30.\nThis means, Betty needs 100 - 50 - 30 - 15 = $<<100-50-30-15=5>>5 more.\n#### 5"}
{"prompt": "Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?", "completion": "Maila read 12 x 2 = <<12*2=24>>24 pages today.\nSo she was able to read a total of 12 + 24 = <<12+24=36>>36 pages since yesterday.\nThere are 120 - 36 = <<120-36=84>>84 pages left to be read.\nSince she wants to read half of the remaining pages tomorrow, then she should read 84/2 = <<84/2=42>>42 pages.\n#### 42"}

Edit: Changed the first question → Prompt
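
Renaming the keys by hand does not scale to the whole file, so the same change could be scripted. The following is a rough sketch, not an official tool: `convert_record`, `convert_file`, and the output name `train_prepared.jsonl` are hypothetical, and the `\n\n###\n\n` separator plus the leading space in the completion are assumptions borrowed from the documentation example quoted earlier in this thread.

```python
import json

# Assumption: reuse the separator from the documentation example above.
SEPARATOR = "\n\n###\n\n"


def convert_record(record, separator=SEPARATOR):
    """Map a {"question", "answer"} record to {"prompt", "completion"}."""
    return {
        "prompt": record["question"] + separator,
        # Assumption: completion starts with a space, as in the docs example.
        "completion": " " + record["answer"],
    }


def convert_file(src_path, dst_path):
    """Rewrite a question/answer JSONL file as a prompt/completion JSONL file."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():  # skip blank lines
                converted = convert_record(json.loads(line))
                dst.write(json.dumps(converted) + "\n")


# Example call (hypothetical output path):
# convert_file("grade_school_math/data/train.jsonl", "train_prepared.jsonl")
```

After that, the command from the first post should accept the converted file, e.g. `openai api fine_tunes.create -t train_prepared.jsonl -m ada`.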
