Why does a 1,115-character fine-tuning file cost 1,520 trained tokens?

I created a fine-tuning dataset to train GPT to remove useless and expired feature flags. I followed the instructions from the fine-tuning documentation and created a JSONL file containing the example below (which contains 1,115 characters):

{ "prompt": "Remove all feature flag with key ff-key-1 in following code: \r\r csharp \r var options=new FbOptionsBuilder().Offline(true).Build();var client=new FbClient(options);if(!client.Initialized)Console.WriteLine(\"FbClient failed to initialize. Exiting...\"); else {var user=FbUser.Builder(\"anonymous\").Build();var a=client.BoolVariation(\"ff-key-1\",user,defaultValue:false);if(a==true) Console.Write(\"ff-key-1: true\");if(a==false){var b=client.StringVariation(\"ff-key-2\",user,defaultValue:\"on\");if(b==\"on\"){Console.WriteLine(\"ff-key-2:on\");}}}", "completion": "csharp \r using FeatBit.Sdk.Server;using FeatBit.Sdk.Server.Model;using FeatBit.Sdk.Server.Options;var options = new FbOptionsBuilder().Offline(true).Build();var client = new FbClient(options);if (!client.Initialized) Console.WriteLine(\"FbClient failed to initialize. Exiting...\"); else { var user = FbUser.Builder(\"anonymous\").Build(); Console.Write(\"ff-key-1: true\"); var ff2 = client.StringVariation(\"ff-key-2\", user, defaultValue: \"on\"); if (ff2 == \"on\") { Console.WriteLine(\"ff-key-2:on\"); } \r " }

I tokenized the above code using the online OpenAI API Tokenizer tool. It told me that the above code would consume 388 tokens.

But after the fine-tuning completed with the code above, it said that this fine-tune cost 0.05 USD and used 1,520 trained tokens.

I want to know why a 1,115-character fine-tuning file costs 1,520 trained tokens.


Here’s the command I used to create the fine-tune:

openai api fine_tunes.create -t featbit-fine-tune-rm-ff-beta-001.jsonl -m davinci

Welcome to the forum.

It’s because the default is a few epochs, I believe. It should give you a price estimate before the fine-tune begins.

Hope this helps.


Thank you @PaulBellow

It gave me a price estimate before fine-tuning, but I just don’t understand why its estimated price is so different from the one described on OpenAI’s documentation and pricing page.

I think I understand what you mean. I asked GPT-4, and it gave me the answer below:

In this case, you used a training dataset with a JSONL file containing 1,115 characters. You converted these characters into 388 tokens using the OpenAI API Tokenizer. However, during the fine-tuning process, you consumed 1,520 training tokens. This discrepancy may be due to the following reasons:

  1. Input and output tokens: When calculating training tokens, you need to consider the token count for both the input (prompt) and output (completion). Assuming the input and output tokens are equal, you would be billed for 776 tokens (388 * 2).

  2. Additional tokens during training: During the fine-tuning process, the model may add some extra tokens between the input and output tokens, such as special separators or other tokens indicating the model’s state. These extra tokens are also included in the total training tokens count.

  3. Batch size and epoch count: If you used multiple epochs during the training process, the total training tokens count would increase with the number of epochs. Additionally, if you used batches for training, the token count in each batch would also affect the total training tokens count.

In summary, the 1,520 training tokens may include input, output tokens, extra tokens added by the model during training, and additional tokens due to the batch size and epoch count settings. To ensure that your fine-tuning costs align with your expectations, carefully review your training settings and make sure the epoch count, batch size, and other parameters are set appropriately.
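The epoch multiplier alone can account for the number: legacy fine-tunes defaulted to 4 epochs, so if the prompt + completion pair tokenizes to roughly 380 tokens (slightly fewer than the 388 the web tokenizer reports, since that count includes the surrounding JSON syntax), the math works out exactly. A minimal sketch, where the 380-token figure is an assumption inferred from the bill rather than a measured count:

```python
# Billed "trained tokens" = tokens per example * number of examples * epochs.
# Assumption: the single prompt+completion pair tokenizes to ~380 tokens
# (the online tokenizer's 388 also counts the JSON quotes/braces/keys).
tokens_per_example = 380
n_examples = 1
n_epochs = 4  # default epoch count for legacy davinci fine-tunes

trained_tokens = tokens_per_example * n_examples * n_epochs
print(trained_tokens)  # 1520 -- matches the billed figure
```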

If I set the epoch count to 1 (via the `--n_epochs 1` flag on `fine_tunes.create`), the cost is reduced. But I need to measure the quality of the model if I fine-tune with only 1 epoch.
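For reference, the billed price is consistent with the same arithmetic. Assuming davinci's published fine-tuning training rate of $0.03 per 1K tokens (the rate at the time this thread was written; check the current pricing page), 1,520 trained tokens comes to about $0.05, and a single epoch would cost roughly a quarter of that. A quick sanity check:

```python
# Sanity-check the bill: trained tokens * training rate per 1K tokens.
# Assumption: davinci fine-tune training was priced at $0.0300 / 1K tokens.
RATE_PER_1K = 0.03

cost_4_epochs = 1520 * RATE_PER_1K / 1000  # default 4 epochs
cost_1_epoch = 380 * RATE_PER_1K / 1000    # with --n_epochs 1

print(round(cost_4_epochs, 2))  # 0.05
print(round(cost_1_epoch, 3))   # 0.011
```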