Token Count for Fine-tuning

Hi,

I submitted a fine-tuning job today with davinci as the base model. My token count estimate was 1,431,706. But after the job finished, the account's daily usage breakdown lists 5,726,824 trained tokens, which is 4x my estimate.

The results file at the end of the job shows 11,806,768 as the elapsed_tokens (with repeats) in the last row. With the default 4 epochs, that would be 2,951,692 tokens per epoch, which is 2x my estimate (so it doesn't match) and 0.5x of what is shown in the account (so that doesn't match either).
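Here is a quick sanity check of the numbers, just the arithmetic from above in one place:

```python
# Plain arithmetic on the figures above; no API calls.
estimate = 1_431_706   # my own token count estimate for the training file
trained = 5_726_824    # "trained tokens" in the daily usage breakdown
elapsed = 11_806_768   # elapsed_tokens (with repeats) from the results file
epochs = 4             # default n_epochs

print(trained / estimate)            # 4.0 -> account figure is exactly estimate * epochs
print(elapsed / epochs)              # 2,951,692 tokens per epoch
print(elapsed / epochs / estimate)   # ~2.06 -> roughly 2x my estimate
```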

Can anyone offer more clarity on this? It has been 5 hours since I sent a message to their help desk with no response, so I thought I would try the community forum too. The charge estimate is also showing up as 4x my estimate. Does that mean the rate needs to be multiplied by the number of epochs? No information or page I saw suggests that cost needs to be multiplied by epochs.

(Unrelated: someone from OpenAI should really hook GPT-4 up to this forum to make searching for answers painless, in case the answer already exists here.)

Thanks!

That might be because the default n_epochs value for fine-tuning a model is 4.

This means that when you submit fine-tuning data, OpenAI processes that data four times (based on the default n_epochs value). So if your training data is, for example, 1,000 tokens, you will be charged for 4,000 tokens with n_epochs at its default of 4.
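As a rough sketch of the billing math (the per-1K training rate below is only illustrative, so substitute the current price for your base model from the pricing page):

```python
def estimate_finetune_cost(training_tokens: int, n_epochs: int = 4,
                           price_per_1k: float = 0.03) -> float:
    """Rough cost estimate: billed tokens = training tokens * n_epochs.

    price_per_1k is illustrative; check the pricing page for the actual
    training rate of your base model.
    """
    billed_tokens = training_tokens * n_epochs
    return billed_tokens / 1000 * price_per_1k

# Example: a 1,000-token training file with the default 4 epochs
print(estimate_finetune_cost(1_000))   # 4,000 billed tokens -> 0.12
```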

HTH

:slight_smile:


What if we set the n_epochs value to 2? Will it compromise model quality?

If you have “lots” of prompt/completion pairs that are all similar but slightly different, then you could probably get away with 2 epochs.

However, this is a hyperparameter, so it’s not an exact science, but the AI does need to iterate over the data multiple times to converge and “learn” what you are showing it in your training file.

You want to avoid the combination of little data and lots of epochs, because it can lead to over-fitting … i.e., the AI will repeat itself. Sometimes that is actually what you want, but normally it's bad, so just beware.
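If you do decide to drop to 2 epochs, here's a sketch of how n_epochs could be passed explicitly, assuming the legacy /v1/fine-tunes endpoint and the pre-1.0 openai Python library this thread is about (file name and key are placeholders):

```python
import openai  # pre-1.0 SDK; the legacy fine-tunes endpoint accepts n_epochs

openai.api_key = "sk-..."  # placeholder; use your own API key

# Upload the prepared JSONL training file first
training_file = openai.File.create(
    file=open("train.jsonl", "rb"),  # placeholder path to your training data
    purpose="fine-tune",
)

# Create the fine-tune with n_epochs overridden to 2 instead of the default 4
job = openai.FineTune.create(
    training_file=training_file["id"],
    model="davinci",
    n_epochs=2,
)
print(job["id"])
```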
