What is the token limit when fine-tuning GPT-3, including all prompts and completions?

I am fine-tuning GPT-3 to generate financial reports. Each prompt and completion pair contains long paragraphs, so I can only fit 2 or 3 examples before fine-tuning fails with an error saying the maximum number of tokens allowed is 2049. Does that limit apply to each example, or to all examples combined?
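
For context, this is roughly how I'm measuring each pair. A minimal sketch, assuming Python and the tiktoken library with its r50k_base encoding (the GPT-2-style tokenizer that davinci-class GPT-3 models use); the prompt and completion strings are placeholders for my report paragraphs:

```python
import tiktoken

# r50k_base is the GPT-2-style encoding used by davinci-class GPT-3 models
enc = tiktoken.get_encoding("r50k_base")

prompt = "Summarise the Q3 financial results:\n\n###\n\n"    # placeholder prompt
completion = " Revenue grew 12% year over year. END"          # placeholder completion

# Count the tokens the pair would consume as a single training example
n_tokens = len(enc.encode(prompt)) + len(enc.encode(completion))
print(f"prompt + completion = {n_tokens} tokens")
```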

I had this question as well. It seems there’s a 2048-token limit on each prompt, but I’m not sure about the whole tuning set.

Since the 2048 tokens is a technical limitation, would there also be a technical limit on the size of the whole tuning set?

AFAIK each example is limited to the same size as a normal prompt + completion would be for the base model.

Batch size is capped at 256 examples per batch.

But I don’t think there’s a limit on the total amount of training data overall.
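
If it helps, here’s a rough way to sanity-check a prepared JSONL training file against that per-example budget. A sketch only, assuming the tiktoken library’s r50k_base encoding and a hypothetical file name:

```python
import json
import tiktoken

enc = tiktoken.get_encoding("r50k_base")
MAX_EXAMPLE_TOKENS = 2048  # per-example budget for prompt + completion

total = over = 0
with open("financial_reports.jsonl") as f:   # hypothetical training file
    for i, line in enumerate(f):
        example = json.loads(line)
        n = len(enc.encode(example["prompt"])) + len(enc.encode(example["completion"]))
        total += 1
        if n > MAX_EXAMPLE_TOKENS:
            over += 1
            print(f"example {i}: {n} tokens, exceeds the per-example limit")

print(f"checked {total} examples, {over} over the limit")
```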

Thanks for chiming in. So do I understand correctly that you can tune a model with multiple batches?

256 seems a bit low, because I’ve heard that closer to 500 examples is best.

Right, the limit is just on how many examples you can send up in one batch, not on how many batches (or on the total number of examples).
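
In other words, the batch size is just a hyperparameter on the fine-tune job. A minimal sketch, assuming the legacy /fine-tunes endpoint and the pre-1.0 openai Python client used for GPT-3 fine-tuning; the file name, API key, and batch size are placeholders:

```python
import openai  # pre-1.0 client, which exposed the legacy FineTune resource

openai.api_key = "sk-..."  # placeholder

# Upload the prepared JSONL training file (hypothetical file name).
training_file = openai.File.create(
    file=open("financial_reports.jsonl", "rb"),
    purpose="fine-tune",
)

# batch_size is examples *per training batch*; the cap discussed above applies
# to this hyperparameter, not to the total number of examples in the file.
job = openai.FineTune.create(
    training_file=training_file["id"],
    model="davinci",
    batch_size=256,
)
print(job["id"], job["status"])
```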

I would say that out of the box the maximum training file size allows well over 100k prompts… And then you can upload many of those files to create new versions of your fine-tune… So I don’t see any practical limit (except the budget).
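
If your data set really does grow into the hundreds of thousands of prompts, one option is to split it into several JSONL files and run the fine-tune in passes. A rough sketch in plain Python; the chunk size and file names are illustrative:

```python
from pathlib import Path

CHUNK_SIZE = 100_000                       # examples per training file (illustrative)
source = Path("all_reports.jsonl")         # hypothetical combined data set

# Read every JSONL example, keeping line endings so chunks stay valid JSONL
lines = source.read_text().splitlines(keepends=True)

for part, start in enumerate(range(0, len(lines), CHUNK_SIZE)):
    chunk = lines[start:start + CHUNK_SIZE]
    out = Path(f"reports_part_{part}.jsonl")
    out.write_text("".join(chunk))
    print(f"wrote {out} with {len(chunk)} examples")
```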