Fine-tuning GPT-4 (max tokens, system prompt, guide)

Hi,

I would like to fine-tune GPT-4, but almost all of the documentation and guides are about GPT-3.5. I have some questions:

  1. Where can I find the max_tokens per example for GPT-4?

  2. Other than the max_tokens_per_example mentioned in Data preparation and analysis for chat model fine-tuning | OpenAI Cookbook, can that notebook be used as-is for GPT-4 fine-tuning?

  3. If I fine-tune a model and later want to add more examples, this forum suggests the model will give more weight to the newer examples. Is that the case with GPT-4 too, or does the model weight examples from both training sessions equally?

  4. If I want structured output, will it suffice to simply have enough examples where the output is always structured, or do I need to add it to the system message? The problem is that the prompt would be huge and repeated across all the examples. The documentation is unclear on this.
    It states:
    “If you would like to shorten the instructions or prompts that are repeated in every example to save costs, keep in mind that the model will likely behave as if those instructions were included, and it may be hard to get the model to ignore those “baked-in” instructions at inference time.” I don’t fully understand what this means. Are they implying that whatever system prompt we use for fine-tuning is exactly what we need to give later?

Thanks!
Anika.

Have you requested and been granted access to fine-tune GPT-4, though?

It seems some supplementary information, such as pricing, comes in the communications that follow that authorization, which has been granted to very few.

A training example is a chat placed in the model's context; the model is trained on what the AI would generate after seeing everything placed before it. So if the typical conversation you want to train on can run up to 8k tokens for gpt-4 or up to 125k for gpt-4-turbo, I expect the same limits would apply in fine-tuning.
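As a rough sketch of checking that a training example fits within an assumed per-example context limit (the example content and the 4-characters-per-token heuristic are mine; for exact counts you would use OpenAI's tiktoken library with the cl100k_base encoding):

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    # For exact counts, use tiktoken with the cl100k_base encoding instead.
    return max(1, len(text) // 4)

def estimate_example_tokens(example: dict) -> int:
    """Rough size of one chat-format training example, in tokens."""
    return sum(estimate_tokens(m["content"]) for m in example["messages"])

# Hypothetical training example in the chat fine-tuning format.
example = {
    "messages": [
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "Summarize: the quick brown fox."},
        {"role": "assistant", "content": "A quick brown fox."},
    ]
}

CONTEXT_LIMIT = 8192  # assumed gpt-4 limit; much larger for gpt-4-turbo
print(estimate_example_tokens(example) <= CONTEXT_LIMIT)
```

Any example that exceeds the limit would need to be shortened or dropped before uploading the training file.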

The output limit of the new gpt-4-turbo models is 4k tokens (the actual meaning of max_tokens), so training the assistant to produce more than that would be mostly futile.

Here is the available documentation about token limits per example. It overlooks both base models and gpt-4.


A system message should definitely be used, but it can be shorter: a new identity and purpose that distinguishes your fine-tuned behavior from the pretraining that chat AI models come with.
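For illustration (the system message wording, conversation content, and file name are all invented), a short, distinctive system message repeated across training examples might be written out as JSONL like this:

```python
import json

# Hypothetical short system message that marks out the fine-tuned behavior.
SYSTEM = "You are OrderBot. Reply only with a JSON object."

examples = [
    {
        "messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": "Two lattes for pickup at noon."},
            {
                "role": "assistant",
                "content": '{"items": ["latte", "latte"], "pickup": "12:00"}',
            },
        ]
    },
]

# The chat fine-tuning format expects one JSON-encoded example per line.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

Because the same short system message appears in every example, you would also send it at inference time to trigger the trained behavior.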

The “baked-in” passage is referring to the AI learning new behaviors through fine-tuning that it can't easily be made to deviate from, if you don't also train on when those behaviors should and should not be employed.
