How to enforce a rule set with fine-tuning?

Hello, I have been playing with the API a little and was wondering how to fine-tune efficiently.

Basically, I want to fine-tune GPT-3.5 Turbo to communicate via JSON input/output and to follow some rules. Those rules theoretically fit in 2,000 tokens, which is already too much to specify every time.
The inputs and outputs also have some constraints (e.g. when returning a number, it must be between 0 and 100).

Now I went straight into fine-tuning, and a few questions came up:

  • It seems like we need to specify the system prompt (so, I guess, the “rules”) every time, which is not compatible with their length. Is there a way to overcome this?
  • My second thought was to simplify the rules down to 100–200 tokens and train with as many input/output pairs as I can so the model learns all the cases. This forces me to use a longer prompt than necessary.
  • My last idea was to simply provide inputs and outputs without constraints and enforce those constraints in code instead of relying on the AI, which also seems doable.
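For that last option, here is a minimal sketch of enforcing the 0–100 constraint in code rather than in the model (the field name `score` is just a placeholder for illustration):

```python
import json

def parse_and_validate(raw: str) -> dict:
    """Parse the model's JSON output and enforce numeric constraints ourselves."""
    data = json.loads(raw)  # raises an exception on malformed JSON
    score = data.get("score")  # "score" is a hypothetical field name
    if not isinstance(score, (int, float)) or not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score!r}")
    return data

# A well-formed reply passes; an out-of-range one raises ValueError.
print(parse_and_validate('{"score": 42}'))
```

This way the model only needs to learn the output shape, and the hard guarantees live in your own code.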

I’m wondering whether going this way is the best idea for this project, since one user session will be around 3,000 tokens.

The model can already understand JSON, so if you feed the input as-is and then specify a function (i.e. the new functions feature), the output will always be JSON matching the schema in your function definition.
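Roughly, that looks like the sketch below, assuming the mid-2023 `openai` Python SDK with `ChatCompletion.create`; the function name and schema here are made up for illustration:

```python
import json

# Hypothetical function definition: forcing the model to "call" this
# function constrains its output to the schema's JSON shape. Note the
# minimum/maximum keywords are valid JSON Schema but are hints, not
# hard guarantees, so validating in code is still wise.
report_score = {
    "name": "report_score",
    "description": "Return a score between 0 and 100.",
    "parameters": {
        "type": "object",
        "properties": {
            "score": {"type": "integer", "minimum": 0, "maximum": 100},
        },
        "required": ["score"],
    },
}

# The request would look roughly like this (needs an API key, so not run here):
# response = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": user_input_json}],
#     functions=[report_score],
#     function_call={"name": "report_score"},  # force this function
# )
# arguments = response.choices[0].message.function_call.arguments

# The arguments come back as a JSON string matching the schema, e.g.:
arguments = '{"score": 87}'
parsed = json.loads(arguments)
print(parsed["score"])
```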

Thank you for your answer!

In this scenario, would I not need fine-tuning at all (since it’s not compatible with function calling yet)?

In that case, what about the generic rules I need to apply?
How does tokenization work with function calls?

I suppose it depends on how detailed your rules are. If the majority of your 2,000 tokens is being spent on those rules, and not just on making the model output JSON, then you may be better off with a fine-tune. You’d need to weigh the cost of sending those tokens on every request against paying roughly 8x per token on the fine-tuned model; it all depends on which works out cheaper and performs best.
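To put rough numbers on that trade-off, here is a back-of-the-envelope comparison. The prices are the mid-2023 list prices per 1K input tokens and are assumptions that may be outdated; check the current pricing page before deciding:

```python
# Assumed per-1K-token input prices (mid-2023, may be outdated):
BASE = 0.0015   # gpt-3.5-turbo
TUNED = 0.012   # fine-tuned gpt-3.5-turbo (the "8x" mentioned above)

rules_tokens = 2000    # rule set resent with every request on the base model
payload_tokens = 1000  # rest of the prompt, the same either way

base_cost = (rules_tokens + payload_tokens) / 1000 * BASE
tuned_cost = payload_tokens / 1000 * TUNED  # rules baked in by fine-tuning

print(f"base:  ${base_cost:.4f} per request")
print(f"tuned: ${tuned_cost:.4f} per request")
```

With these assumed numbers the base model still comes out cheaper per request even when resending all 2,000 rule tokens, which is why the comparison is worth actually running for your own token counts, and ignoring training cost and output pricing on top.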
