My organization has taken a 6-month subscription for gpt-3.5-turbo-0613, with rate limits of 3,500 requests per minute (RPM) and 90,000 tokens per minute (TPM).
We use fine-tuned models for our enterprise work. We currently have 2 fine-tuned models that perform different tasks, and API calls are made to each for its own use case.
I would like to know whether we would be charged less if we used only 1 fine-tuned model (which would mean somehow training GPT to handle both use cases in a single fine-tune), or whether charges are independent of how many fine-tuned models we use.
I know that there are extra charges for training and fine-tuning, but are there extra charges for API calls made to these models?
Use of a fine-tuned model is charged by input and output tokens. Obtaining the same output from the same input costs the same whether you have 1 model or 10 models in your account.
In fact, combining the two models into one (for example, with two completely different system identities) could mean that you have to prompt more to reach the specific identity you want, and the weights serving one application could be confused by the training for the other.
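To make the usage side concrete, here is a rough sketch of the arithmetic. The rates, token volumes, and the extra-prompt figure are all made-up placeholders, not real prices, so check the current pricing page for actual numbers. The only thing it assumes is that usage is billed purely on input and output token counts.

```python
# Hypothetical per-1K-token rates, for illustration only (not real prices).
INPUT_RATE_PER_1K = 0.003
OUTPUT_RATE_PER_1K = 0.006


def usage_cost(input_tokens, output_tokens):
    """Cost of API calls when billing depends only on token counts."""
    return (input_tokens / 1000) * INPUT_RATE_PER_1K + (
        output_tokens / 1000
    ) * OUTPUT_RATE_PER_1K


# Hypothetical monthly traffic per use case: (input tokens, output tokens).
use_case_a = (2_000_000, 500_000)
use_case_b = (1_000_000, 250_000)

# Two separate fine-tuned models: each one bills its own traffic.
separate = usage_cost(*use_case_a) + usage_cost(*use_case_b)

# One combined model serving the same traffic: the token totals are identical.
total_in = use_case_a[0] + use_case_b[0]
total_out = use_case_a[1] + use_case_b[1]
combined = usage_cost(total_in, total_out)

# If the combined model needs extra prompt text to select the right identity,
# the input total grows (hypothetical: 50 extra tokens on 10,000 calls).
extra_prompt_tokens = 50 * 10_000
combined_longer_prompts = usage_cost(total_in + extra_prompt_tokens, total_out)

print(f"separate models:           {separate:.2f}")
print(f"combined model:            {combined:.2f}")
print(f"combined + longer prompts: {combined_longer_prompts:.2f}")
```

With the same traffic the first two figures match; the third is higher only because of the added prompt tokens, not because of how many models exist.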
Then there is the training itself. If you have 200 examples for each job, it costs the same whether you feed 200 into each of two separate models or all 400 into a single combined fine-tune.
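The same kind of back-of-the-envelope check works for training. This sketch assumes training is billed by the number of tokens trained (tokens in the file times epochs) and that both setups use the same epoch count; the rate, average example length, and epoch count are hypothetical placeholders.

```python
# All three constants are hypothetical placeholders, not real prices or defaults.
TRAINING_RATE_PER_1K = 0.008  # cost per 1K trained tokens
TOKENS_PER_EXAMPLE = 300      # average tokens per training example
EPOCHS = 3                    # passes over the training file


def training_cost(num_examples):
    """Training cost, assuming billing by trained tokens (file tokens x epochs)."""
    trained_tokens = num_examples * TOKENS_PER_EXAMPLE * EPOCHS
    return (trained_tokens / 1000) * TRAINING_RATE_PER_1K


# 200 examples into each of two separate fine-tunes.
two_models = training_cost(200) + training_cost(200)

# All 400 examples into one combined fine-tune.
one_model = training_cost(400)

print(f"two separate fine-tunes: {two_models:.2f}")
print(f"one combined fine-tune:  {one_model:.2f}")
```

Under these assumptions the two totals are equal, because the same number of tokens is trained either way.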
So unless the similarity is desired (for example, a chatbot that handles both sales questions and customer support, where either kind of user input would be answered better by a single model), there is a good case for keeping the models separate, at no higher cost.