As I understand it, fine-tuning uses a completely fresh/untrained model rather than extending an existing trained one. Is that correct?
I have some input/output that is sometimes incorrect, but not woefully so. It's more that the API isn't adhering to my prompt engineering in some cases (which ultimately does break my application). I was intending to use fine-tuning for this purpose, but I thought it would "integrate" with the existing, already-trained language model (in this case, davinci) to help improve responses. I simply don't have thousands of data points to train on.
GPT-4 seems to be much better at this, but it might be quite some time before I have API access to it.
Are my assumptions correct? And can anyone give me any advice?
Fine-tuning, as the name suggests, fine-tunes an existing pre-trained model to improve completions with minimal prompting. Currently only the base models ada, babbage, curie, and davinci support fine-tuning.
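To make that concrete, here is a minimal sketch of what legacy fine-tuning looks like with the pre-1.0 openai-python SDK: you prepare a JSONL file of prompt/completion pairs, upload it, and start a fine-tune job on one of those base models. The file name, example pairs, and stop token are placeholders I've made up for illustration, not anything from this thread.

```python
# Minimal sketch of legacy OpenAI fine-tuning (openai-python < 1.0 era).
# Assumes OPENAI_API_KEY is set in the environment.
import json
import openai

# 1. Training data: JSONL of prompt/completion pairs. Completions
#    conventionally start with a space and end with a fixed stop
#    sequence so the model learns where to stop.
examples = [
    {"prompt": "Classify sentiment: I love this! ->",
     "completion": " positive END"},
    {"prompt": "Classify sentiment: This is awful. ->",
     "completion": " negative END"},
]
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# 2. Upload the file, then start a fine-tune on a base model.
upload = openai.File.create(file=open("train.jsonl", "rb"),
                            purpose="fine-tune")
job = openai.FineTune.create(training_file=upload.id, model="davinci")
print(job.id)  # poll this job until it finishes
```

Note that the base model's weights aren't replaced; the job produces a new model ID that you then pass as `model` at inference time.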
Thanks for the response. I'm confused, as the little fine-tuning I did do seems to produce worse results than without the fine-tuning (and at a higher token cost). I assumed this was because I was starting off with a "vanilla" base and then working from that. It may be that I simply don't have enough data points to fine-tune on. Still, I would think that it couldn't get worse from my own additions.
The issue with fine-tuning without a lot of data points is that the effects may not show: compared to the original scale of the model, your fine-tuning data might be minuscule. OpenAI's research suggests that performance scales with every doubling of the number of training examples, so a lack of data would really affect performance, especially if you're starting from base davinci. You might be better off prompt-engineering your task and using the few data points you have as few-shot examples in the prompts to the LLM.
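For example, a few-shot prompt with the legacy Completions API might look like the sketch below. The task, labels, and stop sequence are placeholders; the point is that your handful of data points ride along in every prompt instead of being baked into the model's weights.

```python
# Minimal few-shot prompting sketch using the legacy Completions API
# (openai-python < 1.0). Task, examples, and labels are placeholders.
import openai

few_shot = """Classify the sentiment of each review as positive or negative.

Review: I love this!
Sentiment: positive

Review: This is awful.
Sentiment: negative

Review: {review}
Sentiment:"""

resp = openai.Completion.create(
    model="text-davinci-003",
    prompt=few_shot.format(review="Works exactly as advertised."),
    max_tokens=5,
    temperature=0,  # deterministic output for a classification task
    stop="\n",      # stop after the single-word label
)
print(resp.choices[0].text.strip())
```

With only a few examples, this usually gives you more control than a tiny fine-tune, and you can iterate on the examples instantly rather than retraining.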