What are the limits of fine tuning?

prescod · March 25, 2023, 4:50pm

As an example: if I uploaded a billion prompt/completion pairs of jpeg data and descriptions to OpenAI’s API, could I teach GPT-3.5 to describe images?

Could I teach GPT-3.5 to speak a previously unknown human language?

Or is the fine-tuning mechanism intrinsically limited? How should we think about the limits of fine tuning?
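
To make the image question concrete, here is a rough sketch of what one training record for that experiment might look like. The legacy fine-tuning endpoint took JSONL lines with `prompt` and `completion` keys. Everything else here is an assumption for illustration: the base64 encoding, the `###` separator, and the helper name `make_record` are hypothetical, not a documented way to feed image data to the API.

```python
import base64
import json

SEPARATOR = "\n\n###\n\n"  # assumed marker for where the prompt ends

def make_record(image_bytes: bytes, description: str) -> str:
    """Return one JSONL line pairing encoded image bytes with a description.

    Hypothetical helper: base64 is just one way to turn binary data into text.
    """
    prompt = base64.b64encode(image_bytes).decode("ascii") + SEPARATOR
    completion = " " + description + "\n"
    return json.dumps({"prompt": prompt, "completion": completion})

# A few fake bytes standing in for a real JPEG file (JPEG files start with FF D8).
fake_jpeg = bytes([0xFF, 0xD8, 0xFF, 0xE0]) + bytes(range(16))
line = make_record(fake_jpeg, "A red bicycle leaning against a wall.")
print(line)
```

Even a small real JPEG runs to tens of kilobytes, so a prompt built this way would likely be far longer than the 20 bytes used here, which is a practical limit separate from whether the model could learn anything from it.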

wfhbrian · March 25, 2023, 5:16pm

Interesting question. Since GPT-4 is multimodal, I wonder if this could be how they did it.

curt.kennedy · March 25, 2023, 6:17pm

@prescod

First off, you can only fine-tune the base GPT-3 models: the original Ada, Babbage, Curie, and Davinci. You cannot fine-tune text-davinci-003 or gpt-3.5-turbo (at least as of this writing; they are adding features all the time, so this may change).

Second, the model you are using is built on language, and it understands language. So my concern is that feeding in tokens derived from an image would look like noise to the language model. And your fine-tune is only affecting the decoder, not its language understanding. Since an image has no linguistic coherence, the signal you are trying to train is likely to be drowned out by the model's built-in assumptions about language.

However, and this is a big however, the same technology is used in the image and audio domains. So there is “hope”, right? I’m afraid the decoder training won’t be enough, and the language model itself would have to be retrained on the new input media, but hey, you can give it a shot and let us know how it goes! I wouldn’t hold my breath, but if you are successful, I would be very interested!
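
A rough way to see why compressed image data "looks like noise": compare the byte-level entropy of ordinary text with that of compressed data. This is only a sketch under stated assumptions. `zlib` is used as a stand-in for JPEG's entropy-coded payload (no real image is involved), the word list is made up, and byte entropy is not the same thing as what a tokenizer or a language model actually sees.

```python
import math
import random
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (max 8.0)."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Made-up vocabulary; the fixed seed keeps the run repeatable.
random.seed(0)
words = ["the", "model", "image", "language", "token", "fine", "tune",
         "noise", "signal", "data", "learns", "from", "a", "of", "and",
         "pixels", "words", "training", "decoder", "pattern"]
text = " ".join(random.choice(words) for _ in range(20000)).encode("utf-8")

# Assumption: zlib output approximates the statistics of compressed image data.
compressed = zlib.compress(text, 9)

print(f"text:       {byte_entropy(text):.2f} bits/byte")
print(f"compressed: {byte_entropy(compressed):.2f} bits/byte")
```

The text has far fewer bits per byte than the compressed stream, which sits close to the 8-bit maximum. That means the compressed bytes carry almost no surface regularity of the kind a language model was pretrained to exploit.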

prescod · March 26, 2023, 1:20am

Thanks for the detailed answer.

I’m definitely not going to spend thousands of dollars running the experiment on images.

I was just hoping that there was some way to predict from first principles what can be achieved in fine-tuning. I don’t want to empty my bank account on experiments.

Can you share a reference which would help me understand this sentence? “And your fine-tune is only affecting the decoder, not its language understanding.”

curt.kennedy · March 26, 2023, 1:54am

It’s really based on speculation on my part, but I am not the only one who thinks this. Others say a fine-tune affects ALL parameters in GPT-3, but I have a hard time believing my fine-tune file creates a new file with 175 billion parameters (if I fine-tune Davinci). More on these conjectures HERE!

So, conjecture aside, I do give you a chance at being successful, only because, I think, there is a small chance I am wrong.
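
For a sense of scale behind that skepticism, here is the back-of-the-envelope arithmetic for storing a full copy of a 175-billion-parameter model per fine-tune. The storage precisions are assumptions, and this says nothing about which parameters OpenAI actually updates or how it stores fine-tuned models.

```python
PARAMS = 175_000_000_000  # parameter count quoted above for Davinci

# Assumed storage precisions, in bytes per parameter.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}

def checkpoint_size_gb(params: int, bytes_per_param: int) -> float:
    """Size of one full set of weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

for name, width in BYTES_PER_PARAM.items():
    size = checkpoint_size_gb(PARAMS, width)
    print(f"{name}: {size:,.0f} GB per full copy")
```

Under these assumptions, a full copy comes to 700 GB at 4 bytes per parameter or 350 GB at 2 bytes. That is large, but it is a storage cost and not evidence either way about what fine-tuning changes.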

