The adaptability of fine-tuned LLMs in responding to prompts that contain new types of information not covered in the training data

vkutalbusiness · December 22, 2023, 12:32pm

Hello everyone,

I have been working on my bachelor’s thesis, which involves fine-tuning a GPT-3.5 Turbo model to do a specific task.

After fine-tuning the model, I plan to present it with prompts that include not only information from the training data but also new types of information not covered in the training data.

Specifically, I’m curious about the following:

Dependency on Training Data: How dependent are the model’s responses on the structure and content of the training data? Can the model effectively handle prompts that introduce new elements for example system information, even if it is not included in the training data?
Flexibility in Prompt Usage: If my training data didn’t include specific system information, can I still introduce it in my prompts when using the fine-tuned model?
Handling XY Cases and Providing Hints via Prompts: Is it possible to guide a fine-tuned language model during TASK YZ? If so, how can this be achieved by merely modifying the prompt?

Best,
Volkan

TonyAIChamp · December 22, 2023, 1:52pm

Hi Volkan

Fine-tuning just adds another layer of “neurons” to the model, without significantly changing already existing weights.

So the fine-tuned model will react to the new information just like a non-fine-tuned one. Unless you tune it to react on a nw information in a specific way.

vkutalbusiness · December 22, 2023, 4:33pm

Hello TonyAIChamp,

Thank you for your response. I understand your point about how a fine-tuned model reacts to new information similarly to a non-fine-tuned one, unless it’s specifically tuned to respond to new types of information in a certain way.

However, I need clarification on your statement: “Fine-tuning just adds another layer of ‘neurons’ to the model.” From my understanding, a neural network comprises layers and neurons, including input, hidden, and one output layer. Your answer suggests that we slightly tweak and change the weights. But when you mention “adding another layer,” does this really imply the actual addition of new layers including neurons?

Could you elaborate on what happens to the output layer during this process? I would greatly appreciate it if you could provide academic papers or sources that offer a deeper understanding of what occurs during the fine-tuning of ChatGPT.

A detailed, scholarly explanation in my thesis is essential. I am truly thankful for your response, as it guides me on areas where I need to delve deeper. I’ve already searched the forum and found this thread, but it doesn’t quite satisfy the academic depth I am seeking.

Best regards,
Volkan

TonyAIChamp · December 23, 2023, 1:39am

Hi Volkan

"Fine-tuning just adds another layer of ‘neurons’ to the model.” - this was just a metaphar. As I understand, what fine-tuning does, is it changes some of the weights of the model (but in most cases not very significantly).

But there are people on the forum who are WAY more technical in this question that me, so let’s wait for more comments

TonyAIChamp · December 23, 2023, 1:39am

Btw, this link may be helpful in your research: LLM Visualization

Topic		Replies	Views
What exactly and technically happens with fine-tuning? API	10	4085	January 3, 2024
Prompt Usage for Fine-Tuned Models Community gpt-35-turbo , fine-tuning	1	1029	January 4, 2024
Fine tuning - how exactly does it work? API	6	1735	December 23, 2023
What does fine-tuning do? API fine-tuning	5	1044	February 7, 2024
Finetuning for shortening prompts Documentation fine-tuning	10	2676	December 24, 2023

The adaptability of fine-tuned LLMs in responding to prompts that contain new types of information not covered in the training data

Related Topics