Does it make sense to evaluate the performance of an LLM in a 'Pre-Training' Stage?

Hi there!
I’m at the ‘pre-training’ stage, where I’m training a language model (it’s very tiny and character-level).

I’m looking to get hands-on experience with the engineering side while evaluating model performance.

I looked around & found that the evaluation metrics depend on the task the model was trained on.

As per my understanding, the pre-training stage is unsupervised learning with no clear objective - rather, it’s just to get the model used to the language, words & structure.

Question

  • Does it make sense to evaluate the model’s performance in the pre-training stage?
  • Does it only make sense to evaluate a model’s performance once it has been fine-tuned for a specific task?

Would appreciate a response & anything else you’d like to add about my understanding, or a better path I can take to learn the engineering side of LLMs.

Cheers!

It depends on whether the model is aligned at the current stage of training; alignment typically reduces raw performance. If you look at the ‘Sparks of AGI’ paper from Microsoft, the GPT-4 model prior to full alignment performed significantly better than it did afterwards.

I think you somewhat answered your own question by writing it down.
Everything you do should have a clear objective; otherwise, how are you going to assess the progress made?
If you define the goal of the unsupervised learning step to be ‘getting used to the language, words & structure’, then you want to see that this actually happened.
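
For a character-level model, a common way to check this during pre-training is to track cross-entropy on a held-out split (often reported as bits-per-character or perplexity) and to sample some text from the model now and then. Here is a minimal sketch, assuming a PyTorch model whose forward pass returns logits over the character vocabulary and a hypothetical `val_loader` yielding (input, target) batches of character indices:

```python
# Minimal sketch: evaluate a character-level LM during pre-training by
# measuring cross-entropy on a held-out split (lower is better).
# Assumes `model(inputs)` returns logits of shape (batch, seq_len, vocab_size)
# and `val_loader` yields (inputs, targets) batches of character indices.
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def evaluate(model, val_loader, device="cpu"):
    model.eval()
    total_loss, total_chars = 0.0, 0
    for inputs, targets in val_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        logits = model(inputs)                      # (batch, seq_len, vocab)
        loss = F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),    # flatten to (batch*seq, vocab)
            targets.reshape(-1),
            reduction="sum",
        )
        total_loss += loss.item()
        total_chars += targets.numel()
    nats_per_char = total_loss / total_chars
    return {
        "val_loss": nats_per_char,                     # avg nats per character
        "bits_per_char": nats_per_char / math.log(2),  # common char-level metric
        "perplexity": math.exp(nats_per_char),
    }
```

If these numbers keep dropping on held-out text (and the generated samples start to resemble the training language), then the ‘get used to the language’ objective is actually being met.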
