What does a learning rate of 2, 5, or 10 mean (as opposed to 2e-5 or 1e-4) in fine-tuning?

Hello!

I am EXTREMELY new to LLMs, so this might be a very stupid question, and if that’s the case, I apologize in advance.

I have been reading some articles and papers that mention learning rates like 2e-5 or 1e-4.

Why does the learning rate hyperparameter for fine-tuning GPT-3.5 look so different, with values like 2, 5, or 10? Am I missing something?

Thank you so much in advance!


Hi @shuklaeshita0209 and welcome to the community!

The most likely reason is that it's a scaling parameter (a multiplier applied to some base learning rate chosen internally), rather than the raw LR that is passed to the optimizer during training.
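To make that concrete, here is a minimal sketch of what a multiplier could mean. The base value below is purely an assumption for illustration; the actual base learning rate the service uses internally isn't something I know.

```python
# Hypothetical illustration: a learning-rate "multiplier" is scaled against a
# base LR picked internally by the training service. The base value here is an
# assumption, not the real one.
base_lr = 1e-5                    # assumed internal base learning rate
learning_rate_multiplier = 2      # the "2" you see in the fine-tuning settings

effective_lr = base_lr * learning_rate_multiplier
print(effective_lr)               # 2e-05 -- back in the familiar 2e-5 range
```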


It’s not a stupid question at all.

LLMs have a huge number of parameters, and as @platypus mentioned above, the GPT-3.5 value is a multiplier rather than the LR itself. The reasons the direct learning rates are so small are:

LLMs are huge: with billions of parameters, even tiny changes can have a big effect, so a small learning rate keeps training stable.

Preserving pre-trained knowledge: fine-tuning tweaks the model for a new task without overwriting what it already knows, and small steps make that possible.
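For contrast, when you run the training loop yourself (for example with PyTorch), the tiny value from the papers is exactly the number you hand to the optimizer, with no multiplier involved. A minimal sketch, using a toy linear layer as a stand-in for a real model:

```python
import torch
from torch import nn

# Toy stand-in for a fine-tuned model; a real LLM has billions of parameters.
model = nn.Linear(768, 768)

# In a hand-rolled setup, the small absolute learning rate (2e-5, 1e-4, ...)
# is passed straight to the optimizer.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
print(optimizer.param_groups[0]["lr"])  # 2e-05
```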


Hello!

Okay, that makes sense. Thank you for taking the time to answer!


Okay, I will try a few different LRs and figure out what works best for my dataset (I will try smaller values). Thank you so much!
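For reference, something like this toy sweep is what I have in mind. It uses purely synthetic data and a tiny linear model, just to compare how a few candidate learning rates behave before touching the real setup:

```python
import torch
from torch import nn

# Toy learning-rate sweep on synthetic data; the candidate values are only examples.
torch.manual_seed(0)
x = torch.randn(256, 16)
y = x @ torch.randn(16, 1)

for lr in (1e-5, 1e-4, 1e-3):
    model = nn.Linear(16, 1)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(200):
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
    print(f"lr={lr:.0e}  final training loss={loss.item():.4f}")
```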