Fine-tuning strategy: full parameter tuning or LoRA?

What fine-tuning strategy does the fine-tuning API use internally: full parameter tuning or LoRA? I'm currently running experiments with the fine-tuning API, and this matters a lot for my research. :pray:

You can read about PPO, but OpenAI won't comment on their training technology officially.
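For context on why the distinction matters: full parameter tuning updates every weight, while LoRA freezes the pretrained weight W and learns a low-rank update ΔW = (α/r)·BA with far fewer trainable parameters. This is a minimal NumPy sketch of the LoRA idea (dimensions and scaling are illustrative, not what any provider actually uses):

```python
import numpy as np

d, k, r = 768, 768, 8                   # output dim, input dim, LoRA rank
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))         # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01  # trainable low-rank factor
B = np.zeros((d, r))                    # zero-initialized, so the update starts at 0
alpha = 16                              # LoRA scaling hyperparameter

def lora_forward(x):
    # base output plus scaled low-rank update: x @ (W + (alpha/r) * B @ A).T
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.standard_normal((1, k))
# with B at zero, the LoRA model is identical to the frozen base model
assert np.allclose(lora_forward(x), x @ W.T)

full_params = W.size            # 589824 weights updated by full tuning
lora_params = A.size + B.size   # 12288 weights updated by LoRA
print(lora_params / full_params)
```

Here LoRA trains only about 2% as many parameters per layer, which is why knowing which strategy an API uses affects how you interpret fine-tuning results.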
