Fine-tuning strategy, full parameter tuning or LoRA?

You can read about PPO, however, OpenAI won’t comment on their technology officially.

1 Like