Is it possible to fine-tune an OpenAI model with Reinforcement Learning from Human Feedback? If so, how can I do that?
Definitely something I would like to know about as well. From what I can see, the only option currently is to re-train (fine-tune) the model on a dataset that includes the added prompt-completion pairs.
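For what it's worth, here is a rough sketch of that workaround: keep only the completions a human approved and write them out as prompt-completion pairs in JSONL for another fine-tuning run. This is plain supervised fine-tuning on filtered data, not actual RLHF. The `feedback` records, the `rating` field, and both helper functions are made up for illustration; only the `prompt`/`completion` JSONL layout comes from the prompt-completion fine-tuning format.

```python
import json

# Hypothetical feedback log: each record pairs a prompt and a model completion
# with a human rating (1 = approved, 0 = rejected). Field names are assumptions.
feedback = [
    {"prompt": "Translate 'hello' to French ->", "completion": " bonjour", "rating": 1},
    {"prompt": "Translate 'cat' to French ->", "completion": " chien", "rating": 0},
]


def build_training_lines(records, min_rating=1):
    """Return one JSON string per record rated at or above min_rating."""
    return [
        json.dumps({"prompt": r["prompt"], "completion": r["completion"]})
        for r in records
        if r["rating"] >= min_rating
    ]


def write_jsonl(lines, path):
    """Write the JSON strings to a file, one per line (JSONL)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")


# Only the human-approved completion survives the filter.
lines = build_training_lines(feedback)
```

Calling `write_jsonl(lines, "train.jsonl")` would then give a file to upload for the next fine-tuning job; the file name is arbitrary.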