From Microsoft Research,
From Google Research,
Combining the method for memory-efficient reinforcement learning with the discovery that AI feedback is as effective as human feedback for reinforcement learning could pave the way, eventually, for OpenAI and others to provide reinforcement learning as an option while fine-tuning models.