Is documentation for the upcoming reinforcement fine-tuning available?

Reinforcement fine-tuning is currently only available to selected alpha testers and will be made available in early 2025. It’s a complex, compute-intensive, and potentially dangerous capability; I understand that.

But is there an argument against making documentation available now on how it will work once it’s released? I am talking about things like how to define graders and which evaluators are available. This would let developers prepare their datasets in advance.
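Even a rough, hypothetical sketch of the expected training-file format would be enough to start collecting data. Something along these lines is what I have in mind (the field names below are pure guesses for illustration, since nothing about the RFT data format has been published):

```python
import json

# Purely hypothetical RFT training records: a prompt plus a reference
# answer that a grader could score against. The actual schema for
# reinforcement fine-tuning has not been published; the field names
# "messages" and "reference_answer" are guesses for illustration only.
examples = [
    {
        "messages": [{"role": "user", "content": "What is 17 * 24?"}],
        "reference_answer": "408",
    },
    {
        "messages": [{"role": "user", "content": "Name the capital of Australia."}],
        "reference_answer": "Canberra",
    },
]

# Existing fine-tuning products consume JSONL, so writing one record per
# line seems like a reasonable way to prepare data in advance.
with open("rft_training_guess.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```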

So… what happened to this? I was really excited about the ‘reinforcement fine-tuning’ feature, and it seems to have just been walled off to mystery alpha testers?


I had a suspicion that I needed to cross-check with the original announcement video.

Could it be that the evaluators for RFT are the same as the ones you can define in the evaluations API?
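For reference, this is roughly the shape of the graders you can define in the Evals product today (written from memory, so treat the exact field names as an assumption); whether RFT reuses this shape is exactly what I would like to see confirmed:

```python
import json

# Two grader/evaluator configurations in the style of the Evals product.
# Field names are reconstructed from memory and may not match the current
# API exactly; this only illustrates the kind of definition I mean.
string_check_grader = {
    "type": "string_check",
    "name": "exact_answer_match",
    "input": "{{ sample.output_text }}",          # model output under test
    "reference": "{{ item.reference_answer }}",   # expected answer from the dataset row
    "operation": "eq",
}

model_grader = {
    "type": "score_model",
    "name": "answer_quality",
    "model": "gpt-4o-mini",  # a judge model scores the response
    "input": [
        {
            "role": "user",
            "content": "Rate how well this answer matches the reference.\n"
                       "Answer: {{ sample.output_text }}\n"
                       "Reference: {{ item.reference_answer }}",
        }
    ],
}

print(json.dumps([string_check_grader, model_grader], indent=2))
```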

To answer this question, which was probably self-answered: your documentation.

https://platform.openai.com/docs/guides/fine-tuning#preference

There is but one model available.

Fine-tuning has previously been rolled out per model to those who have actively used fine-tunes before.

Hi @_j,

the fine-tuning method you are showing is Direct Preference Optimization (DPO), also known as Preference Fine-Tuning.

It is not Reinforcement Fine-Tuning, which is what my question was about.

I already have access to DPO (AFAIK everyone does).
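For completeness, this is roughly what a DPO job looks like per the preference fine-tuning guide linked above (a minimal sketch; the file name, model, and beta value are illustrative, and the `method` parameter assumes a recent version of the openai Python SDK):

```python
from openai import OpenAI

client = OpenAI()

# A DPO (preference fine-tuning) training record is one JSONL line with a
# prompt plus a preferred and a non-preferred assistant response, e.g.:
#
# {"input": {"messages": [{"role": "user", "content": "Summarize DPO in one sentence."}]},
#  "preferred_output": [{"role": "assistant", "content": "DPO tunes a model on pairs of ranked responses."}],
#  "non_preferred_output": [{"role": "assistant", "content": "DPO is a database protocol."}]}

# Upload the preference dataset (file name is illustrative).
training_file = client.files.create(
    file=open("preference_pairs.jsonl", "rb"),
    purpose="fine-tune",
)

# Create the fine-tuning job with the DPO method; beta is illustrative.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",
    method={"type": "dpo", "dpo": {"hyperparameters": {"beta": 0.1}}},
)
print(job.id)
```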

There is no documentation available for Reinforcement Fine-Tuning yet. (At least not for me; the documentation can change depending on whether you are logged in.)