Fine-tuning updates: Reinforcement fine-tuning now available + GPT-4.1 nano fine-tuning

Karan_Sharma · May 8, 2025, 5:33pm

Reinforcement fine-tuning is now available for o4-mini! You might remember we announced the alpha program for RFT during 12 Days of OpenAI last December. We’ve been working on it since, and verified organizations can get started with it today. This marks the first time you fine-tune on OpenAI reasoning models—RFT is a new technique that uses chain-of-thought reasoning and task-specific grading to improve model performance for your specific domains. One of our alpha program members, Accordance, used RFT and saw a 40% increase in model performance for their tax and accounting purposes. We’re also offering a 50% discount if you share your datasets with us, which can help improve future OpenAI models. Get started with our reinforcement fine-tuning guide: https://platform.openai.com/docs/guides/reinforcement-fine-tuning

And as an update to “normal” fine-tuning (supervised), we’ve added the ability to fine-tune our fastest, cheapest model, GPT-4.1 nano.

OnceAndTwice · May 8, 2025, 6:14pm

OpenAI might not be pregnant, but never fails to deliver.

I hope someday that features like this, and reasoning summaries in the API, can be granted to folks like me who don’t want to provide government ID. This all sounds so cool but the risk is a smidge too high.

And thanks for allowing fine-tuning on 4.1-nano! It’s a pretty weak model so this will be a very big help. Many thanks for your hard work.

David_Zhu · May 9, 2025, 7:26am

great news. Thanks Karan.

yuri_eliakim-jb · May 9, 2025, 10:00am

Any chance you can expand support of dpo finetuning to 4.1 and other models?

mustafaakben · May 10, 2025, 9:16am

@Karan_Sharma,

I opened another thread in the feedback API, but I just saw this post. Sorry for double post. First, very excited to explore RL tuning. Great work on that.

I have two questions:

Can we use SF fine-tuned models as a grader in the RL pipeline? I have some nice SF fine-tuned models that are good at in-distribution but bad at out-of-distribution. I assume that if I use them in the RL pipeline as a grader, maybe RL can learn from them and generalize to out-of-distribution too? Any idea on that? Does it make sense?
Second, I like the idea that we can run a custom Python grader. However, I have some deep learning models that can be used like a grader. I am running these models on a REST API with Azure. However, your RL pipeline does not give network connectivity. Is there any chance that you can provide network connectivity for graders to post to REST APIs? It also reduces the resource demand on your side. Or can you add a REST API grader option so that we can run graders from our server and send responses to the RL pipeline? Any thoughts on that?

Looking forward to hearing from you, and thanks!

lu-wo · May 10, 2025, 3:57pm

Very nice – been looking forward to trying this out

Huy_Fe · May 14, 2025, 12:47am

@Karan_Sharma As, “FT updates… gpt-4.1-nano now available” to everyone, i don’t have access to this, and i need to fine-tune this model and work with. How can i get access to?

Looking forward to hearing from you.

Regards

Samad_Koita · May 30, 2025, 5:48pm

@Karan_Sharma Is it possible to use Reinforcement fine-tuning for Multi-turn RL? I should be able to provide a list of messages (trajectory of the agent, which may include tool calls) and the reward should be based on the final result of the agent.
Would be great if we could have a cookbook for this.

Karan_Sharma · June 6, 2025, 10:54pm

Yes, you should be able to use supervised fine-tuned models as a Grader for RFT! However, we don’t yet support connecting to graders outside of our stack – we will consider that for our future roadmap!

Karan_Sharma · June 6, 2025, 10:54pm

DPO on the 4.1-series is now available!

Karan_Sharma · June 6, 2025, 10:55pm

not yet, but this is something we are actively considering!

Abdallah_Bashir · June 12, 2025, 5:54pm

@Karan_Sharma will RFT be available as well?

Topic		Replies	Views
Fine-tuning using GPT-4 Beta API	18	25983	December 12, 2023
Fine-tuning Announcement & Davinci Beta Announcements	7	1828	December 20, 2021
GPT-3.5 Turbo fine-tuning now available (and new GPT3 models) API announcement , fine-tuning , api	18	16349	December 15, 2023
Any idea when gpt-4 fine tuning will be released for everyone? API fine-tuning	15	5138	July 24, 2024
Fine tuning chat models -- coming soon? API	5	1428	December 15, 2023

Fine-tuning updates: Reinforcement fine-tuning now available + GPT-4.1 nano fine-tuning

Related topics