Fine tuning open source models with outputs from OpenAI's reasoning model

bluearrow98 · March 24, 2025, 9:37am

Hello, I have a very task-specific use case: I would like to fine-tune small open-source models at my company (I want to self-host them for privacy reasons) on “structured outputs” from OpenAI reasoning models. These outputs are application-specific and follow logic defined by us. Before I jump into this, I would like to know whether this is legal.

I am in something of a dilemma. On one hand, the terms of use (https://openai.com/policies/row-terms-of-use/) say that users own the input and output of the model. On the other hand, they say that we cannot use outputs to train competing models.

Could someone help with this?

_j · March 24, 2025, 2:09pm

It is legal. The police won’t be knocking on your door for this. They didn’t for the scraping and assembling of 45TB of copyrighted data to make GPT-3, which could easily infringe on others’ intellectual property rights with its reproductions.

Then you simply have “terms”, where OpenAI’s recourse is to drop your account. Training competing models can be read as selling a product that fills the same space as OpenAI. I don’t think a model that is internal and can’t be generalized would count as that.
