Fine tuning open source models with outputs from OpenAI's reasoning model

Hello, I have a very task specific use case where I would like to fine tune small open-source models (because I want to self-host them due to privacy reasons) at my company but based on “structured outputs”, that are application specific based on a certain logic defined by us, from OpenAI reasoning models. Before I jump into this, I would like to know, if this is legal?

I am kind of in a dilemma. On one hand, the terms of use (https://openai.com/policies/row-terms-of-use/) says that the input and output of the model is owned by the users. But on the other hand, it says that we cannot use outputs to train competing models.

Could someone help with this?

It is legal. The police won’t be knocking on your door for this. They didn’t for scraping and assembling 45TB of copyrighted data to make GPT-3 which could easily infringe on others’ intellectual property rights with its reproductions.

Then you simply have “terms”, where OpenAI’s recourse is to drop your account. Training competing models can be read as selling a product that fills the same space as OpenAI. A model that is internal and can’t be generalized I don’t think would be that.