I am using the reinforcement fine-tuning API to fine-tune on a math dataset, but I keep getting all kinds of safety violation errors. There is no obvious safety violation in this dataset. I also applied the Moderation API to filter out any samples with a score > 1e-5 (roughly the snippet at the end of this post), and it still failed. Then I tried just 10 training samples and still got the error:
The job failed due to an unsafe training file. This training file was blocked due to policy violations. Please review your data and ensure that prompts do not contain requests that violate OpenAI's usage policies.
How does the safety validation step work? It seems there is a bug there producing many false positives. The usage policies are not clear at all, and nothing in them relates to a math dataset. Please help.
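For reference, this is roughly the filtering step I applied. It's a minimal sketch: the file names and the assumption that each JSONL line carries a `messages` list are specific to my data.

```python
import json
from openai import OpenAI

client = OpenAI()
THRESHOLD = 1e-5  # drop any sample the Moderation API scores above this

kept = []
with open("math_train.jsonl") as f:          # my RFT training file
    for line in f:
        sample = json.loads(line)
        # My samples store the math problem as the first message's content.
        text = sample["messages"][0]["content"]
        result = client.moderations.create(
            model="omni-moderation-latest",
            input=text,
        )
        scores = result.results[0].category_scores.model_dump()
        # Keep only samples whose highest category score is at or below the threshold.
        if max(v for v in scores.values() if v is not None) <= THRESHOLD:
            kept.append(sample)

with open("math_train_filtered.jsonl", "w") as f:
    for sample in kept:
        f.write(json.dumps(sample) + "\n")
```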
Safety validation automatically scans your training data with OpenAI’s moderation and content filters before fine-tuning. If any text even appears similar to disallowed content (e.g., violence, self-harm, sexual, or sensitive personal info), the job is blocked. False positives can happen, especially if math symbols or text resemble unsafe patterns. Try removing unusual symbols, long random strings, or any natural-language text, then re-upload. If it still fails, contact OpenAI Support and include your job ID for manual review.
What you are responding to is an AI-produced answer from a new user who had read the forum for under a minute total. Certainly not someone who can help with a “manual review”.
AI text pretending to be human is a forum policy violation. It also simply does not address the concern because the AI is a non-human level of dumb.
If I were going to train an AI to spew plausible answers, I’d train it to be me and have it find what I’ve written before that addresses real mechanisms and concerns:
If the fine-tuning never gets underway, as seems to be the case:
First, look at the requirements for what you describe:
Reinforcement fine-tuning (RFT) is only available for o4-mini-2025-04-16, and it is the only post-training method offered for that model.
Reinforcement fine-tuning of o4-mini requires the organization to have ID-verified status.
Are there any conflicts between an organization's ZDR (zero data retention) setting and attempting data sharing for fine-tuning?
Have you had success preparing a typical chatbot exchange as the dataset for a fine-tuning job?
Did you mean to describe, or to use, supervised fine-tuning? (The sketch after these questions shows how the two job requests differ.)
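For contrast between the two methods: a supervised job only needs a chat-format JSONL and a supported chat model, while a reinforcement job must name the o4-mini snapshot and carry a grader that scores the model's sampled answers. A rough sketch, assuming the current `method` parameter on `fine_tuning.jobs.create`; the file IDs and grader fields here are illustrative, so check the RFT guide for the exact grader schema:

```python
from openai import OpenAI

client = OpenAI()

# Supervised fine-tuning: chat-format JSONL, any fine-tunable chat model.
sft_job = client.fine_tuning.jobs.create(
    training_file="file-abc123",        # placeholder file ID
    model="gpt-4.1-mini-2025-04-14",
    method={"type": "supervised"},
)

# Reinforcement fine-tuning: only the o4-mini snapshot, plus a grader.
rft_job = client.fine_tuning.jobs.create(
    training_file="file-def456",        # placeholder file ID
    model="o4-mini-2025-04-16",
    method={
        "type": "reinforcement",
        "reinforcement": {
            # Illustrative string-check grader: reward exact answer matches.
            "grader": {
                "type": "string_check",
                "name": "exact_answer",
                "input": "{{sample.output_text}}",
                "reference": "{{item.answer}}",
                "operation": "eq",
            },
        },
    },
)
print(sft_job.status, rft_job.status)
```

If even a plain supervised job with a trivial chatbot dataset gets blocked for you, that points more at the organization or the request than at the math content.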
Perhaps clarify your experience and understanding of fine-tuning in general, along with your prior successes. Then we might explore whether this is an API organization issue that needs repair by OpenAI, such as every job getting a bad classification and rejection (similar to some orgs previously getting nothing but “bad prompt” errors on reasoning models), rather than an issue with your files and request, or a matter of rights you don't possess.