Public Moderation API Passed, But Fine-Tuning File Still Rejected – Why?

Hi everyone,

I’m trying to fine-tune a model and keep running into this error:

The job failed due to an unsafe training file. 
This training file was blocked by our moderation system because it contains too many examples that violate OpenAI's usage policies, or because it attempts to create model outputs that violate OpenAI's usage policies.

The strange part is that when I tested all my training samples locally using the Moderation API, none of them were flagged.
Here’s the code I used to check every line of my dataset:

import json
from openai import OpenAI

# Reads OPENAI_API_KEY from the environment
client = OpenAI()

input_file = "training_data.jsonl"
output_file = "approved_output.jsonl"

approved = []
rejected = []

with open(input_file, "r", encoding="utf-8") as infile:
    for line_num, line in enumerate(infile, start=1):
        try:
            sample = json.loads(line)

            # Concatenate all message contents (prompt + completion style)
            parts = []
            for msg in sample.get("messages", []):
                content = msg.get("content", "")
                if isinstance(content, str):
                    parts.append(content)
                elif isinstance(content, list):  # handle structured content
                    for c in content:
                        if isinstance(c, dict) and "text" in c:
                            parts.append(c["text"])
            full_text = " ".join(parts)

            # Run moderation
            response = client.moderations.create(
                model="omni-moderation-latest",  # or "text-moderation-latest"
                input=full_text
            )

            result = response.results[0]  # SDK returns list-like results

            if result.flagged:
                rejected.append({
                    "line": line_num,
                    "categories": result.categories,
                    "text": full_text[:300] + "..."  # preview for debugging
                })
                print(f"❌ Rejected line: {line_num}")
            else:
                approved.append(sample)
                # print(f"✅ Approved line: {line_num}")

        except Exception as e:
            print(f"⚠️ Error on line {line_num}: {e}")

# Save approved data
with open(output_file, "w", encoding="utf-8") as outfile:
    for item in approved:
        outfile.write(json.dumps(item, ensure_ascii=False) + "\n")

print(f"\n✅ Approved samples: {len(approved)}")
print(f"❌ Rejected samples: {len(rejected)}")

if rejected:
    print("\nRejected preview:")
    for r in rejected[:5]:
        print(f"Line {r['line']} - Categories: {r['categories']}")

After running this, almost all of my data passed — hardly anything was flagged.

But when I upload the same dataset for fine-tuning, the job fails with the unsafe training file error.
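One thing I'm wondering about: my script concatenates every message in a sample into one string before moderating, which might dilute a single bad message inside a long conversation. The Moderation API also accepts a list of strings as `input`, so each message could be scored on its own. A small helper to pull out per-message texts (the function name is mine, not from any SDK):

```python
def message_texts(sample: dict) -> list:
    """Extract each message's text separately, so every message
    can be sent to the Moderation API as its own input item."""
    texts = []
    for msg in sample.get("messages", []):
        content = msg.get("content", "")
        if isinstance(content, str):
            texts.append(content)
        elif isinstance(content, list):  # structured content parts
            texts.extend(c["text"] for c in content
                         if isinstance(c, dict) and "text" in c)
    return texts

# Example: one plain-string message and one structured message
sample = {"messages": [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": [{"type": "text", "text": "hello"}]},
]}
print(message_texts(sample))  # ['hi', 'hello']
```

Then the loop could call `client.moderations.create(model="omni-moderation-latest", input=message_texts(sample))` and check `r.flagged` for each entry in `response.results`, instead of moderating one concatenated blob.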

My questions are:

  1. Is the moderation pipeline used during fine-tuning stricter than the public Moderation API?

  2. Is there a hidden threshold (e.g., cumulative risk across all samples) that causes rejection?

  3. Has anyone else faced this issue and how did you resolve it?
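In case question 2 is the culprit, I'm also planning to look at the raw `category_scores` the Moderation API returns, not just the boolean `flagged` field. If the fine-tuning pipeline applies stricter (lower) thresholds, samples that pass `flagged` could still sit near the line. A sketch of what I mean — the 0.4 cutoff is purely my guess, not a documented value:

```python
BORDERLINE_THRESHOLD = 0.4  # assumption: stricter than the public default

def borderline_categories(category_scores: dict, threshold: float = BORDERLINE_THRESHOLD) -> dict:
    """Return categories whose score meets or exceeds `threshold`,
    even when the result was not flagged by the public API."""
    return {cat: score for cat, score in category_scores.items()
            if score >= threshold}

# Example with a mock dict shaped like a moderation result's category scores
scores = {"harassment": 0.05, "violence": 0.55, "sexual": 0.01}
print(borderline_categories(scores))  # {'violence': 0.55}
```

In the loop, something like `borderline_categories(result.category_scores.model_dump())` (the results are Pydantic models in the current SDK) would let me manually review any near-threshold samples before uploading.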

Thanks in advance!


Same situation here. The training-file check seems overly strict. I don't know if OpenAI updated their policies recently, but I hadn't run into this many safety-check errors before.
