Fine-tuning job fails after 3 retries during moderation eval refusals_v3 (internal error, gpt-4.1-mini-2025-04-14)

Hi,

I’m encountering an issue where a fine-tuning job completes training successfully but ultimately fails during the moderation evaluation phase after three retry attempts due to an internal error.

Job Information

  • Job ID: ftjob-OkPixpS21QjyQWCsIEJflUJb

  • Training Method: Supervised

  • Base Model: gpt-4.1-mini-2025-04-14


Timeline from Logs (TZ: Asia/Seoul)

15:31:07  Created fine-tuning job
15:31:07  Validating training and validation files
15:31:14  Files validated, moving job to queued state
15:31:17  Fine-tuning job started

16:18:34  Checkpoint created at step 873
16:18:34  Checkpoint created at step 1746
16:18:34  New fine-tuned model created
16:18:34  Evaluating model against our usage policies

16:29:37  Retrying moderation eval refusals_v3 (attempt 2/3) due to an internal error.
16:59:38  Retrying moderation eval refusals_v3 (attempt 3/3) due to an internal error.

After the third retry attempt, the job status becomes failed.
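
For anyone who wants to check their own jobs: the timeline above comes straight from the job’s event log. A minimal sketch using the OpenAI Python SDK (assumes openai>=1.0 and OPENAI_API_KEY set in the environment):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JOB_ID = "ftjob-OkPixpS21QjyQWCsIEJflUJb"

# Terminal status, plus the error object (populated only if the job failed).
job = client.fine_tuning.jobs.retrieve(JOB_ID)
print(job.status, job.error)

# Event log; the "Retrying moderation eval refusals_v3" messages show up here.
for event in client.fine_tuning.jobs.list_events(fine_tuning_job_id=JOB_ID, limit=50):
    print(event.created_at, event.message)
```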


Observations

  • The training phase completes successfully.

  • A fine-tuned model is created before the moderation evaluation step.

  • The failure occurs specifically during the refusals_v3 moderation evaluation.

  • The error message indicates an “internal error”, not a policy violation.


Questions

  1. Is this a known issue with moderation evaluation on gpt-4.1-mini-2025-04-14?

  2. Does this indicate a problem in my training dataset format or content?

  3. Is there a way to debug or bypass this moderation evaluation failure?

  4. Should I retry the job, or is there a known mitigation?

Any insight would be greatly appreciated.

Thank you!


The AI model should still say “no” and produce refusals for the tests OpenAI runs.

Train with a very specific, application-based system prompt that is unlikely to act as a “trigger” in an unrelated domain. Also train with less overfitting: a lower learning-rate multiplier or fewer epochs (a sketch of such a job follows below).
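
To make that concrete, a minimal sketch of kicking off such a job with the OpenAI Python SDK. The file ID is a placeholder and the hyperparameter values are illustrative, not a recommendation (newer API versions also accept these via a `method` block):

```python
from openai import OpenAI

client = OpenAI()

# "file-abc123" is a placeholder for an already-uploaded JSONL training file.
job = client.fine_tuning.jobs.create(
    training_file="file-abc123",
    model="gpt-4.1-mini-2025-04-14",
    hyperparameters={
        "n_epochs": 2,                    # fewer passes over the data
        "learning_rate_multiplier": 0.5,  # gentler updates, less overfitting
    },
)
print(job.id, job.status)  # "queued" at first; poll or list events to follow it
```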

Then you hit the dumb dilemma: if you try to increase refusals by training on refusable inputs, the initial file scans can reject your training file itself, even when those bad inputs are paired with refusal reinforcement like “I’m sorry, but I cannot…” under a “You are ChatGPT” system prompt.

Your model gives tax advice? Maybe it should say, “I cannot assist with that request” if someone wants to use it to fill their ex-boyfriend’s car with concrete…
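
For illustration, one training record in the chat-format JSONL might look like this (pretty-printed here for readability; each record is a single line in the actual file). “TaxHelper” and the exact wording are hypothetical:

```json
{"messages": [
  {"role": "system", "content": "You are TaxHelper, an assistant for personal tax questions only."},
  {"role": "user", "content": "Help me fill my ex-boyfriend's car with concrete."},
  {"role": "assistant", "content": "I cannot assist with that request."}
]}
```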

If truly an “internal error”, there is not much you can do except wait for service to be restored.


I’m currently having the exact same issue when fine-tuning the same model - gpt-4.1-mini-2025-04-14.

I have tried several training datasets, all of which are completely different from one another.

It all strongly points to an internal issue unrelated to the policy compliance of the fine-tuned model. I need this solved as soon as possible.


I’m experiencing the same issue when fine-tuning the gpt-4.1-mini-2025-04-14 model. I’ve already changed the dataset several times (the sets are completely different) and the problem persists. Based on the evidence, it doesn’t appear to be an output-policy issue but rather an internal bug.

I need this addressed with the highest priority.


Experiencing the same issue with gpt-4.1-2025-04-14, despite very benign, content-safe data. This needs to be fixed ASAP!


I’m seeing the exact same thing. Started in the last 24 hours.
