OpenAI Developer Community

Fine-tuning jobs failing with "internal error"

rob19 February 3, 2026, 3:08am 1

This problem has returned after temporarily working again last week (see: Internal Error during fine-tuning).

I’m using a fine-tuning process that has worked successfully hundreds of times over the past several years, but it now fails. The training files validate and fine-tuning begins, but the job then fails with an internal error, is retried twice, and finally fails for good.

21:41:50  The job failed due to an internal error.
21:23:42  Fine-tuning job started
21:23:35  The job experienced an error while training and failed, it has been re-enqueued for retry.
21:05:19  Fine-tuning job started
21:05:13  The job experienced an error while training and failed, it has been re-enqueued for retry.
20:47:06  Fine-tuning job started
20:47:04  Files validated, moving job to queued state
20:42:23  Validating training file: file-8KrKVpPVysyZbDaVJZPAqT and validation file: file-ECw1XrDu5rKKsuGuLHqtr7
20:42:23  Created fine-tuning job: ftjob-JiuewuY4cBu9lU8Mo663ICTF
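As a rough cross-check, the start and failure timestamps in the log above can be paired to see how long each attempt ran before it failed. The sketch below hard-codes those timestamps as copied from the log. The `attempt_durations` helper and the "start"/"fail" labels are made up for illustration and are not part of any OpenAI tooling, and the code assumes all events fall on the same day.

```python
from datetime import datetime

# (timestamp, kind) pairs copied from the job log above, oldest first.
# The "start"/"fail" labels are shorthand introduced for this sketch.
EVENTS = [
    ("20:47:06", "start"),
    ("21:05:13", "fail"),
    ("21:05:19", "start"),
    ("21:23:35", "fail"),
    ("21:23:42", "start"),
    ("21:41:50", "fail"),
]

def attempt_durations(events):
    """Return how long each attempt ran, pairing every start with the next fail."""
    durations = []
    started = None
    for stamp, kind in events:
        t = datetime.strptime(stamp, "%H:%M:%S")
        if kind == "start":
            started = t
        elif kind == "fail" and started is not None:
            durations.append(t - started)
            started = None
    return durations

for d in attempt_durations(EVENTS):
    print(d)
```

Under those assumptions, each of the three attempts ran for roughly 18 minutes before failing, which would be consistent with the job failing at the same point in training each time.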


tech42 February 3, 2026, 5:16am 2

Experiencing the same issue.
Fine-tuning jobs fail deterministically at the end-of-epoch / end-of-job boundary and get automatically “re-enqueued for retry”.

Job A: ftjob-es4MiHzZaAN93vYpJPREfzNI (n_epochs=3, fails right after Step 72/216 = end of epoch 1)
Job B: ftjob-C3sKWB8yvYD3hqu263FTmkVx (n_epochs=1, fails right after Step 70/70 = end of job)
Reproduces across datasets (including previously working) and across base models.
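The boundary arithmetic for the two jobs above can be checked with a small sketch. It assumes the total step count is split evenly across epochs (216 / 3 = 72 steps per epoch for Job A). `epoch_boundary` is a hypothetical helper written for this post, not something the API provides.

```python
def epoch_boundary(step, total_steps, n_epochs):
    """Return the 1-based epoch that ends at `step`, or None if `step` is not a boundary.

    Assumes total_steps is split evenly across n_epochs.
    """
    if n_epochs <= 0 or total_steps <= 0 or total_steps % n_epochs != 0:
        raise ValueError("total_steps must be positive and divide evenly into n_epochs")
    steps_per_epoch = total_steps // n_epochs
    if step <= 0 or step > total_steps or step % steps_per_epoch != 0:
        return None
    return step // steps_per_epoch

print(epoch_boundary(72, 216, 3))  # Job A: step 72 of 216, n_epochs=3
print(epoch_boundary(70, 70, 1))   # Job B: step 70 of 70, n_epochs=1
```

Both calls report epoch 1, so under the even-split assumption both failures land exactly on the end of the first epoch.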

sperez February 3, 2026, 4:05pm 3

I’m seeing the same thing. Dies after epoch 1, retries, keeps dying, eventually fails the job. Re-ran a job that ran fine two days ago and that re-run failed after epoch 1. That tells me this is something on the OpenAI side.

kingrude1 February 3, 2026, 4:21pm 4

Getting the same issue for days now: “The job failed due to an internal error”. Tried everything.

sperez February 3, 2026, 7:12pm 5

Things have “progressed” to jobs not getting past file validation (on files that used to work fine).

rob19 February 3, 2026, 9:35pm 6

Same here – training files that validated fine in a few minutes before have been spinning for 2 hours now.

sperez February 4, 2026, 2:19am 7

Just now retried a previous job and it validated and got past the first epoch. Looks like someone rebooted the computer.

rob19 February 4, 2026, 1:26pm 8

I have had several jobs succeed now too. It looks like things are working now, but who knows for how long?

Isaiah_C February 7, 2026, 6:25pm 9

Try fine-tuning via code. That worked for me a while back.
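For reference, submitting a job through the API instead of the dashboard might look roughly like the sketch below. It relies on the `files.create` and `fine_tuning.jobs.create` calls from the official `openai` Python package. The `start_fine_tune` wrapper, the file names, and the model placeholder are assumptions for illustration, and nothing in this thread confirms that going through the API avoids the internal error.

```python
from pathlib import Path

def start_fine_tune(client, training_path, model, validation_path=None):
    """Upload JSONL file(s) and create a fine-tuning job; return the job id.

    `client` is expected to behave like an `openai.OpenAI()` instance.
    """
    def upload(path):
        # Files used for fine-tuning are uploaded with purpose="fine-tune".
        with Path(path).open("rb") as fh:
            return client.files.create(file=fh, purpose="fine-tune").id

    kwargs = {"training_file": upload(training_path), "model": model}
    if validation_path is not None:
        kwargs["validation_file"] = upload(validation_path)
    job = client.fine_tuning.jobs.create(**kwargs)
    return job.id

# Usage with the official SDK (needs `pip install openai` and OPENAI_API_KEY set);
# the file names and model below are placeholders:
#   from openai import OpenAI
#   job_id = start_fine_tune(OpenAI(), "train.jsonl", model="your-base-model",
#                            validation_path="valid.jsonl")
#   print(job_id)
```

Taking the client as a parameter keeps the upload and job-creation steps easy to exercise with a stub before spending credits on a real run.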

OpenAI_Support February 16, 2026, 4:17pm 10

Hey everyone, apologies for the inconvenience. We looked into this with our fine-tuning team and have now fixed the issue. Can someone please take a look and confirm? Thank you!
