Fine-tuning fails on gpt-4o-mini-2024-07-18

Hi,
I'm trying to fine-tune the gpt-4o-mini-2024-07-18 model, but I constantly get “The job failed due to an internal error.” with no other details. What's strange is that the same file I used for fine-tuning gpt-4o-mini-2024-07-18 works fine with gpt-3.5-turbo and gpt-4o-2024-08-06. Are there any special requirements for gpt-4o-mini-2024-07-18? I used the validation code from the cookbook's chat_finetuning_data_prep notebook, and everything in my file seems fine.
Any ideas would be appreciated.
Thank you
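For anyone hitting this, here is a minimal sketch of the kind of format checks the cookbook's chat_finetuning_data_prep notebook runs. This is simplified and not exhaustive — the `messages`/`role`/`content` field names match the chat fine-tuning JSONL format, but the `validate_dataset` helper, its exact checks, and the `max_messages` default are my own assumptions:

```python
import json

def validate_dataset(path, max_messages=2048):
    """Minimal JSONL format checks for a chat fine-tuning file.

    Loosely modeled on the cookbook's chat_finetuning_data_prep notebook
    (simplified; a clean result here does not guarantee the job will run).
    """
    errors = []
    with open(path) as f:
        for i, line in enumerate(f, start=1):
            # Each line must be a standalone JSON object
            try:
                example = json.loads(line)
            except json.JSONDecodeError:
                errors.append(f"line {i}: invalid JSON")
                continue
            messages = example.get("messages")
            if not isinstance(messages, list) or not messages:
                errors.append(f"line {i}: missing 'messages' list")
                continue
            # The per-example message cap discussed in this thread
            if len(messages) > max_messages:
                errors.append(
                    f"line {i}: {len(messages)} messages exceeds {max_messages}"
                )
            for m in messages:
                if m.get("role") not in ("system", "user", "assistant", "tool"):
                    errors.append(f"line {i}: unexpected role {m.get('role')!r}")
                if "content" not in m and "tool_calls" not in m:
                    errors.append(f"line {i}: message has no content")
            # There must be at least one assistant turn to learn from
            if not any(m.get("role") == "assistant" for m in messages):
                errors.append(f"line {i}: no assistant message")
    return errors
```

Running this over the training file at least rules out the obvious formatting problems before blaming the backend.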

Having the same issue. I'm reading other threads on the topic, and there doesn't seem to be any consensus on a solution. I'm still trying to figure out whether this is an error on our end or OpenAI's. I'll keep requeuing jobs until one works. Chat support hasn't been helpful.

I was thinking something was wrong with my data, but the same file with 450+ JSONL lines worked for fine-tuning yesterday and failed today (on gpt-4o-mini-2024-07-18). So I'm starting to think it may be an OpenAI problem.

Did you have any success running the same job multiple times?

I split my data into subsets and fine-tuned them individually. The ones that failed had one thing in common: more than one example was at the 2048-message-per-example limit. So I tried fine-tuning the failing subsets with a cap of 1024 messages per example instead of 2048, and that worked.

I am currently training my entire dataset with this in mind; the file validation takes a long time, so I will report back if it works. Perhaps there is some extra overhead or the actual limit is less than 2048.


Update: still failing due to an internal error. I’m out of ideas

This issue has been forwarded to OpenAI.
Thanks for flagging!


Hi,
I have the same error, any solution?

Hello everyone (@samberk , @bence.lukacsy , @wpstream ),

Apologies for the delayed response here. This was not caused by any errors on your end (bad datasets, hyperparameters, etc.); it was caused by an internal error on our end that slipped through our monitoring system and went mostly unnoticed through late December.

We believe we have pushed a fix as of 1:48pm PST today (Jan 2nd). Please try rerunning your jobs.

Happy New Year!


Thanks @john.allard

Looks like it’s working now.


@john.allard @vb I still encounter the error above.
I'm trying to fine-tune the gpt-4o-mini-2024-07-18 model, but I constantly get “The job failed due to an internal error.” with no other details.

Hi, I am currently facing the same issue. Is there a solution? Has anyone else encountered this again?

Returned to this after a short break, but I'm unfortunately still facing the internal error (gpt-4o-mini-2024-07-18).

(Edit): I found a workaround: limiting the number of messages per example to a lower value, such as 512. Ideally the maximum (2048) would be used to reduce the amount of context lost, but it just doesn't seem to work. I tried various values all the way down to 600, and they all failed. 512 works for me, but I do lose a lot of context, since I'm splitting examples up more frequently.
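For reference, a rough sketch of that splitting workaround, assuming each JSONL example is an object with a `messages` list. The 512 cap is empirical from this thread, not a documented limit, and the `split_example` helper is hypothetical:

```python
def split_example(example, max_messages=512):
    """Split one training example's message list into chunks of at most
    max_messages messages each.

    A workaround sketch for the internal error discussed above. Note that
    splitting a conversation this way does lose cross-chunk context.
    """
    messages = example["messages"]
    # Carry a leading system message (if present) into every chunk
    system = [m for m in messages[:1] if m.get("role") == "system"]
    rest = messages[len(system):]
    step = max_messages - len(system)
    chunks = []
    for start in range(0, len(rest), step):
        chunk = system + rest[start:start + step]
        # Keep only chunks that still contain an assistant turn to learn from
        if any(m.get("role") == "assistant" for m in chunk):
            chunks.append({"messages": chunk})
    return chunks
```

Applying this to every example before writing the JSONL file guarantees no example exceeds the cap, at the cost of the lost context mentioned above.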
