Fine-tuning job stuck in queued state

Hi, this seems to be happening again for me. I’ve been trying to fine-tune a model since Friday and have triggered several runs, but each one stays queued for several hours until I cancel it. The current job has ID ftjob-pay4lQOKW7rtTx9vGvOTy82T.

Information

  • I already changed the model from gpt-4.1-mini-2025-04-14 to gpt-4.1-nano-2025-04-14, but I got the same issue.

  • I already canceled jobs that were queued and retried them.

  • It is the only job running in my org.

  • The training file has fewer than 500 examples, and they are all short phrases, so the token count should not be the cause.
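
In case it helps anyone confirm the same symptom, here is a minimal sketch of how the wait can be checked with the OpenAI Python SDK. It assumes the `openai` package is installed and `OPENAI_API_KEY` is set in the environment; `queued_hours` and `report` are hypothetical helper names of my own, not part of the SDK.

```python
import time
from typing import Optional


def queued_hours(created_at: int, now: Optional[float] = None) -> float:
    """Hours elapsed since `created_at` (Unix seconds), never negative."""
    now = time.time() if now is None else now
    return max(0.0, (now - created_at) / 3600)


def report(job_id: str) -> None:
    """Print the job status, time since creation, and recent events."""
    # Imported here so queued_hours() works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    job = client.fine_tuning.jobs.retrieve(job_id)
    # While the status is still "queued", time since creation
    # approximates how long the job has been waiting.
    print(job.status, f"{queued_hours(job.created_at):.1f} h since creation")

    # Recent events sometimes show where the job is held up.
    events = client.fine_tuning.jobs.list_events(
        fine_tuning_job_id=job_id, limit=10
    )
    for event in events.data:
        print(event.created_at, event.message)
```

Calling `report("ftjob-pay4lQOKW7rtTx9vGvOTy82T")` would print the current status and the latest events for the job above, which is the kind of detail worth attaching to a report like this one.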

Update: as of yesterday, jobs are running again. Each one still sits in the queue for 6–7 hours before fine-tuning starts, even though I run only one job at a time in my org, but I suppose the bug can be closed.