Internal Server Error or Network Error when trying to fine-tune a model

I’m getting errors when trying to create certain fine-tuning jobs. My other fine-tuning jobs (with other data) did not hit this, and a new job with other data still goes through right now.
I’m trying to fine-tune gpt-4o-mini-2024-07-18, without tool calls. The model does not seem to matter, as the same error occurs when I select gpt-4o-2024-08-06.

Through the API (called from a Python notebook), the error is:

{
	"name": "InternalServerError",
	"message": "Error code: 500 - {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID req_d2bd33bc5d3ec538b29c08f2d93a33d4 in your email.)', 'type': 'server_error', 'param': None, 'code': None}}",
	"stack": "---------------------------------------------------------------------------
InternalServerError                       Traceback (most recent call last)
Cell In[75], line 1
...
   1049     retries_taken=options.get_max_retries(self.max_retries) - retries,
   1050 )

(The rest of the call stack is omitted, as it isn’t relevant.)

When I try to create the same fine-tuning job in the Web UI/Playground, the following error shows up (as a red banner) after clicking “Submit”:

Error creating job: NetworkError when attempting to fetch resource.


The files are in storage, and I copied the file IDs from there.
I ran the data check script and it found no errors; all data sizes are also well below the limits.

Any hints on why this could happen?

(I saw a forum post titled “API Error code: 500 - fine tuned model” [791933] with a potentially similar issue. But a) I am not sure it’s the same; b) I’m using a different setup: plain chat completion without tool calls.)

Welcome to the community!

Sounds like it is something on OpenAI’s end. I’d reach out to help.openai.com with the request ID.
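
If it helps when filing the report, here is a minimal Python sketch for pulling the request ID out of an error message like the one above. The helper name `extract_request_id` and the "`req_` followed by hex digits" pattern are assumptions based only on the single ID shown in this thread, not a documented format.

```python
import re
from typing import Optional

# Assumed shape of a request ID, inferred from the one example in this thread.
REQUEST_ID_PATTERN = re.compile(r"req_[0-9a-f]+")


def extract_request_id(error_message: str) -> Optional[str]:
    """Return the first request ID found in an error message, or None."""
    match = REQUEST_ID_PATTERN.search(error_message)
    return match.group(0) if match else None


if __name__ == "__main__":
    message = (
        "Error code: 500 - (Please include the request ID "
        "req_d2bd33bc5d3ec538b29c08f2d93a33d4 in your email.)"
    )
    print(extract_request_id(message))
```

That way the ID can be logged automatically on every failed attempt instead of being copied by hand.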

Please let us know if you get it worked out.

Good luck!


Hi @PaulBellow ,
Thanks for the suggestion. I have just reported it there.
I will post updates here if any progress is made.
I will also keep watching this thread in case anyone has any hints.


A personal update: all of my fine-tuning requests now seem to hit this problem, including ones that succeeded previously (if recreated). So the problem has grown.
No new fine-tuning jobs actually ran in the meantime, as all my new attempts failed.

I also haven’t received any reply to the support request (sent through the chat box on help.openai.com), even though it said they usually reply within 3 days. (Does anyone have experience with how quickly they respond?)


Ok, I found the problem. It’s a silly one, in fact, but it is not documented anywhere.

The suffix string I used for the fine-tuning job was too long.

The Web UI did warn about permitted characters, but said nothing about length.
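
For anyone who lands here with the same 500 error, a minimal Python sketch of a client-side length check on the suffix before creating the job. The helper name `check_suffix` is hypothetical, and the limit of 64 is an unverified placeholder since this thread never establishes the real number; please check the current API reference and adjust it.

```python
# Placeholder limit: NOT confirmed in this thread. Verify against the API reference.
ASSUMED_MAX_SUFFIX_LEN = 64


def check_suffix(suffix: str, max_len: int = ASSUMED_MAX_SUFFIX_LEN) -> str:
    """Return the suffix unchanged, or raise ValueError if it is too long."""
    if len(suffix) > max_len:
        raise ValueError(
            f"suffix is {len(suffix)} characters long; "
            f"the assumed limit is {max_len}"
        )
    return suffix


if __name__ == "__main__":
    print(check_suffix("my-experiment"))
    # Intended use with the OpenAI Python SDK (not executed here; the file ID
    # is a placeholder):
    #   client.fine_tuning.jobs.create(
    #       training_file="file-abc123",
    #       model="gpt-4o-mini-2024-07-18",
    #       suffix=check_suffix("my-experiment"),
    #   )
```

Failing locally with a clear message is a lot easier to debug than a generic server error.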


Thanks for coming back to let us know.

Hopefully someone will catch this.

Again, we appreciate you reporting…
