Unable to submit a new davinci fine-tune job

Hey all, I’m trying to train a fine-tune on davinci. I’ve done this before with smaller models, but this time it isn’t working. I prepared a dataset using the tool, which generated a JSONL file. When I submit it, it sits for a minute or two and then I get this:

Upload progress: 100%|██████████████████████████████████████████████████████████████| 471M/471M [00:00<00:00, 1.27Tit/s]
Error: Error communicating with OpenAI

My command is as follows:

openai --api-key [api key redacted] api fine_tunes.create -t dataset4_prepared.jsonl -m davinci

I’m using the Linux terminal on Windows 10. Any idea what might be wrong?

The “key redacted” part of the message makes me think you may have used an old API key.

If you are on the free plan and have not added a credit card yet, fine-tuning is also limited to $15.

I generated a new key from the web UI, in case that was the problem, and unfortunately it didn’t help. I do have a credit card linked and raised the account limit to $350, so I don’t think it’s that either.

Agreed.

I don’t think it’s a money problem when the error from the API is “Error communicating with OpenAI”.

It’s hard to debug without verbose logs from your openai CLI process.

I don’t use the CLI (and that is one of the reasons).

Is there a log file which the CLI generates? Can you set the logging to be more verbose?

What does it show when you list all your fine-tune models?

Normally, when you list your models, there are status messages there in the object for each tuning task.

Hi @AndreInfante

From the description, it seems the progress stream is terminated by the error after the file upload completes.

@ruby_coder is correct.

You should call the list fine-tunes endpoint and see the “status” for each job.

@AndreInfante, in addition to the list API method, you can try the retrieve fine-tune API endpoint if you know your fine-tune ID.
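
The two lookups suggested above (list all fine-tunes, or retrieve one by ID) can be sketched without the CLI. This is a minimal sketch, assuming the fine-tunes endpoints as they existed at the time of this thread (`GET /v1/fine-tunes` and `GET /v1/fine-tunes/{id}`) and a list response that carries its jobs under a `data` key. The helper names are hypothetical, not part of any OpenAI library.

```python
import json
import urllib.request

# Assumption: base URL and paths of the fine-tunes API used in this thread.
API_BASE = "https://api.openai.com/v1"


def build_fine_tune_request(api_key, fine_tune_id=None):
    """Build a GET request: list endpoint by default, retrieve endpoint if an ID is given."""
    url = f"{API_BASE}/fine-tunes"
    if fine_tune_id:
        url = f"{url}/{fine_tune_id}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_json(request):
    """Send the request and decode the JSON body (performs a real network call)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize_statuses(listing):
    """Return (id, status) pairs from a list response shaped like {"data": [...]}."""
    return [(job.get("id"), job.get("status")) for job in listing.get("data", [])]


# Usage (not executed here, since it needs a real key and network access):
#   listing = fetch_json(build_fine_tune_request(my_api_key))
#   for job_id, status in summarize_statuses(listing):
#       print(job_id, status)
```

If the job never appears in the list, as turns out to be the case later in this thread, the failure happened before a fine-tune was created, which points at the upload step rather than the training step.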

@sps, thanks for staying active here and being a part of the “signal” which keeps the signal-to-noise ratio a bit higher here.

The SNR is so low these days. You are one of the few posters here who contribute developer “signal” without the “hand waving”, “cheerleading”, or “complaining” noise we see a lot of, and you always post referenced technical facts in the spirit of a true coder / software engineer / developer.

Thank you!

We really need to increase the SNR here, as the “signal” for OpenAI API coders is getting lost in the noise. Your posts @sps are much appreciated especially since they do not have a commercial “looking for business” undertone, which is really nice to see here.

Thanks for the kind words @ruby_coder

Learned a lot from knowledgeable folks in my early days here. Just trying to pay it forward.

I appreciate reading your posts and tutorials as well.

Hey folks, thank you for the help!

I used api files.list and got a list of my past fine-tunes, but this most recent job doesn’t appear in the list, so I can’t get a status. I also can’t look it up based on fine-tune ID, because the process fails before it reports an ID.

I found the verbose mode and re-ran the command in verbose mode (which I should have thought of!). Here’s a more detailed error message:

[2023-02-18 10:25:27,749] message='Request to OpenAI API' method=get path=https://api.openai.com/v1/files/dataset4_prepared.jsonl
[2023-02-18 10:25:28,086] message='OpenAI API response' path=https://api.openai.com/v1/files/dataset4_prepared.jsonl processing_ms=120 response_code=404
[2023-02-18 10:25:28,086] error_code=None error_message='No such File object: dataset4_prepared.jsonl' error_param=id error_type=invalid_request_error message='OpenAI API error received' stream_error=False
[2023-02-18 10:25:31,978] message='Request to OpenAI API' method=get path=https://api.openai.com/v1/files
[2023-02-18 10:25:32,578] message='OpenAI API response' path=https://api.openai.com/v1/files processing_ms=553 response_code=200

Seems like it’s failing to get the file from OpenAI, but it’s doing this before it says it uploaded the file in the first place, so I’m puzzled.
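
Verbose output like the lines above is easier to scan once each line is broken into its key=value fields. The following is a minimal sketch, assuming every field is written as `key=value` with single quotes around values that contain spaces, as in the pasted log. `parse_log_line` is a hypothetical helper, and it would fail on a message containing an unbalanced quote.

```python
import shlex


def parse_log_line(line):
    """Split one verbose log line into a dict of its key=value fields.

    Tokens without "=" (such as the bracketed timestamp) are skipped.
    """
    fields = {}
    for token in shlex.split(line):
        key, separator, value = token.partition("=")
        if separator:
            fields[key] = value
    return fields


def failed_responses(lines):
    """Return parsed lines whose response_code is present and not 200."""
    parsed = (parse_log_line(line) for line in lines)
    return [f for f in parsed if f.get("response_code") not in (None, "200")]
```

Run over the log above, this would single out the 404 on the `/v1/files/dataset4_prepared.jsonl` path.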

From some googling, it seems like UTF-8 encoding is a problem, but the proposed hotfix didn’t work for me, so I tried forcibly converting all the text to ASCII (it’s mostly ASCII anyway) before generating the dataset, and that reduced the number of errors. Now it’s just this error remaining:

[2023-02-18 11:26:46,732] error_code=None error_message='No such File object: dataset4_prepared.jsonl' error_param=id error_type=invalid_request_error message='OpenAI API error received' stream_error=False
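
The post does not say how the text was converted to ASCII. One possible approach, shown here as a sketch rather than the method actually used, is to decompose accented characters with Unicode normalization and then drop whatever still has no ASCII equivalent. `to_ascii` is a hypothetical helper name.

```python
import unicodedata


def to_ascii(text):
    """Best-effort ASCII conversion.

    NFKD normalization splits characters such as "é" into "e" plus a
    combining accent; encoding with errors="ignore" then drops the accent
    and any other character that has no ASCII form (curly quotes, emoji).
    """
    normalized = unicodedata.normalize("NFKD", text)
    return normalized.encode("ascii", "ignore").decode("ascii")
```

Note that this is lossy: characters without an ASCII decomposition disappear silently, so it only suits data that is mostly ASCII to begin with.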

I seem to have resolved the last error by splitting the file into three equal pieces. It looks like my text dataset (roughly 500 MB) is simply too large for the CLI uploader, so I’m going to upload and fine-tune on one piece at a time to work around the problem.
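
Splitting a prepared JSONL file as described can be done on line boundaries so that no training example is cut in half. This is a minimal sketch, assuming one JSON object per line. `split_jsonl` and the `_partN` output naming are hypothetical choices, not something the CLI provides.

```python
from pathlib import Path


def split_jsonl(source, parts=3):
    """Split a JSONL file into up to `parts` files with near-equal line counts.

    Blank lines are skipped. The file is streamed twice (count, then write),
    so it never has to fit in memory. Returns the list of output paths.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    source = Path(source)

    # First pass: count the non-blank lines.
    with source.open("r", encoding="utf-8") as handle:
        total = sum(1 for line in handle if line.strip())
    per_part = -(-total // parts)  # ceiling division

    outputs = []
    out = None
    written = 0
    try:
        # Second pass: start a new output file every `per_part` lines.
        with source.open("r", encoding="utf-8") as handle:
            for line in handle:
                if not line.strip():
                    continue
                if written % per_part == 0:
                    if out is not None:
                        out.close()
                    name = f"{source.stem}_part{len(outputs) + 1}{source.suffix}"
                    path = source.with_name(name)
                    out = path.open("w", encoding="utf-8")
                    outputs.append(path)
                out.write(line if line.endswith("\n") else line + "\n")
                written += 1
    finally:
        if out is not None:
            out.close()
    return outputs
```

For example, `split_jsonl("dataset4_prepared.jsonl", parts=3)` would write `dataset4_prepared_part1.jsonl` through `dataset4_prepared_part3.jsonl` next to the original, each of which can then be submitted separately.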

Can you share the output from calling the list files endpoint?