I’m having a lot of trouble uploading a fine-tune file. I started out in Python. My code looks like this:
import os
import openai

openai.api_key = os.environ.get('OPENAI_API_KEY')
file_meta_data = openai.File.create(
    file=open('processed.jsonl', 'rb'),
    purpose='fine-tune'
)
But all I get back is this error, even though I don’t have Python or any other terminal running:
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('192.168.1.101', 49970), raddr=('52.152.96.252', 443)>
So, I tried cURL instead. Here is my POST request:
curl https://api.openai.com/v1/files \
  -H "Authorization: Bearer sk-*****" \
  -F purpose="fine-tune" \
  -F file='processed.jsonl'
But the request 400s with this:
{
  "error": {
    "message": "The browser (or proxy) sent a request that this server could not understand.",
    "type": "server_error",
    "param": null,
    "code": null
  }
}
Can anyone explain what I could be doing wrong? For context, the processed.jsonl file I am referencing is entirely filled with lines like this example:
{"prompt": "yep i have tried laptop too several times over the past week and again today i have tried different browsers too \n\n###\n\n", "completion": "it is working ok from here miriam does this link help END"}
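For anyone who wants to rule out the file itself before uploading, here is a minimal local sanity check in Python. It assumes the prompt/completion record format shown in the example line above. check_jsonl is a hypothetical helper written for this thread, not part of the openai package, and it does not reproduce what the CLI's prepare_data tool checks.

```python
import json


def check_jsonl(lines):
    """Return (line_number, problem) pairs for records that are not
    JSON objects with string "prompt" and "completion" fields."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        if not raw.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        # Both keys must be present and hold strings.
        for key in ("prompt", "completion"):
            if not isinstance(record.get(key), str):
                problems.append((number, f"missing or non-string {key!r}"))
    return problems
```

To run it against the real file, pass the open file object, for example check_jsonl(open('processed.jsonl', encoding='utf-8')). An empty list means every line parsed and had both fields.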
Have you tried the CLI?
openai tools fine_tunes.prepare_data -f processed.jsonl
I had not. I used the CLI and it did help format my file even better, but unfortunately I am still experiencing the same issue (same error in Python and from cURL).
You can create an entire fine-tune from the CLI. So if the Python version isn’t working, I would just do it in the CLI until you can figure the Python out.
I’m sorry, I don’t understand what you mean. Are you saying I can upload my file with the CLI? If so, how?
Yep,
openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>
Just follow the guide here. The CLI is your friend when various bindings fail.
Ah. Okay, I’ll try that. I was under the impression I needed to upload my file before creating the fine-tune job. I’d originally been looking for that training file ID, which I saw is returned in the response of the upload-file API. That’s when I resorted to using the path instead, but then ran into the aforementioned problems.
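A note on the cURL attempt earlier in the thread: with -F, curl uploads a file's contents only when the value is prefixed with @ (for example -F file="@processed.jsonl"). Without the @, the form field carries the literal text processed.jsonl. Whether that alone explains the 400 here is an assumption, not something confirmed in this thread. The Python sketch below uses only the standard library to build both kinds of request body so the difference is visible. build_multipart is a hypothetical helper for illustration, not part of the openai package, and nothing is sent over the network.

```python
import uuid
from pathlib import Path


def build_multipart(fields, file_field=None, file_path=None):
    """Encode text fields, plus an optional file part, as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields: the value is sent exactly as given.
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # File part: the bytes on disk are sent, along with a filename.
    if file_field and file_path:
        path = Path(file_path)
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{path.name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + path.read_bytes() + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

Calling build_multipart({"purpose": "fine-tune", "file": "processed.jsonl"}) mirrors the request without @: the body contains only the file name as text. Calling build_multipart({"purpose": "fine-tune"}, file_field="file", file_path="processed.jsonl") mirrors the @ form, where the body carries the actual JSONL lines.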