Fine tune failed but no reason given

I am attempting to fine-tune curie. I submit the .jsonl file and the API status goes from pending to failed, but it does not give a reason for the failure. Now I’m not sure what to do. Do I keep trying?

'updated_at': 1674912661,
'validation_files': []},
{'created_at': 1674913222,
'fine_tuned_model': None,
'hyperparams': {'batch_size': None,
'learning_rate_multiplier': None,
'n_epochs': 4,
'prompt_loss_weight': 0.01},
'id': 'ft-AKZ9a6vX8oSz9gjadD687pFB',
'model': 'curie',
'object': 'fine-tune',
'organization_id': 'org-DjoSJnnXM2aP8lMapNzkRlmz',
'result_files': [],
'status': 'failed',
'training_files': [{'bytes': 4116449,
'created_at': 1674913222,
'filename': 'file',
'id': 'file-YJDBWDO71yFPZYurFX9JuxsW',
'object': 'file',
'purpose': 'fine-tune',
'status': 'processed',
'status_details': None}],
'updated_at': 1674913270,
'validation_files': []}],
'object': 'list'}

Did you check your JSONL file for errors? Are the keys correct?

Example

{"prompt": "<prompt text> \n\n###\n\n", "completion": " <ideal generated text> #####"}
{"prompt": "<prompt text> \n\n###\n\n", "completion": " <ideal generated text> #####"}
{"prompt": "<prompt text> \n\n###\n\n", "completion": " <ideal generated text> #####"}

Reference: Preparing your dataset

It’s a large file, but I did format it the way you are showing. Is there an online service somewhere that validates .jsonl?

Not that I know of. All my prior searches turned up only JSON validators.

As you know, these JSON validators will not validate JSONL, but maybe someone else knows of one which works well for JSONL?

Sorry, I have searched before and also today, and came up empty.
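Since a JSONL file is just one JSON object per line, a rough check is possible with Ruby's standard `json` library by parsing each line on its own. The following is only a minimal sketch: the helper name `jsonl_errors` is made up for illustration, and it assumes each record should be an object with `prompt` and `completion` keys, as in the example above.

```ruby
require "json"

# Hypothetical helper (not from the thread): returns an array of
# [line_number, message] pairs for every line that does not look like a
# fine-tuning record. An empty array means no problems were found.
def jsonl_errors(text)
  errors = []
  text.each_line.with_index(1) do |raw, number|
    line = raw.chomp # strips "\n" as well as "\r\n"
    if line.strip.empty?
      errors << [number, "blank line"]
      next
    end
    begin
      record = JSON.parse(line)
    rescue JSON::ParserError
      errors << [number, "invalid JSON"]
      next
    end
    unless record.is_a?(Hash)
      errors << [number, "not a JSON object"]
      next
    end
    # Assumption: every record needs exactly these two keys to be usable.
    missing = %w[prompt completion] - record.keys
    errors << [number, "missing keys: #{missing.join(', ')}"] unless missing.empty?
  end
  errors
end

# Example usage with a hypothetical file name:
# jsonl_errors(File.read("training_data.jsonl")).each { |n, msg| puts "line #{n}: #{msg}" }
```

Unlike a whole-file JSON validator, this reports the line number of each bad record, which is what matters for a large file.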

Thanks for your help. I did a quick visual check of the jsonl and I don’t see any issues. I tried running the fine tune again and got the same result (it fails but gives no reason for the failure).

Now I don’t know what to do except try a small portion of the jsonl, but this is going to get expensive if I have to step through the jsonl to find the error…

I did find this: GitHub - bloomreach/JSONL-Validator: A simple utility to validate your JSONL files

I will give that a try later…
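The "small portion" idea mentioned above could be sketched like this in Ruby. It is only an illustration: `write_sample`, the file names, and the default of 100 lines are all made up, and it simply copies the first few records to a new file so a cheaper test run can be submitted.

```ruby
# Hypothetical helper (not from the thread): copies the first `limit` lines
# of a JSONL file into a new file and returns how many lines were written.
def write_sample(source_path, sample_path, limit: 100)
  written = 0
  File.open(sample_path, "w") do |out|
    # File.foreach reads one line at a time, so a large file is never
    # loaded into memory all at once.
    File.foreach(source_path) do |line|
      break if written >= limit
      out.write(line)
      written += 1
    end
  end
  written
end

# Example usage with hypothetical file names:
# write_sample("training_data.jsonl", "training_sample.jsonl", limit: 50)
```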

Yeah, this is a problem many are experiencing.

  • No JSONL validator specific to OpenAI fine-tuning
  • Immature (beta) error messages from the API that are not very helpful.

In my “still working on it” OpenAI Lab app, I was planning on writing my own validator, but it’s further down the dev path timeline. Still working out the kinks in the workflow…

Anyway, I think a basic JSONL validator for Fine Tuning can be accomplished using a basic REGEX.

Maybe something like this on a line by line basis in a loop (not fully tested), or you can alter it as you like (needs tweaking, will work on it later this week):

/^\{"prompt":\s*"([^"]+)",\s*"completion":\s*"([^"]+)"\s*\}$/gm

@aydengray2020

Just a quick check of the REGEX above from the Ruby console:

irb(main):022:0>string='{"prompt":  "Hello", "completion": "World"}'
=> "{\"prompt\":  \"Hello\", \"completion\": \"World\"}"
irb(main):024:0> /^\{"prompt":\s*"([^"]+)",\s*"completion":\s*"([^"]+)"\s*\}$/.match?(string)
=> true

Of course, this REGEX needs more tweaking if you want to account for the details in the link and image above, but it’s a start.
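As a rough idea of what that tweaking might cover, here is a sketch that parses one line and checks the conventions visible in the example records earlier in the thread. The helper name `convention_problems` is hypothetical, and the default separator and stop sequence are simply copied from that example, so treat them as assumptions to adjust for your own data. It also assumes the line is already a valid JSON object.

```ruby
require "json"

# Hypothetical helper (not from the thread): returns a list of convention
# problems for a single JSONL line. The default separator and stop sequence
# are assumptions taken from the example records in this thread.
def convention_problems(line, separator: "\n\n###\n\n", stop: "#####")
  record = JSON.parse(line) # raises JSON::ParserError on malformed input
  prompt = record["prompt"].to_s
  completion = record["completion"].to_s

  problems = []
  problems << "prompt does not end with the separator" unless prompt.end_with?(separator)
  problems << "completion does not start with a space" unless completion.start_with?(" ")
  problems << "completion does not end with the stop sequence" unless completion.end_with?(stop)
  problems
end
```

Parsing with `JSON.parse` first also sidesteps one limit of the `[^"]+` pattern, which rejects any prompt or completion containing an escaped quote.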

OK @aydengray2020, maybe you can try something like this to get started until we come up with something better.

Here is an example from the Ruby console:

irb(main):089:0> 
irb(main):090:1*     def validate(fine_tune_data)
irb(main):091:2*         if  fine_tune_data.present? 
irb(main):092:2*             count = 0
irb(main):093:3*             fine_tune_data.split("\r\n").each do |line|
irb(main):094:3*                 count = count + 1
irb(main):095:4*                 if /^\{"prompt":\s*"([^"]+)",\s*"completion":\s*"([^"]+)"\s*\}$/.match?(line)
irb(main):096:4*                     puts "LINE ##{count} VALID JSONL: #{line}"   
irb(main):097:4*                 else
irb(main):098:4*                     puts "LINE ##{count} INVALID JSONL: #{line}"
irb(main):099:3*                 end
irb(main):100:2*             end
irb(main):101:2*         else
irb(main):102:2*             return false
irb(main):103:1*         end
irb(main):104:0>     end
=> :validate
irb(main):105:0> string='{"prompt":  "Hello", "completioan": "World"}\n{"prompt":  "Hello", "completion": "World"}'
=> "{\"prompt\":  \"Hello\", \"completioan\": \"World\"}\\n{\"prompt\":  \"Hello\", \"completion\": \"World\"}"
irb(main):106:0>  validate(string)
LINE #1 INVALID JSONL: {"prompt":  "Hello", "completioan": "World"}
LINE #2 VALID JSONL: {"prompt":  "Hello", "completion": "World"}
=> ["{\"prompt\":  \"Hello\", \"completioan\": \"World\"}", "{\"prompt\":  \"Hello\", \"completion\": \"World\"}"]
irb(main):107:0>

Not perfect, but I’m going to use something similar to this when I write my JSONL validation method.

Will test this further later… sure it needs tweaking :wink:

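# Checks each line of a fine-tuning JSONL string against the prompt/completion pattern.
# Returns false for nil or blank input; otherwise prints a verdict for every line.
# Note: [^"]+ rejects values that contain escaped quotes (\"), so this is only a rough check.
def validate(fine_tune_data)
    # Plain-Ruby blank check (present? is only available with Rails/ActiveSupport)
    return false if fine_tune_data.nil? || fine_tune_data.strip.empty?

    pattern = /^\{"prompt":\s*"([^"]+)",\s*"completion":\s*"([^"]+)"\s*\}$/
    # Split on both Unix (\n) and Windows (\r\n) line endings
    fine_tune_data.split(/\r?\n/).each_with_index do |line, index|
        status = pattern.match?(line) ? "VALID" : "INVALID"
        puts "LINE ##{index + 1} #{status} JSONL: #{line}"
    end
end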

Hope this helps.

It turns out I did not have enough credits to complete the fine tuning. I increased my usage limits, resent the jsonl and it processed successfully. I assumed ‘lack of credits’ would be a common issue and would throw a known error, but it doesn’t.


Great to hear you figured it out @aydengray2020

You motivated me to write this validation method, so maybe you might find it useful someday if you have a similar problem with JSONL and fine-tuning.

Thanks for the weekend motivation to bang out some code.

Thanks,

I’ll be doing another jsonl test in the coming week and will use your code. I’ll let you know how it goes.

Ayden


The failure can be caused by a low credit balance.
In general, though, you can use the following command to find the reason for the error:
!openai -k "api key" api fine_tunes.follow -i "fine-tune-id"
Send me any further questions and I will guide you.
insta : @iman.ws
