Fine Tuning Issue with training file

This is my training.jsonl file:

{"prompt": "code to create an S3 bucket with s3 block public access disabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = false\n block_public_policy = false\n ignore_public_acls = false\n restrict_public_buckets = false\n }\n"}
{"prompt": "code to create an S3 bucket with s3 block public access enabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = true\n block_public_policy = true\n ignore_public_acls = true\n restrict_public_buckets = true\n }\n"}
{"prompt": "code to create an S3 bucket with s3 block public access enabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = true\n block_public_policy = true\n ignore_public_acls = true\n restrict_public_buckets = true\n }\n"}
{"prompt": "code to create an S3 bucket with s3 block public access enabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = true\n block_public_policy = true\n ignore_public_acls = true\n restrict_public_buckets = true\n }\n"}
{"prompt": "code to create an S3 bucket with s3 block public access enabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = true\n block_public_policy = true\n ignore_public_acls = true\n restrict_public_buckets = true\n }\n"}
{"prompt": "code to create an S3 bucket with s3 block public access enabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = true\n block_public_policy = true\n ignore_public_acls = true\n restrict_public_buckets = true\n }\n"}
{"prompt": "code to create an S3 bucket with s3 block public access enabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = true\n block_public_policy = true\n ignore_public_acls = true\n restrict_public_buckets = true\n }\n"}
{"prompt": "code to create an S3 bucket with s3 block public access enabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = true\n block_public_policy = true\n ignore_public_acls = true\n restrict_public_buckets = true\n }\n"}
{"prompt": "code to create an S3 bucket with s3 block public access enabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = true\n block_public_policy = true\n ignore_public_acls = true\n restrict_public_buckets = true\n }\n"}
{"prompt": "code to create an S3 bucket with s3 block public access enabled", "completion": "resource \"aws_s3_bucket\" \"aws_s3_bucket\" {\n bucket = \"example\"\n } resource \"aws_s3_bucket_public_access_block\" \"public_access\" {\n bucket = aws_s3_bucket.public_access.id block_public_acls = true\n block_public_policy = true\n ignore_public_acls = true\n restrict_public_buckets = true\n }\n"}

After running the fine-tuning code, I'm getting this error:

The job failed due to an invalid training file. Invalid file format. Input file file-EfnrFcUzKBJTZMpM0dImasui is in the prompt-completion format, but the specified model gpt-3.5-turbo-0125 is a chat model and requires chat-formatted data

I'm using this fine-tuning code:

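from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

client.fine_tuning.jobs.create(
  training_file="file-xyz",
  model="gpt-3.5-turbo"
)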

What's going wrong here? Can anyone help me out, please?

I cannot see anything wrong based on the example provided:

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}

Maybe it’s an issue with your escapes. Could you try with a more basic completion value?

I converted my training data to the format below:

{"messages": [{"role": "user", "content": "<prompt>"}, {"role": "assistant", "content": "<content_message>"}]}
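For anyone with the same problem, here is a minimal sketch of how that conversion could be scripted. It assumes every line of the old file is a JSON object with "prompt" and "completion" keys, as in the training file above. The function names and file paths are hypothetical and not part of the OpenAI SDK.

```python
import json


def to_chat_record(record):
    """Turn one {"prompt", "completion"} dict into a chat-format dict."""
    return {
        "messages": [
            {"role": "user", "content": record["prompt"]},
            {"role": "assistant", "content": record["completion"]},
        ]
    }


def convert_jsonl(src_path, dst_path):
    """Rewrite a prompt-completion JSONL file as a chat-format JSONL file."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            chat_record = to_chat_record(json.loads(line))
            dst.write(json.dumps(chat_record) + "\n")
```

Calling convert_jsonl("training.jsonl", "training_chat.jsonl") (both paths are placeholders) would write one chat-format line per original example. The new file still has to be uploaded before it can be used in a fine-tuning job.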

Then I fine-tuned with model="gpt-3.5-turbo" and it worked successfully.

Does this mean model="gpt-3.5-turbo" doesn't accept data in the previous format?

Welcome to the dev forum @tarique_salat

The prompt-completion pair format is meant for models on the legacy completion endpoint.

This model is a chat completion model, so it needs the chat fine-tuning format: every line is a JSON object with a messages key that holds the list of messages you want to fine-tune on.
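To catch this before uploading, a rough local check along these lines might help. It only tests the basic shape of each line (an object with a non-empty messages list whose entries have role and content keys). It is not the API's own validation, and the function name is made up for illustration.

```python
import json


def is_chat_formatted(line):
    """Return True if one JSONL line has the basic chat-format shape."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict):
        return False
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    # Every message must be an object carrying a role and some content.
    return all(
        isinstance(m, dict) and "role" in m and "content" in m
        for m in messages
    )
```

Run against the original file, every line would fail this check because it has prompt and completion keys but no messages key, which matches the error the job reported.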
