So I have a problem where each time I try to continue fine-tuning a fine-tuned model, I get this error:
openai.InternalServerError: Error code: 500 - {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error. (Please include the request ID req_3f38abf2fe9c46e77719c87601a9a205 in your email.)', 'type': 'server_error', 'param': None, 'code': None}}
Here is my code:
def fine_tune_model(client, dataset_path, base_model, suffix, validation_path=None, epoch=None):
    """
    Fine-tune a model on a specified dataset.
    Args:
        dataset_path (str): Path to the JSONL file with training data.
        base_model (str): The base model to fine-tune.
        suffix (str): The suffix to identify the fine-tuned model.
    Returns:
        str: The ID of the fine-tuned model.
    """
    file_id = None
    validation_file_id = None
    try:
        if validation_path:
            print(f"Uploading validation set: {validation_path}")
            validation_file_response = client.files.create(
                validation_file=open(validation_path, "rb"),
                purpose='fine-tune'
            )
            validation_file_id = validation_file_response.id
            print(f"Uploaded validation file ID: {validation_file_id}")
        if dataset_path:
            # Upload file
            print(f"Uploading dataset: {dataset_path}")
            file_response = client.files.create(
                file=open(dataset_path, "rb"),
                purpose='fine-tune'
            )
            file_id = file_response.id
            print(f"Uploaded file ID: {file_id}")
    except Exception as e:
        print(f"File already uploaded: {client.files.list()}")
        file_id = "file-lsGKVlcUiXmOXVRCHyldSA6c"
        validation_file_id = "file-XKiNi57PIX3Z3ybhUHIOfPAd"
    # Create fine-tuning job
    print(f"Creating fine-tune job...\n training_file: {file_id}, validation_file: {validation_file_id}")
    fine_tune_job_response = client.fine_tuning.jobs.create(
        training_file=file_id,
        validation_file=validation_file_id if validation_file_id else None,
        model=base_model,
        suffix=suffix,
        hyperparameters={
            "n_epochs": epoch if epoch else 3
        }
    )
    fine_tune_id = fine_tune_job_response.id
    print(f"Fine-tune job created with ID: {fine_tune_id}")
    return fine_tune_id


base_model = "ft:gpt-4o-2024-08-06:personal:fine-tuned-4:AUg7D0Dl"
# base_model = "gpt-3.5-turbo-0125"
suffix1 = "hello_fine_tune"
suffix2 = "iteration_5"
fine_tune_id2 = fine_tune_model(client, training_file_path, base_model, suffix2, validation_path=validation_file_path, epoch=3)
Why am I getting this error? The file format is the same as the jsonl that is passed to initially train the model. Did anyone find a solution to this?
First, why fall back to hard-coded file IDs in the error case, without at least using the files endpoint to retrieve each file object and check its purpose? That just adds another easily forgotten way for the code to fail or to train on the wrong data. There seems to be a misunderstanding: every upload gets a new file ID and will not produce a “file exists” error, and you want that new ID so you can be sure the JSONL with your latest edits is the one being used.
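As a rough illustration of that point, here is a minimal sketch that uploads a file and then reads the stored object back from the files endpoint instead of trusting a remembered ID. It assumes `client` is an `openai.OpenAI()` instance from the v1 Python SDK; the helper name `upload_and_verify` is made up for this example. Note that the sketch uses `file=` for every upload. The posted code passes `validation_file=` to `client.files.create`, which may be what actually raises inside the `try` block, with the broad `except` then reporting it as an already-uploaded file.

```python
from pathlib import Path


def upload_and_verify(client, path, purpose="fine-tune"):
    """Upload one file, then confirm what the files endpoint stored for it.

    Assumption: `client` is an openai.OpenAI() instance (v1 Python SDK).
    `upload_and_verify` is a hypothetical helper name, not part of the SDK.
    """
    # The upload keyword is `file=` for training and validation files alike.
    with open(path, "rb") as fh:
        uploaded = client.files.create(file=fh, purpose=purpose)

    # Read the object back rather than relying on a hard-coded ID.
    stored = client.files.retrieve(uploaded.id)
    if stored.purpose != purpose:
        raise ValueError(
            f"{stored.id} has purpose {stored.purpose!r}, expected {purpose!r}"
        )
    print(f"{Path(path).name} -> {stored.id} (purpose={stored.purpose})")
    return stored.id
```

Letting any upload exception propagate, rather than catching it and substituting old IDs, makes the real cause visible in the traceback.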
Try 20 lines of THE ORIGINAL JSONL that worked in fine-tuning, and see whether a test job on that file succeeds when you specify your fine-tuned model name, starting with ft:, as the model.
If that works, then we must conclude the issue is in the new input JSONL or the odd code branches.
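One way to prepare that small test file is sketched below. It copies the first 20 non-empty lines of a JSONL file and parses each one as JSON along the way, so a malformed line shows up locally before anything is uploaded. The function name `write_jsonl_head` and the file names in the usage note are placeholders for this example.

```python
import json
from pathlib import Path


def write_jsonl_head(src, dst, n=20):
    """Copy the first `n` non-empty lines of `src` to `dst`.

    Each kept line is parsed with json.loads, so a malformed line raises
    json.JSONDecodeError here instead of failing later on the server.
    Returns the number of lines written.
    """
    kept = []
    with open(src, encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # skip blank lines
            json.loads(line)  # validation only; the parsed value is discarded
            kept.append(line.rstrip("\n"))
            if len(kept) == n:
                break
    Path(dst).write_text("".join(item + "\n" for item in kept), encoding="utf-8")
    return len(kept)
```

For example, `write_jsonl_head("original_training.jsonl", "test_20.jsonl")` would produce a 20-line file to upload as the training file for the test job.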
This isn’t true. I see the exception raised for files that have already been uploaded. I have verified that the files themselves are not the issue. Is there some parameter I am not passing?
The only parameter that needs to change is the model name. You can reuse the same JSONL that originally trained a model, either to reinforce that training further or to impart a known skill to other models.
Here is a post with a fine-tuning script that also polls until success. It needs only the local name of the JSONL file, with the other parameters set at the top (it does not attempt a validation file).
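The polling part of such a script might look roughly like the sketch below. This is not the script from the linked post. It assumes `client` is an `openai.OpenAI()` instance and that the job reaches one of the terminal statuses `succeeded`, `failed`, or `cancelled`; the name `wait_for_job` and its parameters are invented for this example.

```python
import time

# Statuses after which a fine-tuning job will not change any further.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id, poll_seconds=30, max_polls=None, sleep=time.sleep):
    """Poll a fine-tuning job until it reaches a terminal status.

    Assumption: `client` is an openai.OpenAI() instance (v1 Python SDK).
    `wait_for_job` is a hypothetical helper, not an SDK function.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    polls = 0
    while True:
        job = client.fine_tuning.jobs.retrieve(job_id)
        print(f"{job.id}: {job.status}")
        if job.status in TERMINAL_STATUSES:
            return job
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"{job_id} still {job.status!r} after {polls} polls")
        sleep(poll_seconds)
```

A typical use would be to create the job with `client.fine_tuning.jobs.create(training_file=file_id, model=base_model, suffix=suffix)`, pass the returned job's `id` to `wait_for_job`, and read `fine_tuned_model` from the result once the status is `succeeded`.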
And just to show that you can upload the same filename, and even the identical file, multiple times:
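A quick way to check this on your own account is to list the stored files and group their IDs by filename, as in the sketch below. It assumes `client` is an `openai.OpenAI()` instance and that iterating over `client.files.list()` yields file objects with `id`, `filename`, and `purpose` attributes; the helper name `ids_by_filename` is made up for this example.

```python
from collections import defaultdict


def ids_by_filename(client, purpose="fine-tune"):
    """Group stored file IDs by filename for one purpose.

    Assumption: `client` is an openai.OpenAI() instance (v1 Python SDK).
    `ids_by_filename` is a hypothetical helper, not an SDK function.
    """
    groups = defaultdict(list)
    for stored in client.files.list():
        # Filter on the client side to keep the sketch simple.
        if stored.purpose == purpose:
            groups[stored.filename].append(stored.id)
    return dict(groups)
```

If the same JSONL has been uploaded more than once, its filename should map to several distinct IDs in the returned dictionary.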
I’m seeing the same issue. Seems to have started on Sunday, as fine-tuning fine-tuned models worked fine on Saturday. There’s a vague error in the web interface, and the same error message from the API.
I’ve encountered the same problem. I tried copying a previously successful fine-tuning job, and it failed. So this definitely has nothing to do with file format or other settings.