I’m currently fine-tuning the “gpt-3.5-turbo-1106” model on insurance claim text data to predict Case Type, Case Category, and Case Sub-category.
Each Case Type, Case Category, and Case Sub-category triple forms one Combination, and I have 248 such combinations in total.
So, any given input text falls into exactly one of these 248 combinations.
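For illustration, here is a hypothetical sketch of how the combinations dictionary referenced in the code below might be structured; the keys and label values are invented placeholders, not my actual categories:

# Hypothetical structure for the combinations dictionary used in the system prompt.
# All names below are invented placeholders, not real insurance categories.
combinations = {
    "Motor | Own Damage | Windshield": {
        "Case Type": "Motor",
        "Case Category": "Own Damage",
        "Case Sub Category": "Windshield",
    },
    "Health | Reimbursement | Pre-authorization": {
        "Case Type": "Health",
        "Case Category": "Reimbursement",
        "Case Sub Category": "Pre-authorization",
    },
    # ... 246 more combinations
}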
- Initially, I trained the model on a limited dataset of only 4 combinations, each comprising 45 records, totaling 180 records. The achieved accuracy was 85%.
- Subsequently, I expanded the training data to 27 combinations, incorporating a total of 1215 records (45 records per combination). However, accuracy dropped sharply to just 45%.
My strategy includes preprocessing and cleansing the input text, alongside a system prompt to guide the model. I listed the 27 combination names in the system prompt and trained for 10 epochs. Each training example is constructed as follows:
import json

# Serialize the label as JSON; json.dumps handles quoting and escaping safely,
# unlike manual string concatenation, which breaks if a value contains quotes.
json_response = json.dumps({
    "Case Type": case_type,
    "Case Category": case_category,
    "Case Sub Category": sub_category
})
fine_tuning_data.append({
    "messages": [
        {"role": "system", "content": f"You are a helpful assistant. Your task is to classify the text by the user into one of the following pre-defined Combinations. Each key in combinations dictionary is one combination: {combinations}"},
        {"role": "user", "content": row['text']},
        {"role": "assistant", "content": json_response}
    ]
})
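For completeness, here is roughly how I turn fine_tuning_data into a JSONL file and launch the job. This is a minimal sketch assuming the openai Python SDK v1; the file name is a placeholder:

# Minimal sketch: write the examples to JSONL and start the fine-tuning job.
# Assumes the openai Python SDK v1; "claims_train.jsonl" is a placeholder name.
import json
from openai import OpenAI

with open("claims_train.jsonl", "w") as f:
    for example in fine_tuning_data:
        f.write(json.dumps(example) + "\n")

client = OpenAI()
training_file = client.files.create(
    file=open("claims_train.jsonl", "rb"),
    purpose="fine-tune"
)
client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo-1106",
    hyperparameters={"n_epochs": 10}
)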
My primary concerns are as follows:
1) Achieving high accuracy remains a challenge with a larger set of combinations.
2) Despite the success with fewer combinations, maintaining accuracy as the dataset expands, which my planned iterative retraining will require, is a significant hurdle.
3) I aim to surpass an accuracy threshold of 80%, particularly for larger sets of combinations (accuracy measured as sketched below). However, given the current results, I’m uncertain whether this goal is attainable or whether a trade-off between accuracy and the number of combinations is inevitable.
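For reference, this is roughly how I compute accuracy: a sketch assuming a held-out test set in test_rows (with "text" plus the three label columns) and a placeholder fine-tuned model ID:

# Rough sketch of the accuracy measurement, assuming a held-out test set.
# The model ID, test_rows, and system_prompt are placeholders/assumptions.
import json
from openai import OpenAI

client = OpenAI()
correct = 0
for row in test_rows:
    response = client.chat.completions.create(
        model="ft:gpt-3.5-turbo-1106:my-org::placeholder",
        messages=[
            {"role": "system", "content": system_prompt},  # same prompt as in training
            {"role": "user", "content": row["text"]},
        ],
        temperature=0,
    )
    try:
        predicted = json.loads(response.choices[0].message.content)
    except json.JSONDecodeError:
        continue  # malformed output counts as incorrect
    # A prediction is correct only if all three fields match the gold labels.
    if (predicted.get("Case Type") == row["case_type"]
            and predicted.get("Case Category") == row["case_category"]
            and predicted.get("Case Sub Category") == row["sub_category"]):
        correct += 1
print(f"Accuracy: {correct / len(test_rows):.2%}")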
I welcome any insights or suggestions to overcome these challenges and enhance the model’s performance across diverse combinations.