Fine tune model auto complete label not from train list - how to stop this?

That is the way I would do this:
Mind punctuation and delimiters - models love them:

  • “:” - separating label from data contents;
  • “,” - separating labels;
  • “;” - separating data records.

Since you gave me the data only, I don’t know any previous instructions, explanatory prompts, and labeling headers for the model - I had to add the labels by myself.
I am not using completion for training, I inserted the category contents into the prompts so the model can understand as a “structured database”.

Note: The lines had been broken to make it more readable in the code snippet box, the linebreaks are not intended to insert in the code.

# Instructions section
{“prompt”:“This training dataset contains code, description,
category."}
{“prompt”:“Please consider the listed data below for your responses
accordingly."}
{“prompt”:“Do NOT add or remove any code, description, or category
without expressed consent in User prompt."}
...
# Data section
{“prompt”:“code: 12222,
description: retainer for the period 03/01/2023 - 03/31/2023: monthly
branding/core retainer\n\n###\n\n”,
category: TAX;"}

{“prompt”:“code: 12333,
description: baggage al pendant reg hours\n\n###\n\n”,
category: OFFICE;"}

{“prompt”:“code: 12345,
workspace incremental fee: 28,573 pages\n\n###\n\n”,
category: FEE";}
...
category: XYZ".} # period "." at the end of the last record
# - it is advised
...

There are more details such as using the System role strategically in order to add precise instructions for the model to follow during the training as a context-maintenance.

And a structured text as a dataset is also helpful to the model. By the way, please consider a separate dataset file uploaded to the cloud storage of your choice in the case of a large training or operational dataset. Please check this thread about it:
Seeking Advice on Handling Large Vehicle Database for AI Chatbot Application

Try this way, and please let me know the results.