Can I use fine tuned model without system role prompt for my specific use case?


I have finetuned my model for my specific use case for identifying intent and entities in a user query. Here is my sample training data:

“messages”: [
“role”: “system”,
“content”: “You’ll receive user input. Your task is to classify each query into predefined intents and extract associated entities if applicable. Provide your output in JSON format, detailing the ‘intent’ and associated ‘entities’. Intents include: MOVIE_TICKETS, CINEMAS. Entities associated with each intent are as follows: MOVIE_TICKETS: [‘movie_name’], CINEMAS: [‘cinema_name’]”
“role”: “user”,
“content”: “Book movie ticket at PVR”
“role”: “assistant”,
“content”: “{intent: CINEMAS, entities: {‘cinema_name’: ‘PVR’}”

Since, my system prompt is very large, always sending this while requesting fine tuned model for results will be costly as tokens passed to the model will be more. Querying without system prompt leads to generic results which are not ideal for my use case. Is it possible to request fine tuned model without giving system prompt and only providing user query?
I understand Fine-tuning improves on few-shot learning but querying always with sytem prompt defeats the purpose of sending less tokens to the model.

1 Like

Welcome to the forum.

You can try fine-tuning with the system message and without and compare, but it won’t likely work as well. You’ll want to keep the system messages the same in the training data and your production prompt too.


Hi @PaulBellow ,

I wondered if you could expand on this a little further.

It seems that with fine tuning, we are promised a land of less tokens during inference time. However, if we must use the system prompt at inference to gain the best performance, where is the token gain?

Is it possible that because we “bake in” our task during fine tuning, we could possibly keep the system message fundamentals and lose the details during inference?

For example, if my FT model is supposed to output formatted JSON from some user inputted text:

  • During fine tuning we would include the phrase “You convert x to JSON…” as well as the schema
  • Then during inference we would only include “You convert x to JSON…” and leave out the schema

Would that work?
If you are not sure I am going to be trying this out anyway so I’ll report back here for others who may be curious!


I have a question brother, if I define context in the system while fine tuning, will it work if ask question and answer from given fine tuned context?