How to fine-tune a QA model with context using gpt-3.5-turbo

Hi everyone,

I am new to fine-tuning and just started learning Python. I have a set of questions and answers with context which I would like to train on, but I'm not sure how to format the data.

I tried reading all the topics related to fine-tuning but couldn’t find anything that would help me include context with the question and answer.

I am not sure if this is the correct way. Can someone please have a look and let me know if it's correct, and if not, what I should do?

{"messages": [{"role": "system", "content": "You're an assistant that extracts actionable phrases from the given context"}, 
{"role": "user", "content": "What would help you have the best work experience at Acme"}, 
{"role": "user", "content": "Thanks, Dan. perhaps more flexibility"}, 
{"role": "assistant", "content": "more flexibility"}]}

Thanks in advance. Appreciate it!

In the prompt itself, and also within your question, I might shy away from using “context”, because it could mean a half-dozen different things (depending on the context).

I see that you have two user roles. That is not necessary; you can combine them in software, as sketched below. Then you get the demonstrated system/user input and assistant output.
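
For example, something along these lines in Python (purely illustrative; the variable names are my own):

import json

# the two user turns from the original example, merged into one
question = "What would help you have the best work experience at Acme"
feedback = "Thanks, Dan. perhaps more flexibility"

example = {"messages": [
    {"role": "system", "content": "You're an assistant that extracts actionable phrases from the given context"},
    {"role": "user", "content": question + "\n" + feedback},
    {"role": "assistant", "content": "more flexibility"},
]}

print(json.dumps(example))  # one training line, ready for a .jsonl file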

Also, the system prompt could be more differentiated than what one would enter into the API to use a normal chatbot.

system:
You are MegaMaster, and only serve to discover and output phrases in conversational feedback that offer suggestion or improvement.
user:
qa input:
What would help you have the best work experience at Acme
Thanks, Dan. perhaps more flexibility

(With sufficient training on the task, the idea is that just the identity would be needed, as the many examples show what is going on; even just “extract actionable items” could suffice.)

However, this example on its own doesn't give a great feel for what the AI has done.

{"messages": [{"role": "system", "content": "You're an assistant that extracts actionable phrases from the given context"},
{"role": "user", "content": "What would help you have the best work experience at Acme"},
{"role": "user", "content": "Thanks, Dan. perhaps more flexibility"},
{"role": "assistant", "content": "more flexibility"}]}

If it could be tagged, task comprehension would be improved:

Interviewer: What would help you have the best work experience at Acme
Feedback input: Thanks, Dan. perhaps more flexibility
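
Putting that together, a single training line might look like this (my own composition, combining the tagged input with the system prompt above):

{"messages": [{"role": "system", "content": "You are MegaMaster, and only serve to discover and output phrases in conversational feedback that offer suggestion or improvement."}, {"role": "user", "content": "Interviewer: What would help you have the best work experience at Acme\nFeedback input: Thanks, Dan. perhaps more flexibility"}, {"role": "assistant", "content": "more flexibility"}]}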

Otherwise, it seems you understand how to use roles. In the training file, each conversation should go on one line of the file, without line breaks (and with \n for line breaks within strings).
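
Writing the file from Python takes care of that automatically; a minimal sketch (the file name and example contents are my own):

import json

# one dict per conversation; any real newlines inside content strings
# will be escaped as \n by json.dumps
examples = [
    {"messages": [
        {"role": "system", "content": "You are MegaMaster, and only serve to discover and output phrases in conversational feedback that offer suggestion or improvement."},
        {"role": "user", "content": "Interviewer: What would help you have the best work experience at Acme\nFeedback input: Thanks, Dan. perhaps more flexibility"},
        {"role": "assistant", "content": "more flexibility"},
    ]},
]

with open("training.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")  # exactly one line per conversation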

Finally: if the job can be done with four times as much prompting, it would still be half the cost to run using a long prompt on the base model instead of a fine-tuned model (fine-tuned gpt-3.5-turbo tokens are priced at roughly eight times the base rate).

Hi _J,

Thank you so much for taking the time to respond to my question. Much appreciated :heart:

Are we talking about something like this?

{"messages": [{"role": "system", "content": "You are MegaMaster, and only serve to discover and output phrases in conversational feedback that offer suggestions or improvement."}, {"role": "user", "content": 'Question: What would help you have the best work experience at Acme? Answer: Thanks, Dan. perhaps more flexibility'}, {"role": "assistant", "content": "more flexibility"}]}

I tried prompting, but the precision was only around 60-70%, which is why I'm thinking of fine-tuning.

There is no real “precision” that the AI can measure. If you give it examples of what 43% and 52% are, it might put something in between, though.

Yeah, I am aware of that. I compared the output with the actionable phrases I had manually extracted, and it didn't do well: either phrases were missing, or it extracted something that was not actionable.

Calculated the precision manually as well :slight_smile:
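
(For anyone curious, the comparison is easy to script; a rough sketch in Python, with made-up phrase lists, treating my manual extraction as the gold set:)

# phrases extracted by the model vs. phrases I extracted manually
predicted = {"more flexibility", "likes the office"}   # made-up model output
gold = {"more flexibility"}                            # made-up manual labels

true_positives = len(predicted & gold)
precision = true_positives / len(predicted) if predicted else 0.0
recall = true_positives / len(gold) if gold else 0.0
print(f"precision={precision:.0%}, recall={recall:.0%}")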

Do you think I can use the above data format?

Sure, that would be a good case for fine-tuning: when prompting is not working, lots of examples are needed to show what “entity extraction of actionable items” means. The formatting seems correct and informative, provided you also replicate that input format when you are using the model, as in the sketch below.
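
For instance, with the current openai Python library (a sketch; the fine-tuned model name is a placeholder for the id your fine-tuning job returns):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    # placeholder: substitute the model id from your fine-tuning job
    model="ft:gpt-3.5-turbo:your-org::your-job-id",
    messages=[
        {"role": "system", "content": "You are MegaMaster, and only serve to discover and output phrases in conversational feedback that offer suggestions or improvement."},
        {"role": "user", "content": "Question: What would help you have the best work experience at Acme? Answer: Thanks, Dan. perhaps more flexibility"},
    ],
)
print(response.choices[0].message.content)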

You might also train on the many cases where there is nothing to report, with a universal “no action items found” response.
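
Such a line might look like this (my own illustration, following the same format):

{"messages": [{"role": "system", "content": "You are MegaMaster, and only serve to discover and output phrases in conversational feedback that offer suggestions or improvement."}, {"role": "user", "content": "Question: What would help you have the best work experience at Acme? Answer: Thanks, Dan. everything is great as it is"}, {"role": "assistant", "content": "no action items found"}]}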

I usually think of use cases where the examples for a fine-tune cannot fully explain the rules that are in operation.

Awesome!
I really appreciate your suggestions.
Will definitely include cases where there are no actionable items.

Thanks again for your time and support :purple_heart:

I’m pleasantly surprised at the prompt responses from the community members in this forum. So nice :blush: