When composing a fine-tuning dataset, would it be effective to use only keywords?

steve33 · November 28, 2023, 8:32am

Hi, guys. I have a question.

When composing a dataset for fine-tuning, can it recognize the user’s content using only keywords without expressing the entire question?

Let me give an example.

When wanting to train on user prompt data like ‘What is the singularity of artificial intelligence?’, using just the keyword ‘Artificial intelligence singularity’ in the user’s content.

Full

{“messages”:[{“role”:“user”,“content”:“What is the singularity of artificial intelligence?”}
,{“role”:“assistant”,“content”:"The singularity of artificial intelligence is… "}]}

Simplified

{“messages”:[{“role”:“user”,“content”:“Artificial intelligence singularity”},
{“role”:“assistant”,“content”:"The singularity of artificial intelligence is… "}]}

Can using such simplified data with only keywords for training still enable the model to respond as accurately as if it were trained with the full data?"

anon22939549 · November 28, 2023, 8:52am

Unless you or your users are only ever going to interact with the model using just keywords, this is likely a terrible idea.

You would essentially be trying to train out of the model its ability to ignore irrelevant and extraneous information.

steve33 · November 29, 2023, 1:03am

I thought it could be handled since I knew the first chunk was processed separately, but it seems that’s not the case. However, I will give it a try. Thank you!

steve33 · February 29, 2024, 4:21am

I thought it could be handled since I knew the first chunk was processed separately, but it seems that’s not the case. However, I will give it a try. Thank you!

Topic		Replies	Views
Fine tuning using a corpus API api	8	2052	July 13, 2023
Fine-Tuning with Non-Prompt/Completion Data: Seeking Advice for Direct Text-Based Training? API gpt-4 , chatgpt , fine-tuning , api	3	428	August 23, 2024
Can I fine tune without specifying an answer through the "assistant" role? API	6	1270	December 25, 2023
Can I use fine tuned model without system role prompt for my specific use case? API gpt-35-turbo , chatgpt , fine-tuning , api	3	1370	May 7, 2024
How to choose my fine tuning data? API fine-tuning , fine-tuning-problems	6	1128	January 2, 2024

When composing a fine-tuning dataset, would it be effective to use only keywords?

Related topics