This will be a specific post, but my intention is not only to improve my own results: I also want to post something others might learn from. The results I got were poor, but I am not writing off fine-tuning as a process; I am trying to learn and improve.
My intention: assign a one-word category to each article after it is posted to the website.
What I did: I used several of the cheaper GPT models to see how it would work, starting with Ada, then Babbage and Curie. Ada and Babbage were trained for 4 epochs, while Curie was trained for 16. I prepared 100 different real-life articles (subject, intro and the first 200 characters of the full text, all joined together) and used them as prompts. I put “Categorize this text” in front of each text, and that was my prompt in the JSONL file. The completion was just the name of the category, e.g. “World” or “Retail”.
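To make the setup concrete, here is a minimal sketch of how such a training file could be built. The article data and filename are hypothetical; the joining and prompt prefix follow what is described above.

```python
import json

# Hypothetical sample articles; the real run used 100 real-life articles.
articles = [
    {"subject": "Oil prices surge", "intro": "Markets react to supply cuts.",
     "body": "Crude oil futures rose sharply on Monday as producers announced...",
     "category": "World"},
    {"subject": "New store formats", "intro": "Chains experiment with layouts.",
     "body": "Several large retailers are testing smaller urban stores that...",
     "category": "Retail"},
]

# One JSON object per line: the legacy fine-tuning prompt/completion format.
with open("train.jsonl", "w") as f:
    for a in articles:
        # Subject + intro + first 200 characters of the full text, joined together.
        text = " ".join([a["subject"], a["intro"], a["body"][:200]])
        record = {
            "prompt": "Categorize this text: " + text,
            "completion": a["category"],
        }
        f.write(json.dumps(record) + "\n")
```

Each line of `train.jsonl` is then one independent training example.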
What was the result? Pretty useless, actually. All models returned more or less random text, such as comments on the article provided. Curie sometimes produced the right category, but only as part of a longer completion that did not make much sense. I may have done something wrong and will try to fix it, but some help would be welcome, and it might help others as well.
What am I thinking at the moment?
- The number of prompt/completion pairs was probably too low. I will try to increase it.
- Should I use a different prompt? Would it help to put “Categorize the following text into one of these categories (World, Retail, Marketing): article_text” in the prompt?
- Should I change the completion? Should I write “Category: category_name” instead of just the name of the category?
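The ideas above could be combined with two formatting conventions from OpenAI’s legacy fine-tuning guide: end every prompt with a fixed separator, and start every completion with a space and end it with a stop sequence, so the model learns where the label begins and ends. A sketch, with illustrative separator and stop values:

```python
import json

# Conventions from the legacy fine-tuning guide; the exact strings are a choice.
SEPARATOR = "\n\n###\n\n"   # fixed marker ending every prompt
STOP = "\n"                 # stop sequence ending every completion

def make_record(article_text: str, category: str) -> str:
    """Build one JSONL training line with instruction, labels, and separators."""
    return json.dumps({
        # Instruction plus the allowed labels, then the text, then the separator.
        "prompt": "Categorize the following text into one of these "
                  "categories (World, Retail, Marketing): "
                  + article_text + SEPARATOR,
        # Leading space helps tokenization; the stop sequence marks the end.
        "completion": " " + category + STOP,
    })

print(make_record("Crude oil futures rose sharply on Monday...", "World"))
```

At inference time the same separator would be appended to the prompt, and `stop=["\n"]` passed to the completion call, so the model emits only the category.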