It definitely makes sense, and I VERY much appreciate the offer. But for something like reviewing thousands of items of feedback on a product, I think a one-prompt approach is the way we would like to go, and then refine how it processes that data. This is a digital product, so some of the labels could be "UI", "Data", and "Performance", which are certainly not all mutually exclusive.
@curt.kennedy had an interesting approach of using a separate model for each label, but you'd quickly hit your TPM limit, since you're sending each of the thousands of items through a dozen individual models. So I question the viability of that approach as well. This is a POC, so I don't care whether we run all the results through davinci or GPT-3.5 Turbo, but accuracy and consistency are paramount at this stage.
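For concreteness, here's a minimal sketch of what I mean by the one-prompt approach; the label set, prompt wording, and model name are just placeholders, not a final design:

```python
import openai

LABELS = ["UI", "Data", "Performance"]  # placeholder label set from this thread

def classify(feedback: str) -> list[str]:
    """Ask for every applicable label in a single call; labels are not
    mutually exclusive, so we request a comma-separated subset."""
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,  # favor consistency over creativity
        messages=[
            {"role": "system",
             "content": f"You label product feedback. Reply ONLY with a "
                        f"comma-separated subset of: {', '.join(LABELS)}. "
                        f"Reply NONE if nothing applies."},
            {"role": "user", "content": feedback},
        ],
    )
    text = resp["choices"][0]["message"]["content"]
    # Keep only labels that are actually in our taxonomy.
    return [l.strip() for l in text.split(",") if l.strip() in LABELS]

print(classify("The dashboard takes 30 seconds to load a small report."))
# e.g. ['Performance']
```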
@gharezlak Could you please share your code snippet with us, if possible? I'm struggling with how to use what has been discussed here for fine-tuning multi-label classification. I don't want to train a different model per label.
You can also group labels by model. So put 5-10 categories on each model.
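A rough sketch of the idea, assuming a hypothetical 12-label taxonomy split into groups of 6, so each item costs 2 calls instead of 12 (the label names here are made up):

```python
# Each group of labels would get its own fine-tuned multi-class model,
# cutting the number of API calls per item by the group size.
LABELS = ["UI", "Data", "Performance", "Billing", "Docs", "Onboarding",
          "Search", "Export", "Mobile", "Security", "Pricing", "Other"]

def group(labels, size):
    return [labels[i:i + size] for i in range(0, len(labels), size)]

for i, g in enumerate(group(LABELS, 6)):
    print(f"model {i}: {g}")
```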
If you understand how the physics of your cell phone works: when the SNR is low, it transmits fewer bits per second (fewer label choices per model), and when the SNR is high, it transmits more bits per second (more choices per model).
A binary choice maps to BPSK (2 states) and a multi-class model maps to N-QAM, or N states.
So … if your model performs really well at accurately distinguishing 256 states, stick with that. If it has trouble, lower the number of states (and increase the SNR) until you are satisfied with the model's performance, maximizing throughput and minimizing throttling.
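In numbers, the mapping is just the standard log2 of the analogy above (nothing model-specific):

```python
import math

# A model that reliably separates N classes conveys log2(N) bits per call,
# just as N-QAM carries log2(N) bits per symbol (BPSK is N=2, i.e. 1 bit).
for n in (2, 4, 16, 256):
    print(f"{n:>3} states -> {math.log2(n):.0f} bits per call")
```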
Hi @curt.kennedy,
I have a question having followed this thread, and I would appreciate it if you could respond. My task is multi-label classification, so given your example, I don't have only one label as the completion. Does it work like this? `{"prompt": "Your company is awesome.\n\n###\n\n", "completion": " 1,0,1,0,1"}`
In the reference notebook, I don't know how to change the fine-tuning line, `!openai api fine_tunes.create -t "sport2_prepared_train.jsonl" -v "sport2_prepared_valid.jsonl" --compute_classification_metrics --classification_positive_class " baseball" -m ada`, since it acts like binary classification.
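For reference, here is a minimal sketch of the JSONL layout I'm describing, assuming the comma-separated label vector is a valid completion (which is exactly what I'm unsure about); the examples themselves are made up:

```python
import json

# Each line pairs a prompt with the whole label vector serialized as one
# string, with a leading space per the old fine-tuning conventions.
examples = [
    ("Your company is awesome.", [1, 0, 1, 0, 1]),
    ("The app crashes on export.", [0, 1, 0, 1, 0]),
]
with open("multilabel_prepared_train.jsonl", "w") as f:
    for prompt, labels in examples:
        f.write(json.dumps({
            "prompt": prompt + "\n\n###\n\n",
            "completion": " " + ",".join(map(str, labels)),
        }) + "\n")
```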