Advice needed for JEL code prediction fine-tuning task

GarauGarau · May 8, 2023, 5:50pm

Hi,

I’m currently working on a project where I’m trying to predict JEL codes (Journal of Economic Literature classification codes) based on a list of keywords. I’m using OpenAI’s language models for fine-tuning, specifically Babbage, Carl, and Davinci.

However, I’ve noticed that the results of each model can vary significantly, with surprising results such as Davinci being the worst performer. I’m seeking advice on the design of the prompt and associated completions to optimize the performance of the language model.

I’m considering two possible prompt formats and would like to know which one would be the most suitable for my task.

Prompt format 1:

Keyword 1: Australian dollar Keyword 2: Common currency Keyword 3: Monetary policy Keyword 4: New Zealand

Prompt format 2:

Keywords: DSGE model; Bayesian estimation; Time-varying risk premia; Monetary policy

Result:

If anyone has any advice on which prompt format would be more effective for JEL code prediction and why, I would greatly appreciate it. Additionally, if anyone has any resources that could help me better understand and study fine-tuning language models for this type of task, it would be very helpful.

Thank you in advance for your help

PaulBellow · May 8, 2023, 6:39pm

Welcome to our dev community!

Have you tried just using a two-shot with one of the newer models like GPT-3.5? You might not even need to fine-tune if you just start each prompt with two or three examples.

If you want to continue down the fine-tuning path, I’ll let someone with more experience step up. I fine-tuned a couple years ago right after it was made available, but the results and cost had me back to using the newer, better models.

Something like classification should be able to be done with a good prompt and 2 or 3 examples…maybe?

GarauGarau · May 9, 2023, 6:37am

Thank you for your response and welcome.

Currently I can only use gpt 3.5 and 4 via chatGPT and not with my API (even from playground I don’t see these models). Am I doing something wrong?

In any case my problem is more leagated to the fact that I have a classification problem both multi label and a multi class and I was looking for some information on how to structure prompts and completions just on cases similar to mine.

Topic		Replies	Views
Issues with Fine-Tuned Babbage-002 Model Returning Incorrect Completions Prompting gpt-4 , chatgpt	13	1770	December 21, 2023
Help with fine-tuning for text categorization API	4	1293	December 16, 2023
Use OpenAPI for supervised classification task API	4	1877	March 26, 2023
Fine tuning Davinci01 or prompting Davinci03 API	3	708	December 31, 2022
GPT3 Finetuning for Multilabel Classification API	26	9266	October 17, 2024

Advice needed for JEL code prediction fine-tuning task

Related topics