I’ve been working on a regression task in my research and wanted to see whether I could get GPT-3 to perform regression. I think I found a pretty decent way. My particular task was to assign a “difficulty” to text from a bio class, so GPT-3 was trying to predict the number of students who correctly answer a question.
Here’s the general process I followed:
- Scale your data to a min of 0 and a max of 1. You may want to pad the range if you expect future values outside the training data range, because these are hard maximums and minimums.
- Set N thresholds between 0 and 1 that split the data into chunks. I used .9, .8, .7, .6, and .5 because I didn’t have many low values.
- Create N copies of the training data.
- Loop through each copy of the data, using threshold i for copy i. If the regression value for observation j is greater than threshold i, assign it to the category “high.” If it is less than threshold i, assign it to the category “low.”
- Format the data using the standard fine-tuning practices for classification tasks.
- Fine-tune on the training data.
- Feed your validation/test data through the model.
- Extract the log probability for “Low” (the API reports token log probabilities, not log odds).
- Convert to probabilities. Since in practice there should only be two potentially predicted words, “High” or “Low,” this is trivial.
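- Un-scale your predictions by multiplying by the max value (this assumes the original minimum was 0; otherwise invert whatever scaling you used in the first step).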
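
The data-preparation and probability steps above can be sketched roughly as follows. This is only a minimal illustration, not the exact code I ran: the function names (`min_max_scale`, `make_threshold_copies`, `prob_low`) and the sample numbers are made up, ties at a threshold are arbitrarily labelled “low,” and the two log probabilities are assumed to come from the model's token logprobs. It leaves out the fine-tuning call itself and how the per-threshold probabilities get combined into one estimate.

```python
import math


def min_max_scale(values, lo=None, hi=None):
    """Scale values to [0, 1]. Pass lo/hi to pad beyond the observed range."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    return [(v - lo) / (hi - lo) for v in values]


def make_threshold_copies(texts, scaled, thresholds):
    """Build one binary-labelled copy of the data per threshold."""
    copies = {}
    for t in thresholds:
        # Assumption: a value exactly equal to the threshold is labelled "low".
        copies[t] = [
            (text, "high" if y > t else "low") for text, y in zip(texts, scaled)
        ]
    return copies


def prob_low(logprob_low, logprob_high):
    """Renormalise the two token log probabilities into P("low")."""
    p_low = math.exp(logprob_low)
    p_high = math.exp(logprob_high)
    return p_low / (p_low + p_high)


# Hypothetical data: number of students answering each question correctly.
texts = ["question 1", "question 2", "question 3", "question 4"]
counts = [10, 25, 40, 50]

scaled = min_max_scale(counts, lo=0, hi=50)
copies = make_threshold_copies(texts, scaled, [0.9, 0.8, 0.7, 0.6, 0.5])

for threshold, rows in copies.items():
    print(threshold, [label for _, label in rows])

# Hypothetical log probabilities for the two labels at one threshold.
print(prob_low(math.log(0.3), math.log(0.6)))
```

If only the log probability for “Low” is available, `math.exp` of that value is already the probability of “Low,” and the probability of “High” is approximately one minus that, given that almost no mass goes to other tokens.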
This worked surprisingly well for me. I got an R^2 of 0.75 on the training data and 0.5 on the test data.