Fine-tuning Announcement & Davinci Beta

Hi everyone!

As you may have seen, we publicly announced fine-tuning today: Customizing GPT-3 for Your Application.

A couple of updates:

  • Now that we’ve introduced pricing for training, we are lifting the file and training limits. You are free to use any file size and train as much as needed!
  • As of today, we’re offering fine-tuned Davinci as a beta feature.

We’re excited to see what you build!
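
For anyone trying the beta, here is a minimal sketch of kicking off a Davinci fine-tune with the openai Python package as it existed at the time (pre-v1 client); the API key and file name are placeholders:

```python
import openai  # pre-v1 openai-python client

openai.api_key = "sk-..."  # placeholder; use your own API key

# Upload a JSONL training file of {"prompt": ..., "completion": ...} pairs.
upload = openai.File.create(
    file=open("train_data.jsonl", "rb"),  # placeholder file name
    purpose="fine-tune",
)

# Start a fine-tune against the Davinci base model (the new beta).
fine_tune = openai.FineTune.create(
    training_file=upload["id"],
    model="davinci",
)
print(fine_tune["id"])  # poll this job id to track training progress
```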

8 Likes

Hi @luke
From the article, I understood that it is possible to do incremental fine-tunes. Is that correct? So after tuning a model, can we upload a file and train again, “adding” to that already tuned model?
Or did I get it wrong?

You can use an existing dataset of virtually any shape and size, or incrementally add data based on user feedback. With fine-tuning, one API customer was able to increase correct outputs from 83% to 95%. By adding new data from their product each week, another reduced error rates by 50%.

1 Like

Hi there, incremental fine-tuning is not currently available, but we’re exploring adding it in the new year. The customer referenced in the article trained new models with new data.
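
In other words, the current workaround is to retrain from the base model on the combined old and new data. A rough sketch of that loop, assuming prompt/completion JSONL files (the paths are placeholders):

```python
import json
import openai

# Incremental fine-tuning isn't supported, so merge previously used
# examples with newly collected ones and train a fresh model from base.
examples = []
for path in ("old_train.jsonl", "new_feedback.jsonl"):  # placeholder paths
    with open(path) as f:
        examples.extend(json.loads(line) for line in f if line.strip())

with open("combined_train.jsonl", "w") as f:
    for ex in examples:  # each ex is a {"prompt": ..., "completion": ...} pair
        f.write(json.dumps(ex) + "\n")

upload = openai.File.create(file=open("combined_train.jsonl", "rb"), purpose="fine-tune")
openai.FineTune.create(training_file=upload["id"], model="davinci")
```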

2 Likes

Hi @luke
When will davinci-instruct be available for fine-tuning?

1 Like

We’re exploring this. What use case do you have in mind? Or, put another way, why would you prefer it to fine-tuning the base Davinci model?

2 Likes

Not the OP, but I can think of one good reason:
to train the model on a specific API/library (or even a custom-made one it has no knowledge of).

It would probably also be helpful to have it trained on specific code generation in order to save on prompt size. With code, even a few-shot prompt can eat up a lot of tokens quite fast.
Edit: sorry, I misunderstood; I thought this was about davinci-codex.
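
To illustrate the token point with the era's Python client: few-shot examples are re-sent on every request, whereas a fine-tuned model needs only the new input. A sketch (the fine-tuned model name is a made-up placeholder):

```python
import openai

# Few-shot prompting: the demonstrations ride along on every request,
# so prompt tokens grow with each example you include.
few_shot_prompt = (
    "Turn the description into a Python call.\n\n"
    "Description: parse a JSON string\nCall: json.loads(text)\n\n"
    "Description: open a file for reading\nCall: open(path)\n\n"
    "Description: fetch a URL\nCall:"
)
openai.Completion.create(engine="davinci", prompt=few_shot_prompt, max_tokens=20)

# A model fine-tuned on the same pairs only needs the new description,
# saving the repeated example tokens on every call.
openai.Completion.create(
    model="davinci:ft-your-org-2021-12-14",  # placeholder fine-tuned model name
    prompt="Description: fetch a URL\nCall:",
    max_tokens=20,
)
```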

1 Like

Fine-tuned Davinci performs better than fine-tuned Curie on accuracy, and Davinci is superior to Curie. Since Davinci Instruct is superior to Davinci, wouldn’t a fine-tuned Davinci Instruct perform better than a fine-tuned Davinci?

2 Likes

Hi,
We tested both a fine-tuned Davinci base model and Davinci Instruct, and the results from the latter were better on our data. However, we need it to answer questions only from our own corpus, and in the voice of our product. This is why we thought of fine-tuning Davinci Instruct. Is there any other way to achieve this?

1 Like