Is it time for a GPT-3 Training Data Refresh?

Not sure if it’s been discussed before either on Slack or in the forums, but I was wondering if there’s any interest at all in having GPT-3 retrained on data from 2021?

Personally, I don’t think most GPT-3 apps would benefit from the model being updated, but again, maybe there are apps I don’t know about that could use GPT-3 with an updated model of reality.

I talk about this idea more here and somewhere in the middle of my podcast episode:
https://bakztfuture.substack.com/p/new-podcast-episode-is-it-time-for

… but really, I’m interested to hear feedback on what the community thinks. Maybe we could propose this to OpenAI if there is enough interest.

Hi there, as written in our documentation:

“We plan to add more continuous training in the future.”

I’m aware of this, but in the meantime, is there any plan or interest in updating the data before continuous training arrives? When was the last time the training data was updated at all? The model was trained in the fall of 2019.

When more information is available on our plans for updating the data, we’ll be sure to update the documentation :smiley:

When you get to the point of releasing model updates, please allow users to pin to specific versions. We may spend months refining our prompts only to find they don’t perform as well on a newer model, which would make building a product on this API very difficult.
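
For example, something like this minimal sketch, assuming OpenAI exposed dated model snapshots (the `davinci-2021-08` name is hypothetical; only floating engine names like `davinci` exist today):

```python
import openai

openai.api_key = "YOUR_API_KEY"

# Hypothetical dated snapshot name: pinning would mean requesting this
# exact snapshot rather than a floating alias like "davinci" that could
# silently change underneath a shipped product.
PINNED_ENGINE = "davinci-2021-08"

response = openai.Completion.create(
    engine=PINNED_ENGINE,
    prompt="Summarize the following email as bullet points: ...",
    max_tokens=64,
    temperature=0.0,
)
print(response.choices[0].text)
```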

Yes, prompt regressions are a real risk here.

Knowledge is not purely additive with neural nets (or humans, really). More information = more knowledge, yes, but also more misinformation, nuance, contradiction, confusion, etc. And until we have quadrillion+ parameter models, these networks involve some lossy compression. New knowledge may distort or erase older knowledge.

If you’ve designed a prompt on today’s model, you’ve done so based on the extent and limits of its knowledge as well as its unique configuration of weights. Your happy path through the network today may not be so happy after retraining. We actually observed this earlier in the beta. OpenAI released a refreshed model which I personally found was better for some of my prompts. But because other customers reported regressed results, they yanked it and it has yet to reappear. I’m sure they learned from that incident and have a plan to deal with versioning.

One example is related to what I’m doing here at wiserize. We use GPT-3 to generate product descriptions. If the model is not aware of the product, there is no way it can generate a factual response.

So, at least for factual answering, it will require a model that is continuously retrained.

Imagine you’re summarizing emails into bullet points. Getting GPT-3 to perform a task accurately and consistently is an art. So you’ve formulated the perfect n-shot examples and tweaked the parameters until you’ve settled on something that is suitable for commercialization. Maybe you’ve even figured out a way to get it to work with curie or babbage. When a re-trained model comes along, there’s no guarantee that your perfectly designed prompt will still work as well as it once did, or even at all with the same model selection and parameters.
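
To make that concrete, here is a minimal sketch of what such a tuned prompt might look like (the example emails, parameter values, and stop sequence are all illustrative; yours would be the product of that tuning):

```python
import openai

openai.api_key = "YOUR_API_KEY"

# A 2-shot prompt: the examples, separators, and stop sequence were all
# tuned against a specific model. A retrained model may respond
# differently to the exact same text.
prompt = """Summarize each email as bullet points.

Email: Hi team, the launch moved from Tuesday to Thursday because QA found a blocker. Please update your calendars.
Summary:
- Launch moved from Tuesday to Thursday
- QA found a blocking bug

Email: Reminder that expense reports for Q2 are due Friday. Late reports go into the Q3 batch.
Summary:
- Q2 expense reports due Friday
- Late reports processed in Q3

Email: The client asked to add SSO support before renewal. Engineering estimates three weeks. Sales wants a call Monday to discuss pricing.
Summary:"""

response = openai.Completion.create(
    engine="curie",     # maybe you even got this working on a cheaper model
    prompt=prompt,
    max_tokens=80,
    temperature=0.2,    # tuned for consistency on today's model
    stop=["\nEmail:"],  # tuned stop sequence
)
print(response.choices[0].text.strip())
```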

I’m definitely not arguing against retrained models. My uses of GPT-3 could also benefit from more current knowledge. But I do feel they need to provide a reasonable deprecation period on older versions to allow for testing and adjustment of productionized prompts.

On that note, it’s fascinating how GPT-3’s brain is pre-pandemic.

I suppose you could database-cache answers you want to be fully deterministic, while keeping some flexibility elsewhere precisely so you can react to new prompts. Imagine, for example, an email needs to be summarized where a lot of the content is about the coronavirus, virus mutants, quarantines, and home-schooled kids. To continue with this example, I’ve been told by a native speaker that even the meaning of “home schooling” has slightly changed since the pandemic (now often meaning “distance learning”).
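
A minimal sketch of that caching idea, assuming you key the cache on a hash of the engine, prompt, and sampling parameters (all names here are illustrative):

```python
import hashlib
import json
import sqlite3

import openai

openai.api_key = "YOUR_API_KEY"

db = sqlite3.connect("completions_cache.db")
db.execute("CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, completion TEXT)")

def cached_completion(prompt, engine="davinci", temperature=0.0, max_tokens=64):
    """Serve repeat queries from the cache so their answers stay fully deterministic."""
    key = hashlib.sha256(
        json.dumps([engine, prompt, temperature, max_tokens]).encode()
    ).hexdigest()
    row = db.execute("SELECT completion FROM cache WHERE key = ?", (key,)).fetchone()
    if row:
        return row[0]  # cache hit: no model call, same answer every time
    response = openai.Completion.create(
        engine=engine, prompt=prompt, temperature=temperature, max_tokens=max_tokens
    )
    text = response.choices[0].text
    db.execute("INSERT INTO cache (key, completion) VALUES (?, ?)", (key, text))
    db.commit()
    return text
```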

But yeah, it would be cool to be able to target specific time frames in the model. It might also help with combating racist or sexist answers. E.g. maybe I don’t want 1950s wisdom on gender :slightly_smiling_face:

It would be interesting to see that. The training information would have to be classified first to establish the epoch ranges. :open_mouth:

Hi all,
I am new to GPT-3. Initially I thought it was a data refresh issue, because when I ask:
Q: who is the president of USA?
A: The current president of the United States is Donald Trump.

but at other times the answer is: Joe Biden

Is this a data refresh issue or some other settings issue? I would appreciate advice.

Thanks,
FZS

You need to use a database in conjunction with GPT-3 if you want to ensure accuracy. It does not guarantee accuracy out of the box.
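
A minimal sketch of that pattern, assuming you keep verified facts in your own store and inject them into the prompt (the table, topic key, and fact text are illustrative):

```python
import sqlite3

import openai

openai.api_key = "YOUR_API_KEY"

facts = sqlite3.connect("facts.db")
facts.execute("CREATE TABLE IF NOT EXISTS facts (topic TEXT PRIMARY KEY, fact TEXT)")
facts.execute(
    "INSERT OR REPLACE INTO facts VALUES (?, ?)",
    ("us_president", "The current president of the United States is Joe Biden."),
)
facts.commit()

def grounded_answer(topic, question):
    """Look up a verified fact and hand it to the model as context,
    rather than relying on the model's (possibly stale) training data."""
    row = facts.execute("SELECT fact FROM facts WHERE topic = ?", (topic,)).fetchone()
    context = row[0] if row else ""
    prompt = f"Answer using only the context.\n\nContext: {context}\nQ: {question}\nA:"
    response = openai.Completion.create(
        engine="davinci", prompt=prompt, max_tokens=32, temperature=0.0
    )
    return response.choices[0].text.strip()

print(grounded_answer("us_president", "Who is the president of the USA?"))
```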

Hello,
Does anyone know if they are waiting until they have figured out a system to continuously refresh the training data, or if they plan to retrain on new data before that?
I’m asking because ChatGPT suggests outdated libraries, etc.

That system is “pretrain a new model on a massive amount of data”.

This is an old topic. Compare its start date and you’ll see you are joining a conversation that began even before text-davinci-001.

Current models are not just pretrained, a process taking months of computation; they also carry a massive investment in tuning, all of it elaborating on how that base model works, in order to turn it into a safe and intelligent product. It is thus no longer simply “train another AI and release it for experimentation to a small niche of developers”, as it might have been over two years ago.