Not sure if it’s been discussed before either on Slack or in the forums, but I was wondering if there’s any interest at all in having GPT-3 retrained on data from 2021?
Personally, I don’t think most GPT-3 apps would benefit from the model being updated, but again, maybe there are apps I don’t know about that could use GPT-3 with an updated model of reality.
I talk about this idea more here and somewhere in the middle of my podcast episode:
… but really, I’m interested to hear feedback on what the community thinks. Maybe we could propose this to OpenAI if there is enough interest.
Hi there, as written in our documentation:
“We plan to add more continuous training in the future.”
I’m aware of this, but in the meantime, is there any plan or interest in updating the data until continuous training arrives? When was the last time the training data was updated at all? It was trained in the fall of 2019.
When more information is available on our plans for updating the data, we’ll be sure to update the documentation.
When you get to the point of releasing model updates, please allow users to pin to specific versions. We may spend months refining our prompts only to find they don’t perform as well on a newer model, which would make building a product on this API very difficult.
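To make the pinning request concrete, here is a minimal sketch of what it could look like on the client side. Everything in it is an assumption for illustration: the `name@version` identifier convention, the `PinnedPrompt` class, and the request dictionary are hypothetical and are not part of any published OpenAI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedPrompt:
    """A prompt bundled with the exact model version it was tuned against."""
    name: str
    model: str            # hypothetical "engine@version" identifier
    template: str
    temperature: float = 0.0
    max_tokens: int = 64


def build_request(prompt: PinnedPrompt, **fields: str) -> dict:
    """Build generic request parameters, refusing unpinned model names."""
    # Assumption: a pinned identifier carries an explicit "@version" suffix.
    if "@" not in prompt.model:
        raise ValueError(f"{prompt.name}: model {prompt.model!r} is not pinned")
    return {
        "model": prompt.model,
        "prompt": prompt.template.format(**fields),
        "temperature": prompt.temperature,
        "max_tokens": prompt.max_tokens,
    }


summarize = PinnedPrompt(
    name="email-summary",
    model="davinci@2020-06-01",   # hypothetical version string
    template="Summarize this email as bullet points:\n{email}\n",
)
request = build_request(summarize, email="Meeting moved to 3pm.")
```

Keeping the version next to the template means a prompt can only move to a newer model through a deliberate edit, not as a side effect of a silent upgrade.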
Yes, prompt regressions are an actual risk in this case.
Knowledge is not purely additive with neural nets (or humans, really). More information = more knowledge, yes, but also more misinformation, nuance, contradiction, confusion, etc. And until we have quadrillion+ parameter models, these networks involve some lossy compression. New knowledge may distort or erase older knowledge.
If you’ve designed a prompt on today’s model, you’ve done so based on the extent and limits of its knowledge as well as its unique configuration of weights. Your happy path through the network today may not be so happy after retraining. We actually observed this earlier in the beta. OpenAI released a refreshed model which I personally found was better for some of my prompts. But because other customers reported regressed results, they yanked it and it has yet to reappear. I’m sure they learned from that incident and have a plan to deal with versioning.
One example is related to what I’m doing here at wiserize. We use GPT-3 to generate product descriptions. If the model is not aware of the product, there is no way it can generate a factual response.
So, at least for factual answering, the model would need to be continuously retrained.
Imagine you’re summarizing emails into bullet points. Getting GPT-3 to perform a task accurately and consistently is an art. So you’ve formulated the perfect n-shot examples and tweaked the parameters until you’ve settled on something that is suitable for commercialization. Maybe you’ve even figured out a way to get it to work with curie or babbage. When a re-trained model comes along, there’s no guarantee that your perfectly designed prompt will still work as well as it once did, or even at all with the same model selection and parameters.
I’m definitely not arguing against retrained models. My uses of GPT-3 could also benefit from more current knowledge. But I do feel they need to provide a reasonable deprecation period on older versions to allow for testing and adjustment of productionized prompts.
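For what it's worth, the testing during a deprecation window could be as simple as replaying a fixed set of cases against both versions. A rough sketch, assuming each model is wrapped in a plain function from prompt to completion text; the stub models and the bullet-point check below are made up for illustration and do not call any real API.

```python
from typing import Callable, Sequence, Tuple

Completion = Callable[[str], str]
Case = Tuple[str, Callable[[str], bool]]   # (prompt, check on the output)


def pass_rate(complete: Completion, cases: Sequence[Case]) -> float:
    """Fraction of cases whose output satisfies its check."""
    passed = sum(1 for prompt, check in cases if check(complete(prompt)))
    return passed / len(cases)


def has_regressed(old: Completion, new: Completion,
                  cases: Sequence[Case], tolerance: float = 0.0) -> bool:
    """True if the new model passes fewer cases than the old one allows."""
    return pass_rate(new, cases) < pass_rate(old, cases) - tolerance


# Stub "models" standing in for the current and the retrained version.
def old_model(prompt: str) -> str:
    return "- point one\n- point two"


def new_model(prompt: str) -> str:
    return "Here is a summary: point one, point two"


cases = [("Summarize this email as bullet points: ...",
          lambda out: out.startswith("- "))]
regressed = has_regressed(old_model, new_model, cases)
```

With sampling enabled the outputs vary between runs, so in practice each case would probably need several samples and a non-zero `tolerance`.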
On that note, it’s fascinating how GPT-3’s brain is pre-pandemic.
I suppose you could cache in a database the answers you want to be fully deterministic, whereas for others you might want some flexibility, precisely so the model can react to new prompts. Imagine, for example, that an email needs to be summarized where a lot of the content is about the coronavirus, virus mutants, quarantines, and homeschooled kids. To continue with this example, I’ve been told by a native speaker that even the meaning of “home schooling” has shifted slightly since the pandemic (it now often means “distance learning”).
But yeah, it would be cool to be able to target specific time frames in the model. It might also help with combating racist or sexist answers. E.g., maybe I don’t want 1950s wisdom on gender.
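Here is roughly what I mean by caching, as a sketch using only the Python standard library. The `complete(model, prompt)` callable is a placeholder for whatever function actually calls the API, and the class and table names are my own invention.

```python
import hashlib
import sqlite3
from typing import Callable


class CompletionCache:
    """Return a stored answer for a (model, prompt) pair seen before."""

    def __init__(self, complete: Callable[[str, str], str],
                 path: str = ":memory:") -> None:
        self._complete = complete          # placeholder for the real API call
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS answers "
            "(key TEXT PRIMARY KEY, answer TEXT NOT NULL)"
        )

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # The model name is part of the key, so a retrained model
        # never serves answers cached from the old one.
        return hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()

    def get(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        row = self._db.execute(
            "SELECT answer FROM answers WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]                  # cache hit: no API call
        answer = self._complete(model, prompt)
        self._db.execute(
            "INSERT INTO answers (key, answer) VALUES (?, ?)", (key, answer)
        )
        self._db.commit()
        return answer


calls = []


def fake_complete(model: str, prompt: str) -> str:
    calls.append((model, prompt))
    return f"summary #{len(calls)}"


cache = CompletionCache(fake_complete)
first = cache.get("model-a", "Summarize: hello")
second = cache.get("model-a", "Summarize: hello")   # served from the cache
```

Prompts that should stay flexible would simply bypass the cache and go to the model every time.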
It would be interesting to see that. The training information would have to be classified first to establish the epoch ranges.
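As a toy illustration of that classification step, here is a sketch that groups documents into fixed-width time buckets. It assumes every document already comes with a reliable publication date, which is the hard part in practice; the function names and the ten-year default span are arbitrary choices of mine.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List, Tuple


def epoch_label(published: date, span_years: int = 10) -> str:
    """Label the fixed-width range of years that contains the date."""
    start = published.year - published.year % span_years
    return f"{start}-{start + span_years - 1}"


def bucket_by_epoch(docs: Iterable[Tuple[str, date]],
                    span_years: int = 10) -> Dict[str, List[str]]:
    """Group (text, publication date) pairs by their epoch label."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for text, published in docs:
        buckets[epoch_label(published, span_years)].append(text)
    return dict(buckets)


docs = [
    ("advice column", date(1955, 3, 1)),
    ("news article", date(2019, 10, 12)),
    ("blog post", date(2021, 1, 5)),
]
buckets = bucket_by_epoch(docs)
```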