I’m aware of this, but in the meantime is there any plan/interest to update the data until continuous training arrives. When was the last time the training data was updated at all? It was trained fall of 2019
When you get to the point of releasing model updates, please allow users to pin to specific versions. We may spend months refining our prompts only to find they don’t perform as well on a newer model, which would make building a product on this API very difficult.
Knowledge is not purely additive with neural nets (or humans, really). More information = more knowledge, yes, but also more misinformation, nuance, contradiction, confusion, etc. And until we have quadrillion+ parameter models, these networks involves some lossy compression. New knowledge may distort or erase older knowledge.
If you’ve designed a prompt on today’s model, you’ve done so based on the extent and limits of its knowledge as well as its unique configuration of weights. Your happy path through the network today may not be so happy after retraining. We actually observed this earlier in the beta. OpenAI released a refreshed model which I personally found was better for some of my prompts. But because other customers reported regressed results, they yanked it and it has yet to reappear. I’m sure they learned from that incident and have a plan to deal with versioning.
Imagine you’re summarizing emails into bullet points. Getting GPT-3 to perform a task accurately and consistently is an art. So you’ve formulated the perfect n-shot examples and tweaked the parameters until you’ve settled on something that is suitable for commercialization. Maybe you’ve even figured out a way to get it to work with curie or babbage. When a re-trained model comes along, there’s no guarantee that your perfectly designed prompt will still work as well as it once did, or even at all with the same model selection and parameters.
I’m definitely not arguing against retrained models. My uses of GPT-3 could also benefit from more current knowledge. But I do feel they need to provide a reasonable deprecation period on older versions to allow for testing and adjustment of productionized prompts.
I suppose you could database-cache answers you want to be fully deterministic, whereas you might want some flexibility in others precisely to react to new prompts. Imagine, for example, an email needs to be summarized where a lot of the content is about the coronavirus, virus mutants, quarantines, and home schooled kids. To continue on this example, I’ve been told by a native speaker even the meaning of “home schooling” slightly changed since the pandemic (now often meaning “distance learning”).
But yeah, it would be cool to be able to target specific time frames in the model. It might also help with combating racist or sexist answers. E.g. maybe I don’t want 1950s wisdom on gender