First: I’m not an AI expert; I’m just a hobbyist who enjoys using AI to write and illustrate stories, so apologies if this is already a thing.
Like many of you, I was blown away by how much better DALL-E 2 was than its predecessor. I think one major reason is the ‘diffusion’ method. It’s the difference between writing a story without a backspace key and writing one over 100+ drafts.
Right now GPT-3 is also linear. It can help complete a thought, and the insert/edit features can even back-fill, but it’s still very much up to a human to do a final edit to ensure the whole piece makes sense.
Would it be possible to apply that same ‘diffusion’ logic to GPT? So, for example, it would write a 5-paragraph essay on some topic, but when it was done it would ‘blur and refine’, ‘blur and refine’ until the thing was polished to a shine. In the case of an image, ‘blur’ means adding Gaussian noise… I suppose for text it might mean rephrasing and restructuring lines, selecting words with slightly different meanings, etc.
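To make the idea concrete, here’s a toy sketch of what that loop might look like. This is purely illustrative, not how any real diffusion model works: `blur` masks out a fraction of the words (the text analogue of adding noise), and `refine` asks some infilling model to fill them back in. The `fill_fn` here is a hypothetical stand-in for a real model call.

```python
import random

def blur(tokens, noise_frac=0.3, rng=None):
    # 'Blur' step: replace a fraction of words with a [MASK] placeholder,
    # a rough text analogue of adding Gaussian noise to an image.
    rng = rng or random.Random(0)
    out = list(tokens)
    k = max(1, int(noise_frac * len(out)))
    for i in rng.sample(range(len(out)), k):
        out[i] = "[MASK]"
    return out

def refine(tokens, fill_fn):
    # 'Refine' step: fill each mask given the full surrounding context.
    # fill_fn stands in for a real infilling model (hypothetical here).
    return [fill_fn(tokens, i) if t == "[MASK]" else t
            for i, t in enumerate(tokens)]

def diffuse_edit(tokens, fill_fn, steps=5, rng=None):
    # Repeat blur -> refine with shrinking noise, so the whole passage
    # keeps getting revisited instead of being written once, left to right.
    rng = rng or random.Random(0)
    for step in range(steps):
        noisy = blur(tokens, noise_frac=0.3 * (1 - step / steps), rng=rng)
        tokens = refine(noisy, fill_fn)
    return tokens
```

The point of the structure is just that every pass can touch any part of the text, unlike a single left-to-right generation.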
This may be of particular importance for Codex, where something like renaming a variable requires simultaneous changes throughout the code.
Idk if this is GPT-4, or if ‘writing/completion’ and ‘editing’ are two inherently different models, but if it’s possible to make the same advances in text that DALL-E 2 made for images, OpenAI can take all my money.