I calculated that overcoming the quadratic attention window problem to reach book-length generations would require a ~3000x increase in performance (see my post "A prospectus for long-form completions"). I can’t find good figures, but I think we get there sooner than 2040.
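To make the quadratic scaling concrete, here is a minimal sketch (not the original calculation, whose inputs aren't given here): if self-attention cost grows with the square of context length, the relative cost of a longer window is simply the ratio of lengths squared. The specific lengths below are assumptions for illustration — GPT-3's 2,048-token window versus a hypothetical ~112,000-token "book-length" window, which happens to land near the ~3000x figure.

```python
def attention_cost_ratio(old_len: int, new_len: int) -> float:
    """Relative self-attention compute for a new_len-token window
    versus an old_len-token window, assuming cost scales as n**2."""
    return (new_len / old_len) ** 2

# Assumed figures: 2,048 tokens (GPT-3) vs ~112,000 tokens (a novella).
ratio = attention_cost_ratio(2048, 112_000)
print(f"~{ratio:,.0f}x more attention compute")
```

A ~55x longer context therefore implies roughly 3000x more attention compute, which is why sub-quadratic attention variants (or much cheaper hardware) are needed before book-length generation is practical.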
Oh, I definitely believe we’ll get to that point before 2040. Considering the exponential growth in performance between GPT-2 and GPT-3, I anticipate that whatever the next iteration is will knock the socks off all of us. Certainly, there will still be issues with long-term recall, as neural networks have no way to retrieve episodic memories. But still, the larger the context you can give it, the more information it can integrate and work with in one shot.
Altman said in the interview that, contrary to popular belief, GPT-4 will not be any bigger than GPT-3 but will use more compute resources. This is an interesting announcement considering the vocal criticism of large language models and how they disproportionately affect both the environment and underrepresented communities. Altman said this would be achieved by working on all the different aspects of GPT, including data, algorithms, and fine-tuning.
He also said that the focus would be on getting the most out of smaller models. Conventional wisdom says that the more parameters a model has, the more complex the tasks it can achieve. But researchers have been increasingly speaking up about how the effectiveness of a model may not be as correlated with its size as believed. For instance, a group of researchers from Google recently published a study showing that a model smaller than GPT-3 — the fine-tuned language net (FLAN) — outperformed GPT-3 by a large margin on a number of challenging benchmarks.
So it seems like there’s some conflicting information. Maybe the Cerebras dude was spilling some secret beans or something. Anyways, it looks like some neat stuff is coming down the pipeline either way.
Thinking about this - I am wondering if the Cerebras WSE-2 allows for multiple instances of GPT-3 to run on it in parallel. Certainly, a large CPU could run containers, and the WSE-2 is much more efficient at a lot of tasks than other systems. So now I’m wondering if this partnership will actually mostly be used to drive down the operating costs of running GPT-3? Given how big the chip is, it could also be used to accelerate training and experimentation. For my use case, it would be far better if it made GPT-3 cheaper and faster. Right now, DAVINCI is prohibitively slow and expensive.
This is the best answer on this topic I’ve seen in a long time.
That might also allow the model to be updated on a more regular basis, which would be nice.
I’m really excited to read your book, @daveshapautomator. Thanks so much for sharing your insights on this topic!
I also strongly agree with your points. Since we only understand cognition in terms of our own intelligence, emotions, and experiences, even our most ambitious attempts to achieve AGI will only emulate what we already know. And as you said, there’s nothing wrong with that.
Now … I do fear for the day we somehow manage to create something that redefines what it means to “think”.
I added your book to my reading list – it looks very interesting.
Thanks both of you. Yeah, I’ve thought about what it might be like to create a machine that “thinks” vastly differently from us, but there are still poorly understood mechanisms within our own cognition. If you study things like the unconscious, archetypes, the inner critic, and theory of mind, it becomes abundantly clear that we can have multiple entities sharing our head. I wouldn’t be surprised if it comes out eventually that a significant proportion of people actually have multiple “personalities” that they are completely unaware of. Our minds are already quite exotic.
Great point! The randomness of human personalities adds a level of complexity that I can’t imagine even a “successful” AGI emulating – beyond simply mimicking a set of clearly defined personality traits. Individual personalities vary so much based on situation, environment, and genetics. No two personalities are exactly the same.
In a fantasy world where you have near unlimited computational power, if you simulate the conditions that lead to the evolution of philosophical sentience with evolutionary models capable of mutating any part of their structure, would that overcome the hurdle of true sentience having evolved through suffering?
You’ll run into the same problems we have today: how can you prove philosophical sentience? How can you prove that anything has qualia? Anyways, I think it would be a huge ethical violation to deliberately create something capable of suffering. Not to mention it would be a horrible idea.
Is creating something capable of suffering in this way ethically different from having a baby, or otherwise causing the creation of a biological organism?
Anti-natalists would say no, there’s no difference. However, I believe it’s a false comparison. We have the option of creating AI that is incapable of suffering, therefore creating a suffering AI would be a deliberate act that amounts to maliciousness. We have no such option to create new humans that cannot suffer.
Amazing! There is a hint quoted in the article about GPT-4 “evolving beyond text to our visual world”, which would be great news for our start-up. We’re building a product that eventually will use AI to generate films (with help from GANs) based on a producer’s ideas.
For now our MVP product just helps with writing scripts and screenplays, using text from GPT-3… but in the future we would love to generate full CGI/animation/deep-fake so you don’t even need to film on set. You wouldn’t even need real actors!
The future is going to be wild. 
Russell
CyberFilm.ai

I am setting up a jupyter demo on GCP for full body DeepFake.
If you want, I can add your email or contact so that when it is ready you can give it a try.
I think we can do natural-language-to-pose conversion to make virtual actors. There are lots of little things to be researched.
Absolutely! Yes please here is my email: russellp@outlook.com
Looking forward to it! I definitely think deep-fakes and CGI are going to replace acting and filming for the most part, like Josh Brolin playing Thanos but on steroids. Eventually you won’t even need Josh, just grab one of these “people” from: https://www.thispersondoesnotexist.com/
Or make your own “actor” likeness based on the character you’ve invented and any description you can think of.
The leverage of powerful, greedy, selfish, demanding, complaining Million Dollar actors may be coming to an end! Or at least movies will be a lot cheaper to make without their sky-high salaries. Probably will introduce a lot more diversity too, especially leading roles. You can cast your own movie with any type of person you want!
I covered how you could even adapt GPT-3 to create artificial personalities to have fully self-directed characters, allowing for fully natural (and unpredictable) narratives in my book. I predict that entertainment will be entirely unrecognizable within about 10 years. That means fully personalized media, where even the characters are custom tailored to the viewer’s preference.
That’s so cool! I’ll check out your book, thanks. Wow yes personalized media sounds incredible, imagine AI creating the perfect song for you… I’d pay good money for that! Or wow having movies where the ending changes each time, sort of like the butterfly effect giving the characters a certain amount of free will, maybe Michael Corleone decides the let Fredo live this time around haha.
Yes, I can totally imagine everyone on earth making their own personalized movies in 10 years, putting themselves and their friends in the starring roles even. I’d love to replace Russell Crowe with Russell Palmer (me) in one of my favs, Gladiator, maybe set in the Dune universe instead.
Like the Rick Dalton flashback scene in Once Upon a Time... in Hollywood (2019), where the actor always dreamed he could have been cast in The Great Escape (1963) — there’s a side-by-side comparison on YouTube.
Mostly I just want more Firefly