My wishlist for 2022-23

With the imminent release of my next book, I am shifting back to R&D mode. As such, I am reminded of the current technological limitations of GPT3. So here’s my wishlist for the coming year. @boris and @luke, please take note! I understand that OpenAI tends to operate in secrecy and then drop huge advancements, so I don’t expect any confirmation.


My number 1 wish is that GPT3 could become cheaper. DAVINCI is powerful enough to do most of what I need, but it is prohibitively expensive. The latest TEXT-DAVINCI-001 is fast enough to be useful, so that previous wish of speeding up GPT3 has been granted. Now if we can just make it a little bit cheaper. I assume that GPT3 will become cheaper over time as hardware improves and as more efficient distillations are developed.

My latest instance of RAVEN runs on TEXT-CURIE-001 and only cost $1.50 to run for a couple of hours. That price point is getting within reason for something that is meant to run 24/7. Unfortunately, DAVINCI is 10x the cost. Finetuning is also still prohibitively expensive, even on CURIE.

Since RAVEN operates on loops, I have to slow it way down right now (30 seconds between cycles) to save costs. Over time, as costs come down, RAVEN can “think” faster and faster.
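To make the trade-off between loop speed and cost concrete, here is a rough sketch in Python. It is not RAVEN's actual code: `step`, `run_loop`, and `estimated_cost_per_hour` are hypothetical names, and the per-cycle cost is back-calculated from the $1.50 figure above, assuming "a couple hours" means two hours and one cycle every 30 seconds (ignoring the time the API calls themselves take).

```python
import time


def estimated_cost_per_hour(cost_per_cycle: float, seconds_between_cycles: float) -> float:
    """Rough hourly cost in dollars if every cycle costs `cost_per_cycle`."""
    cycles_per_hour = 3600.0 / seconds_between_cycles
    return cost_per_cycle * cycles_per_hour


def run_loop(step, seconds_between_cycles: float, max_cycles: int, sleep=time.sleep) -> int:
    """Call `step()` repeatedly, pausing between cycles to cap spending.

    `step` is a hypothetical callable standing in for one "thought" cycle.
    `sleep` is injectable so the loop can be exercised without real waiting.
    """
    completed = 0
    for _ in range(max_cycles):
        step()
        completed += 1
        sleep(seconds_between_cycles)
    return completed


# Assumption: $1.50 over 2 hours at one cycle per 30 seconds
# is 240 cycles, or about $0.00625 per cycle.
ASSUMED_COST_PER_CYCLE = 1.50 / 240
```

Under those assumptions, `estimated_cost_per_hour(ASSUMED_COST_PER_CYCLE, 30)` comes out to about $0.75 per hour, and halving the delay to 15 seconds doubles it. That is why cheaper tokens translate directly into faster "thinking."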

I think that making GPT3 cheaper would trigger the Jevons paradox, whereby total usage and demand go up as something becomes cheaper and more efficient. In essence, I think it would be a great business decision to lower the cost of GPT3.

Faster finetune model loading

For realtime applications, the loading delay of finetuned models makes them unusable. It’s as simple as that. There seems to be little rhyme or reason as to how long it takes a finetuned model to load, and then it also seems to be unloaded at random. Maybe this has been improved since finetuning was first released, but it’s still no good. I need an API to be available instantly and not run the risk of timing out while a model is loaded.
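For now, the only thing I can do on my side is retry. Here is a minimal sketch of retrying with exponential backoff, assuming the client surfaces a slow model load as a `TimeoutError`. That is an assumption: a real client library may raise a different exception type, and `request` is a hypothetical zero-argument wrapper around whatever completion call is being made. This only papers over the delay; it does not make the model load any faster.

```python
import time


def call_with_retry(request, max_attempts: int = 5, base_delay: float = 1.0, sleep=time.sleep):
    """Call `request()` and retry on TimeoutError with exponential backoff.

    `request` is a hypothetical callable wrapping the API call.
    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except TimeoutError:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

The obvious downside is that a realtime application still stalls for the whole retry window, which is exactly the problem described above.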

This will be a necessary step before I can implement neural memories (a hypothetical process of compressing AGI memories into the model for instant recall). Faster finetune models would also be critical for offloading some of the cognitive functions, such as speech synthesis. But if RAVEN goes hours between speaking, I have to assume that the finetune model will have been unloaded, so when RAVEN needs to say something, the first few lines of dialog might time out.

I have a number of cognitive functions that would benefit from finetuning, such as my recent Core Objective Function experiment, but as it stands right now I can only use these models in theory, not in practice.

In general, the faster the API spits out results, the better. All the current models produce results many times faster than any human can read or even think, so in that respect, GPT3 is already superhuman. Compounding returns will mean that AGI based on GPT technology will just be that much more powerful than humans.

Bigger windows

The 2048 token limit is one of my biggest constraints with my AGI research. For instance, I can easily fetch hundreds or thousands of memories, but I then also need to summarize and condense them so that they can fit into a single window.
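As a rough illustration of the fitting problem, here is a sketch that packs memories into a fixed budget and hands back whatever did not fit, so it can be summarized separately. The names `estimate_tokens` and `pack_memories` are hypothetical, and the token count is only an approximation based on the rule of thumb of roughly 4 characters per token for English text; a real tokenizer would give exact counts.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an approximation)."""
    return max(1, len(text) // 4)


def pack_memories(memories, budget_tokens: int):
    """Keep memories in order until the budget is spent.

    Returns (kept, leftover). `leftover` holds the memories that did not
    fit and would still need to be summarized or condensed.
    """
    kept = []
    used = 0
    for i, memory in enumerate(memories):
        cost = estimate_tokens(memory)
        if used + cost > budget_tokens:
            return kept, list(memories[i:])
        kept.append(memory)
        used += cost
    return kept, []
```

In practice the budget has to be smaller than the full window, since the prompt instructions and the completion share the same limit.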

Maybe GPT-4 is around the corner, or an EINSTEIN engine? In an ideal world, EINSTEIN would cost $0.000001 per 1000 tokens, run as fast as ADA, and have a 100,000 token window size.

My cognitive architecture works recursively, as a series of infinite loops that all interact with the same data. This means that RAVEN can “think” endlessly, mulling over past actions, planning for the future, and contemplating the Core Objective Functions. The window limit is probably my biggest constraint on how “smart” RAVEN can be. What I am having to do right now is break each task down so that it fits into a few paragraphs, and then accumulate these as sequential memories. Perhaps there’s nothing wrong with this, but it does make it harder to design a cognitive architecture around a 2048 token limit.

A “Persona” endpoint

This is a big idea I had that would make chatbots way easier. It would basically be a type of finetuning endpoint, but one formatted specifically for adopting particular personalities. The training data would ideally be a combination of chat logs and/or descriptions of a character. For instance, GPT3 already knows a lot about famous characters like Sherlock Holmes and Spider-Man, so it can easily adopt those personalities. For the sake of videogames, fiction writing, screenplays, and other interactive tools, it would be awesome to have a dedicated persona endpoint that allows us to construct personalities for chatbots, characters, and AGI over time.

For instance, I am currently developing a RAVEN “persona” (similar to a human ego and superego) and figuring out how to finetune that into GPT3.


Working on a few of these items, thank you for the write up!


AWESOME thanks so much for the confirmation and hard work!!! Looking forward to the results!


If davinci specifically were cheaper, especially with the new 4000 token limit, that would be absolutely amazing.