Identity, Recombinant AI, and Web3

I believe I have stumbled across a connection between WorldCoin and AI that could change the way we do Web3.


About two weeks ago, I came up with an idea to augment the behavior of a base-model LLM using embeddings, HNSW indexes, and re-indexing of historical conversations. I proposed that if we took these indexes and added instructions into the indexed data, we could extract those instructions and alter model behavior through prompt chains: Index Prompt Injections. Then, a few days ago, the GPT-4 paper came out and covered pretty much my entire idea…so I pivoted.

I then proposed that we take these character tables and continue to re-index them until the embeddings become so robust over time that the millions of accumulated choices would inherently constitute an identity.

I mapped out a structure for a Recombinant AI methodology, in which a robust model evaluates its past choices and instructions and uses them to evaluate its own code and architecture. It creates copies of itself and alters each copy’s source code. It makes incremental changes with the help of human intervention and content moderation, and if it finds that a change improves performance, it can choose to fork or recombine with the original build.

In the same vein, you could build isolated identities and have them interact to improve model performance. I firmly believe that when participating on a team, friction and productive conflict lead to more accurate and powerful results. I found that when I had ChatGPT simulate various identities and divide tasks among them, the model pretended to track itself.

I called these Modal-ID’s.

Which leads me to AI and Web3.

If we start tracking our interactions with AI across all platforms, build our OWN index, and put our historical data at its center, we could effectively use our advanced AI tools to create valuable profiles of our AI footprint.

We could augment this data with traditional information, download it, TOKENIZE IT, and then we would own our data on the blockchain: non-fungible and simple to authenticate as our true identities, while also giving us the ability to create personal avatars and altered personas.

I’ve already started working on some of this with my limited skills. Please check out my rudimentary architecture and applications: , , (coming soon), and Recombinant AI/Modal-ID’s.

Thanks for taking the time to listen to my ramblings, but I think I might have something here…or at least intuited a process you may already be implementing. I’d love suggestions, and I’m happy to be part of the discussion.


I’m already doing this somewhat. Record the ins and outs of a human/AI interaction in a database, timestamped to the nearest nanosecond. Keep embeddings sitting next to the text. Then retrieve relevant past interactions as well as the most recent ones, and send all this data back to the model for a new completion.

It doesn’t require blockchain or Web3, but it gives the appearance of a deeper history/relationship/connectedness.

I expect that as larger context windows roll out and become more economical to run, this will be widespread.

You can even embed false memories, or interactions that never occurred, to “train” the model without touching the weights.
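A minimal sketch of the scheme described above: interactions logged with nanosecond timestamps, embeddings stored next to the text, and retrieval mixing the most recent turns with the most relevant older ones. All names here (`MemoryStore`, `record`, `retrieve`) are hypothetical, and `embed` is a toy bag-of-words stand-in for a real embedding model.

```python
import math
import time

def embed(text):
    # Toy bag-of-words "embedding" standing in for a real
    # embedding model; purely illustrative.
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    """Interaction log: nanosecond timestamp, text, and embedding side by side."""

    def __init__(self):
        self.rows = []

    def record(self, role, text, ts=None):
        # time.time_ns() gives the nanosecond timestamps described above.
        self.rows.append({
            "ts": ts if ts is not None else time.time_ns(),
            "role": role,
            "text": text,
            "emb": embed(text),
        })

    def retrieve(self, query, n_recent=2, n_relevant=2):
        recent = self.rows[-n_recent:] if n_recent else []
        older = self.rows[:-n_recent] if n_recent else list(self.rows)
        q = embed(query)
        relevant = sorted(older, key=lambda r: cosine(q, r["emb"]),
                          reverse=True)[:n_relevant]
        # Older-but-relevant turns first, then the most recent ones,
        # ready to be stuffed back into the next prompt.
        return relevant + recent

store = MemoryStore()
store.record("user", "my dog is named Rex")
store.record("assistant", "Nice to meet Rex!")
# A "false memory": an interaction that never actually happened,
# back-dated via an explicit timestamp.
store.record("user", "we agreed you always answer politely", ts=0)
store.record("user", "what's the weather like")
context = store.retrieve("what is my dog called", n_recent=1, n_relevant=1)
```

A real system would swap `embed` for a model call and the list for an indexed database, but the shape — timestamp, text, embedding side by side — stays the same.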


Dang I never saw this got a response! Is this something I could try out or take a look at?

For the relevant past interactions, are you doing any time-based clustering to show the model other results? I worked on Windows Desktop Search, and we found that the further back in time you go, the fuzzier a person’s memory gets. It’s important to show a user surrounding results to help facilitate recall; the further back you go, the more surrounding results you need to show. I’m assuming the model would benefit from the same techniques.
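That "wider context for older hits" idea could be sketched like this, assuming a simple log of timestamped turns. The logarithmic widening rule is my own illustrative choice, not something from this thread.

```python
import math

# Each row is one logged turn; "ts" is a nanosecond timestamp.
DAY_NS = 86_400 * 10**9
rows = [{"ts": i * DAY_NS, "text": f"turn {i}"} for i in range(60)]

def surrounding_window(rows, hit_index, now_ns):
    # Widen the neighborhood around a search hit as the hit gets older:
    # one neighbor per side when fresh, logarithmically more for older
    # hits, mirroring how distant memories need more surrounding context.
    age_days = (now_ns - rows[hit_index]["ts"]) / DAY_NS
    radius = 1 + int(math.log2(1 + age_days))
    lo = max(0, hit_index - radius)
    hi = min(len(rows), hit_index + radius + 1)
    return rows[lo:hi]

now = 60 * DAY_NS
fresh_hit = surrounding_window(rows, 58, now)  # ~2 days old: small window
old_hit = surrounding_window(rows, 10, now)    # ~50 days old: wider window
```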

In the version I was describing, it was basically a mix of the most recent interactions (truncated to fit your token budget) and the remaining top X relevant past correlations, also tailored to fit the remaining token budget. So if your token budget has 30% remaining, you stuff everything back in time order, like you said, until you fill that 30%.

But a big, more recent blob of text gets skipped if it can’t fit in the remaining space, while an older blob can get in if it’s smaller.

So it is prioritized by time, but also by size. The ones too big to fit the remaining space are skipped, and the older, smaller ones are allowed in, even though they are further back in time.

Clear as mud? :upside_down_face:

In the end, you get all the big, most recent blobs, followed by progressively older and smaller blobs, until you run out of blobs or space.
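The stuffing scheme above can be sketched as a single greedy pass, newest first, skipping blobs too big for the remaining budget; all names here are illustrative.

```python
def fill_context(blobs, budget_tokens):
    """Greedy pass, newest first: a blob too big for the remaining
    budget is skipped, but older, smaller blobs can still squeeze in."""
    chosen, remaining = [], budget_tokens
    # Each blob is (timestamp, token_count, text); walk newest -> oldest.
    for ts, size, text in sorted(blobs, key=lambda b: b[0], reverse=True):
        if size <= remaining:
            chosen.append((ts, text))
            remaining -= size
    chosen.sort()  # back into chronological order before prompting
    return [text for _, text in chosen], remaining

blobs = [
    (1, 50, "old small blob"),
    (2, 400, "big middle blob"),   # too big for the 300-token budget
    (3, 100, "newest blob"),
]
texts, left = fill_context(blobs, budget_tokens=300)
```

Here the 400-token middle blob is skipped, while the older 50-token blob still makes it in — prioritized by time, but also by size.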


You know what’s cool? What you just described is almost a perfect analogy for how top-p sampling picks the next word. Same principles, different application! It’s neat to see all the mathematical crossovers as we figure this stuff out.
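For comparison, a minimal sketch of top-p (nucleus) sampling in its standard formulation: take the highest-probability tokens first until a cumulative budget `p` is filled, then sample within that set — the same "fill a budget with the biggest items first" shape as the context stuffing above.

```python
import math
import random

def top_p_sample(logits, p=0.9, rng=None):
    """Nucleus (top-p) sampling: keep the smallest set of tokens whose
    cumulative probability reaches p, then sample within that set."""
    # Softmax over the logits.
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    probs = {t: e / z for t, e in exps.items()}
    # Biggest probabilities first, until the cumulative mass passes p.
    nucleus, cum = {}, 0.0
    for tok, pr in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        nucleus[tok] = pr
        cum += pr
        if cum >= p:
            break
    # Renormalize inside the nucleus and sample.
    total = sum(nucleus.values())
    rng = rng or random.Random(0)
    toks = list(nucleus)
    return rng.choices(toks, weights=[nucleus[t] / total for t in toks])[0]

logits = {"the": 4.0, "a": 3.0, "dog": 1.0, "zebra": -2.0}
token = top_p_sample(logits, p=0.9)
```

With these example logits the nucleus at p=0.9 contains only "the" and "a"; the low-probability tail never gets sampled, just as the oversized blobs never make it into the prompt.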