I’m going to share one of the ideas that I’m most excited about for the potential use of these models and that’s as a universal concept translator.
I spend a lot of time thinking about language and when you get to the root of what language is you realize that it’s just a compression protocol. The ultimate goal of language is to transmit an idea, concept, or thought from one person to one or more other people. I’m doing that now. I’m using language to transmit an idea in my head to you the reader. The thing about language is that it’s highly compressed and the algorithm that’s needed to both compress it and decompress it are based off a set of priors we call world knowledge. If I say “Phil Donahue died this weekend” I can assume you have a similar world knowledge and you know who I’m talking about and that I’m referring to an event that happened in the past. If your world knowledge doesn’t fully align with mine you may be able to decompress part of that but you’ll ask for clarity around the parts you didn’t understand “oh really who was that?” We’ll often use things like analogies and examples as a way of tuning the compression algorithm on the sending side to help give “the audience” a better chance of successfully decompressing language to concepts in their head.
Another example; my coworkers and I can have a really “high bandwidth” discussion about programming because we all have a very similar set of priors we can lean on to decompress what each other is saying. To my wife it all sounds like gibberish but she can have a high bandwidth discussion with her colleagues about medical topics that mostly sounds like gibberish to me. So we don’t just have one compression/decompression algorithm for language. We have many.
So the idea… one of the most amazing things about these LLMs is their ability map language to virtually any concept. They know everything and they were originally designed for translation so it’s not surprising that they’re really good at taking the concepts for a complex topic like “multi attention heads in large language models” and compressing those concepts into language that a 5 year old could decompress and understand.
Recently I’ve made some progress on a prompting technique I call lenses which is just a simple way to shape the answer you get out of the model. Nothing radical here you’re just mixing into the prompt some instructions that say things like “always write your answer for a typescript developer with 30 years experience. When generating code use typescript unless another language is asked for.” Lenses are basically a better approach to the memories feature that ChatGPT is experimenting with (I turned memories off.)
What if you could create a lens that automatically re-writes everything you read or that someone says to you to better match your world knowledge? Basically everything you consume would be custom tailored and matched to your personal world knowledge making it easier for you to decompress (or easier to grok.) My bet is that the rate at which we could transmit information using language would increase 10x and the comprehension of the ideas being transmitted would increase 100x.
Customizing communication to fit each person’s knowledge could revolutionize how we understand and share information. The potential to increase both the speed and depth of learning is huge. Definitely worth pursuing
You could build on the idea that everything you said to the chat it could learn and remember and use this as ‘your world knowledge file’, it would then be programmed to give you information in a way that you would understand.
Of course, you could build a ‘my world knowledge file’ but it would take a long time and then fine tune a model on it AND create a vector file of the knowledge you have, you could then instruct the model to communciate things to you with this data in mind, stuff you DONT know is translated into a way you would understand based on what you DO know.
Thanks for sharing your thoughts, it’s got the makings of a thesis and a project.
Your idea about using language models as universal concept translators is fascinating and reminds me of a similar concept explored in Neal Stephenson’s science fiction novel “The Diamond Age: Or, A Young Lady’s Illustrated Primer.”
In “The Diamond Age,” Stephenson introduces the concept of “The Primer,” an interactive book that adapts its content to the reader’s level of understanding and personal context. The Primer uses advanced nanotechnology and artificial intelligence to tailor its lessons and stories to the individual reader, effectively serving as a personalized educational tool that can teach complex concepts by breaking them down into understandable components based on the reader’s existing knowledge and experiences.
This is similar to what I have been calling “projections”. Basically feeding a high level shaping instruction into the System message to morph from one domain to another.
Ironically I was calling it projections for a while as well. I recently switched to lenses when I started thinking about it in agentic workflows. You can actually pass a corpus through multiple lenses in a loop to look at the corpus from multiple angles.
Lenses if you are trying to “look” at it from different angles.
Projections if you are trying to shape it to something else … think of a forcing function to drive a conversation to a desired outcome, or have a certain persona when generating a specific set of content, etc.
This maybe related to Transmediality or transmedial storytelling, where a concept / story is retold in different technical or stilistic media.
The content / story changes its character as soon as it is retold with the techniques and tropes of the respective medium. It may then a) appeal to a different audience, based on the audiences preconception of a preferred medium (e.g. retelling a political thriller as a children’s fable) or b) broaden the decoding range of the story’s original audience by introducing it to codes of a different medium. The knowledge of the original story helps the audience in decoding and appreciating new codes (e.g. Postmodern Jukebox translating current hits into muscial genres appealing to elder and younger people alike).
I agree that this is a huge, tappable ressource that can be more easily opened with LLMs.
Have you considered the limited scalability of customization in this application? The more you customize to the world knowledge of the user, the more you must diverge from the original concept. Of course the risk is greater, the greater cultural and experiential difference between the concept source and the user, but in real world language translation, this is a common problem–world knowledge (and therefore available linguistic repertoires) regularly diverge enough so that there is no overlapping terminology for certain concepts. The result is that these gaps are overcome through human relationships held overtime, where concepts accumulate highly contextual additional meaning (including emotional emphasis). How would you account for this growing risk of error?
I think each user would have multiple lenses that they can leverage based on the content they’re consuming. For example, I created a gamer lens yesterday that changes the shape of answers for gaming related queries. In testing it, I noticed that if you ask for help deciding between a new playsation or xbox it shifts the answer to focusing on the technical specs and exclusive titles for each. The boilerplate answer is way more generic. Likewise, you can ask for help in defeating a boss in god of war and what you get is really detailed tips for leveling up your character and strategies for fighting that boss.
I think users would have dozens of lenses that the model would automatically select an apply relative to the question being asked or the content being red. This should help reduce the saturation of trying to put everything into one lens. That’s actually the problem with ChatGPT memories. They’re trying to shove everything into one lens that they pass every question through.
That’s one of many irons in one of many of my fires that I hope will eventually converge
I don’t think of the concept as lenses, but more along the lines of a personalized food taster - for information. Filtering and re-aggregating is more important than reprojecting.
I disagree that they should be discrete systems. I don’t actually think that saturation is that big of an issue (although I can see how it could be a challenge with openai’s tooling, they’re a little behind tbh, but still doable)
One moral challenge with this is that such a system gets to know you very intimately. The associated opportunity would be that it could obviously be used the other way 'round too. Zero latency interplanetary communication? (but more likely, you’d just be focus grouped and served incredibly targeted ads)
But yeah, seeing the potential of this tech as a translator - not between macro cultures, but rather between cultural quanta (i.e. individuals) - is a good idea IMO. Aza Raskin would likely agree
The lens analogy is more around how I personally came to the idea. I’m working on other projects that require re-projection in f information so I added a lens concept to my engine to support those re-projection tasks. I then noticed that you can use the same lens approach to reframe concepts that you’re struggling to grasp.
I was struggling to understand how multi-headed attention works in transformers but I was able to get the model to reframe the concept in way that was easier for me to grok.
In this way, you lock yourself in a bubble of your own knowledge and stop developing. The brain learns best when it solves a problem and has to put in some effort to do so. What comes easily is quickly forgotten.
This is intriguing! Indeed, just as AI excel in translation between human (or machine) language, it should be able to translated between levels of expertise and abstractions.
But isn’t this already implied in ChatGPT’s ability to “explain evolution to a four-year-old”? More generally, the LLM is adept in shifting the level of explanation between audiences.
Congratulations on your idea! I like it.
However, I see a potential issue: the notion that a model can always adapt language to an individual’s knowledge of the world. A famous Italian semiologist, Umberto Eco, emphasized that meaning is not only found in the structure of language but also in the socio-cultural context and the subjectivity of the interpreter. Excessive adaptation could lead to the loss of this richness of meaning and the trivialization of concepts.
Hi Steven: I very much enjoyed reading your ideas and it triggered a cascade of thoughts in my mind on the nature of the project you seemed to have embarked upon. As a disclaimer, I am not current with the field nor was I ever much of a programmer, but I am someone that has been thinking about this and related subject for many decades and am now writing a book on the interaction between human cognition and the evolution of human society. But what I wanted to start with here is to relate some natural language processing work I did in the mid to late 1990s. My background at that time was in quantitative finance and among other things I built market indexes. I had always been interested in what we then called the “qualitative-quantitative frontier”. I was seeing digitalization conquering knowledge problems in one field after another - then in economics and finance and it made me wonder about law and politics. And at that time I decided to work on a project for machine reading of newspapers - a tedious task I always had to do - and that led me to consider what it meant to “understand” what was in a newspaper article and I learned the basics of natural language processing. But my further goal was to be able to “read” an entire newspaper and then compile some sort of summary meanings from that. Now we would call it meta-data. And so the question became: what did it mean to understand what 100 articles meant. That is where it became a non-trivial problem. And at that time I found that after constructing basic word frequency processes including stemming and elimination of words unconnected to meaning, I started to develop what you now call lenses or specialized filters for specific subjects as well as ways of assessing sentiment. However jumping forward 20 ears to the present, I am still thinking about language and specifically the thesis that population density increased social interaction (and social skills) and that spurred use of language, which catalyzed thought, which made human culture more complex and among other things spurred innovation - which closed the loop by enabling population increase - a virtuous cycle that has lasted for 10,000 years - though is now likely coming to an end. But it hypothesized a link between language, thought and population growth. So thought could be emergent from intensified communication, but that it required population growth - unless AI can take the place. But we cannot keep growing population. But the further work made me try to separate the elements of thought from those of language. It is peculiar because they are largely two different things but have been intrinsically linked at a deep level because of human sensory preferences for auditory information and later for visual information - into writing. However, because they are different things language causes inconsistencies in thought of many types and many of them are profound. I will try to stop here - focusing on the fact that language induces flaws in reasoning, which contribution to human reasoning being flawed - which we can call “non-objective” or not universal. It remains a cultural artefact though with some aspects that appear to give it universality, but that is likely to be of a bounded kind. I will stop here and apologize if it seems off-topic to you.