To scale Gen AI, should we focus on compute or memory?

I wanted to discuss this with others in this forum.
Most of the focus in Generative AI right now is on compute, but I think we should focus more on memory architecture. The reason our brain is so efficient is that the cortex is not a computer; it is a memory system. So we need to develop a robust memory system, and I think the best tool for building one is a graph DB. What I am not able to decide is whether we should use RDF or a property graph for the memory architecture.
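
To make the trade-off concrete, here is a minimal Python sketch of the same "memory" stored both ways. It assumes rdflib for the RDF side; the property-graph side is shown as a plain in-memory structure rather than a specific database, since the data model matters more here than any particular engine (Neo4j, Memgraph, etc.). All names and URIs are illustrative.

```python
# One memory, two data models: "Alice met Bob at NeurIPS in 2023."
# Assumes rdflib is installed (pip install rdflib); URIs are placeholders.

from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")

# RDF: everything is a subject-predicate-object triple, so the meeting
# must be reified as its own node to attach the place and time.
g = Graph()
meeting = EX.meeting1
g.add((meeting, EX.participant, EX.alice))
g.add((meeting, EX.participant, EX.bob))
g.add((meeting, EX.venue, Literal("NeurIPS")))
g.add((meeting, EX.year, Literal(2023)))

# Property graph: an edge can carry its own properties, so the
# relationship holds its metadata directly, with no extra node.
edges = [
    {"from": "alice", "type": "MET", "to": "bob",
     "props": {"venue": "NeurIPS", "year": 2023}},
]
```

The design difference this surfaces: plain RDF has to reify relationships (or use RDF-star) to hang metadata on them, while property graphs attach properties to edges natively, which is often more natural for episodic, event-like memories.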


Great point about focusing on memory architecture and taking a cue from the human brain’s efficiency. Thanks for bringing up this topic for discussion.

I sit in the camp that the future of generative AI, and the solution to its scalability issues, hinges on a blend of approaches; there won’t be one solution that rules them all. Brevity tends to be an issue in my forum posts, so I will just bullet some thoughts below to help spur additional discussion.

  • In Sam Altman’s recent conversation with Lex Fridman, he posited that compute may become the currency of the future, but also emphasized smarter allocation of compute based on task complexity. That is an interesting and nuanced approach that could help with scalability.

  • Innovations such as decentralized agent swarms, GraphRAG and languages like Mojo could offer a more scalable path forward.

  • Techniques like FSDP and QLoRA, as explored in a recent deep dive (Answer.AI - Enabling 70B Finetuning on Consumer GPUs), further highlight the potential for balancing compute and memory effectively; a minimal QLoRA sketch follows below.
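
Since the FSDP/QLoRA bullet is the most concrete of the three, here is a minimal sketch of the QLoRA half (4-bit frozen base weights plus trainable LoRA adapters) using Hugging Face transformers, peft, and bitsandbytes. The model ID and hyperparameters are placeholders, and wiring this up with FSDP across multiple GPUs, as the Answer.AI post does, requires additional setup not shown here.

```python
# QLoRA-style setup: quantize the frozen base model to 4-bit NF4,
# then train small LoRA adapters on top of it.
# Model ID and hyperparameters below are illustrative, not a recipe.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit base weights save memory
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,         # also quantize the quant constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # matmuls still run in bf16
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",            # placeholder model ID
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # adapt attention projections only
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # base stays frozen; adapters train
model.print_trainable_parameters()
```

The memory/compute trade is explicit here: the 70B base sits in roughly 35 GB of 4-bit weights, while gradients and optimizer state exist only for the tiny adapter matrices.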

I think it is really this fusion of ideas, along with others as yet undiscovered, that may well define our strides toward scalable, efficient AI.


According to The Bitter Lesson, compute unfortunately always wins out in the end.

Interesting thought. However, I do think this is an oversimplification of how our brain functions. Going back to the Bitter Lesson:

The second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries. All these are part of the arbitrary, intrinsically-complex, outside world. They are not what should be built in, as their complexity is endless; instead we should build in only the meta-methods that can find and capture this arbitrary complexity.
