Everyone seems to be complaining about LLMs not actually being intelligent. Does anyone know if there are alternatives out there, either in theory or already made, that you would consider to be ‘more intelligent’ or have the potential to be more intelligent than LLMs?
I think others are mostly addressing that issue, which is why I went a different direction.
It's not really an answerable question, because we don't even have good definitions of human intelligence, let alone of things that are only defined in abstract comparison to that concept.
Transformers, and specifically the attention block, are just one step in a long and ongoing tradition of advances in ML. Check out the paper "Attention Is All You Need"; it was a pretty big deal. So I expect AI development to continue as it has. Things had already improved substantially even before LLMs, but boy howdy, transformers were a big leap. There is much more investment and focus now than before, and maybe that will speed things up. Regardless, we should expect better models and architectures in the future.
Another way to think about this is scale. The energy density of these systems will (I think) become a limiting factor before anything else. That is not to say all the components of these systems are of the same quality: a node in a transformer is of higher quality than one in a U-Net, and a worm synapse is of lower quality than a human synapse.
But we can still compare the number of connections, even without knowing whether they are of equal quality.
So an industry-tier LLM has maybe 0.175 trillion connections. A human brain has about 100x that number. If we believed connections to be of equal quality, then LLMs would need to be 100x larger to compete with humans (though we know they already beat most humans on many tests). Keep in mind, a mature tree could have well over 16,000–56,000 trillion connections via its plasmodesmata.
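For anyone who wants to play with these numbers, here's a quick back-of-envelope sketch. All figures are just the rough estimates quoted in this thread (175B LLM connections, ~100x that for a brain, 16,000–56,000 trillion for a tree), not measurements:

```python
# Rough connection-count estimates from the thread (not measurements).
llm_connections = 0.175e12                   # ~0.175 trillion, industry-tier LLM
brain_connections = 100 * llm_connections    # ~100x the LLM, per the estimate above
tree_connections = (16_000e12, 56_000e12)    # plasmodesmata, very wide estimate

print(f"brain / LLM:  {brain_connections / llm_connections:.0f}x")
print(f"tree / brain: {tree_connections[0] / brain_connections:.0f}x "
      f"to {tree_connections[1] / brain_connections:.0f}x")
```

Even on these crude numbers, a single tree plausibly out-connects a brain by three orders of magnitude, which is why the "are all connections equal?" question below matters so much.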
A human brain draws ~20 watts at rest. An LLM takes about 80 watts per inquiry. So an LLM takes quite a bit more energy per connection to run: the brain is running 100x the connections on 1/4 the power, which works out to roughly a 400x efficiency gap. We would need about a 400x improvement in LLM efficiency to be energy-equivalent to humans (again, under the assumption of the same "quality" of connection).
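The per-connection arithmetic can be checked in a few lines. This takes the thread's assumptions at face value (20 W resting brain, 80 W per LLM inquiry, equal-quality connections), all of which are contested further down:

```python
# Energy-per-connection arithmetic under the thread's stated assumptions.
brain_watts, brain_conns = 20.0, 17.5e12   # ~20 W, ~100x an 0.175T-connection LLM
llm_watts, llm_conns = 80.0, 0.175e12      # ~80 W per inquiry (a contested figure)

brain_w_per_conn = brain_watts / brain_conns
llm_w_per_conn = llm_watts / llm_conns

# 100x the connections on 1/4 the power -> a 400x per-connection gap.
print(f"LLM uses ~{llm_w_per_conn / brain_w_per_conn:.0f}x "
      "more power per connection than the brain")
```

Note this only compares watts per connection; it says nothing about what a connection is doing per watt, which is the reply's objection.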
So we might see a physical limit to how intelligent an LLM can be so long as it's running on doped-silicon chip architecture. Proximity matters, a lot, for the machines we use to run LLMs. We can't necessarily just add more processors and get smarter machines; they need to be smaller and more energy efficient.
This is an approximation of Moore's law, with the physical limits of silicon mapped onto it.
So we really "can't," in big air quotes, get close to the computational density we see in humans. Then you have plants, whose energy use is literally negative, and which have orders of magnitude more connections per kg than either silicon or human brain tissue is capable of.
So now we get to the question "Are all connections created equal?", which I think we can pretty easily answer: no, per the examples I gave and many, many more.
We will see architectural improvements to current ML approaches.
This is all good info, thanks.
I just have one minor nitpick:
The math is wrong here. At "rest," the brain is still doing work. And that "80 watts per inquiry" covers just one LLM operation; the human brain is doing much more than one thing at a time. A person is thinking "What should I have for dinner?", "What should I have said to Gwen instead?", "That ceiling needs some paint," "My back hurts a bit." That clang you heard earlier in the distance? You didn't pay attention to it at the time, but your brain surely did. And let's not even get started on the bodily functions that the brain, yes, the brain, must manage.
So an LLM is much, much, much more resource-intensive than that claim suggests.