Is intelligence understandable in principle?

Probably.

Back before the days of biochemistry, you could have asked, “Is it even possible to understand this vital force that animates flesh? Even if it is made of comprehensible parts, why would you believe that our tiny little minds could comprehend what’s really going on in there?”

But there was plenty to understand; human scientists just didn’t understand it yet. This story has repeated itself throughout the history of science.

Also, various tiny parts of artificial neural networks have already been understood. A small neural network turns out to do addition in an interesting way. AIs sometimes say that 9.11 is greater than 9.9, and people have figured out that this is because they’re thinking of dates rather than decimals.*
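As a rough sketch of that last finding (not the actual mechanism inside any particular model), the snippet below shows how the same two strings order differently depending on whether they're read as decimal numbers or as dotted, date-like or version-like parts; the function names are just illustrative, not drawn from any interpretability paper.

```python
# Minimal illustration of the ambiguity behind the "9.11 > 9.9" mistake:
# the same strings compare one way as decimals and the other way when
# read part-by-part, as in dates (9/11 vs. 9/9) or version numbers.

def greater_as_decimals(a: str, b: str) -> bool:
    """True if a > b when read as ordinary decimal numbers."""
    return float(a) > float(b)

def greater_as_dotted_parts(a: str, b: str) -> bool:
    """True if a > b when each dot-separated part is compared in turn."""
    return [int(x) for x in a.split(".")] > [int(x) for x in b.split(".")]

print(greater_as_decimals("9.11", "9.9"))      # False: 9.11 < 9.90
print(greater_as_dotted_parts("9.11", "9.9"))  # True: 11 > 9
```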

But we can’t answer questions much more complex than that. Nobody knows exactly why LLMs make the chess moves that they make; nobody knows precisely what causes them to occasionally threaten and blackmail reporters. But that doesn’t mean there’s nothing to be known. When AIs work, they work for reasons; they operate too consistently across too many domains for it to just be chance. Those reasons are waiting to be understood.

For more on this topic, see the extended discussion titled “Intelligence Isn’t Ineffable.”

* For that matter, when small neural networks malfunctioned in the 1980s, researchers would sometimes print out the entire model’s weights on paper and study them until they figured out that (for example) the model was getting stuck in a local minimum. Back when AIs were small enough to be understood, nobody argued that there was nothing there to understand.
