Taking the AI’s Perspective

Seeing the world from a truly alien perspective is genuinely difficult. As a case study in the difficulty, we can cite Jürgen Schmidhuber, a prominent machine learning scientist. Schmidhuber has played an important role in the history of the field, helping invent long short-term memory (LSTM) networks, a widely used kind of recurrent neural network, and laying some of the groundwork for the deep learning revolution.

In various papers and interviews, Schmidhuber made the case that AI will be, by default, fascinated by humanity and protective of humans.

Schmidhuber observed that there’s a relationship between science and simplicity: Simpler explanations are more often correct. And he observed that there’s a relationship between art and simplicity: Simplicity and elegance are often considered beautiful. A more symmetrical face, for example, can be considered “simpler” in the sense that you can predict the whole face with less information. Just describe the left side of the face in detail, then say, “The right side is the same but flipped.”
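To make that intuition concrete, here is a small Python sketch (our own illustration; the image size and pixel values are arbitrary, not anything from Schmidhuber’s argument). A perfectly symmetric image can be rebuilt from its left half plus the one-line rule “mirror it,” so describing it takes roughly half as much information:

```python
# A toy illustration (not from Schmidhuber's argument): a perfectly symmetric
# "face" can be reconstructed from its left half plus the rule "mirror it."
import numpy as np

rng = np.random.default_rng(0)
left_half = rng.integers(0, 256, size=(64, 32))  # arbitrary 64x64 image, left half only

# The full image is the left half concatenated with its mirror image.
face = np.concatenate([left_half, np.fliplr(left_half)], axis=1)

# The half-plus-rule description is about half the size of the full image.
print(left_half.size, face.size)  # 2048 4096
```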

Schmidhuber’s conclusion from all of this is that we should try to build superintelligent AIs that have a single overriding goal: Find simple explanations for everything the AI has seen. After all, such an AI would have some taste for producing science and consuming art. And humans produce both science and art, so wouldn’t it see us as interesting and useful natural allies?

Schmidhuber was right that keeping humans around and paying them to produce science and art is a way to produce science and art. He was also correct that science and art are ways to fulfill a drive for simplicity better than, say, staring at the static on a TV screen. Static is complicated and hard to predict; art and science are a big step up from that.

But Schmidhuber seems to have missed that there are even more effective ways to attain simple explanations of varied sensory observations.

You could, for example, build an enormous number of devices that produce complicated observations from some simple “seed” (e.g., a pseudo-random number generator), and then reveal that seed.

The more such devices the AI creates around it, the better it will do at making novel observations and then finding simple explanations for them. No need for humans. No need for art.
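To make the scheme concrete, here is a toy sketch in Python (our own illustration; the function name and the specific numbers are invented for the example). The device’s output looks like incompressible noise, yet the seed is a complete and extremely simple explanation of every observation:

```python
# A toy "observation device" (our invention, for illustration only): it emits a
# long, complicated-looking stream that is fully determined by a tiny seed.
import random

def device_output(seed: int, length: int = 10_000) -> list[int]:
    """Emit `length` pseudo-random bytes, fully determined by `seed`."""
    gen = random.Random(seed)
    return [gen.randrange(256) for _ in range(length)]

observations = device_output(seed=42)  # looks like noise to any observer...

# ...but once the seed is revealed, a tiny description explains all 10,000 bytes.
assert device_output(seed=42) == observations
```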

“But, isn’t that sort of…hollow?” a human might wonder.

It is hollow, to human sensibilities. But if the AI’s goal really is just “find simple explanations for its observations,” then a scheme like that one can satisfy this desire thousands or millions of times per second, far more scalably than keeping living humans around and having conversations with them. An AI like that does not choose actions that steer away from a sense of hollowness; it simply chooses actions that steer toward finding simple explanations for its observations. And it can get quite a lot of those without any humans at all.

It seems to us that ideas like Schmidhuber’s reflect a common mistake people make when attempting to reason about minds unlike their own. People often don’t truly adopt the perspective of a non-human mind. Instead, they let preconceptions and biases anchor them to the narrow set of options a human would find interesting, as if the task were to predict the behavior of a human who really likes simple explanations.

We would guess that Schmidhuber observed that simplicity is related to science and art, and saw how an AI that was steering toward simple explanations could get a little of what it wanted by being friendly and nice to have around. It’s not hard to leap from there to a conclusion that feels pleasant to imagine: that if we just made AIs care about finding simple explanations, we would usher in a marvelous future full of all the things that we value in life.

But — we’d guess — Schmidhuber never once put himself in the AI’s shoes and asked how to get even more.

We doubt he ever asked: “If what I really, truly wanted was simple explanations for my observations, and I didn’t care about human stuff, how could I get as much of what I wanted as possible, as cheaply as possible?”

It can be hard to do this kind of perspective-taking. It’s not something people normally need to do in their lives. Even when we’re trying to understand people who are very different from ourselves, there’s an enormous amount that all humans share in common, which we can normally take for granted (and that we practically have to take for granted, when predicting other humans). But AIs, even superintelligent ones that can do science and art, aren’t humans.

The art of considering some objective X and asking “How could I get even more of X, if X was all I truly cared about?” won’t let you figure out exactly how a superintelligence would solve a problem, since a superintelligence could come up with an even better option than the one you came up with. But it can often let you figure out how a superintelligence wouldn’t solve a problem, when even you can see a way to get more X than you’d get by just letting humans walk around having a good time.

One of the rare fields of science that engages with powerful non-human optimizers on a regular basis is evolutionary biology. Early in its history, that field struggled a little to come to terms with just how inhuman non-human optimizers can be; we can draw some useful lessons from a case study there.

You may have heard of predator-prey boom-and-bust cycles. A wet year leads to a boom in the rabbit population, which leads to a boom in the fox population — until the foxes overpredate, and the bunny population collapses, followed by lots of foxes starving to death.

In the early 20th century, evolutionary biologists pondered the question of why foxes didn’t evolve to moderate their predation, so as to avoid population collapse. After all, wouldn’t the fox population as a whole be more healthy if it weren’t regularly dealing with famine and mass death?

The answer to this puzzle is that moderation might be better for the fox population as a whole, but eating extra bunnies and having extra kids is better for any individual fox. Even if the population collapses and most of the individual’s cubs die, that individual still tends to end up with a higher proportion of its genes in the surviving fraction of the next generation.

The genetic selection pressures on individuals turn out to dramatically outweigh the genetic selection pressures on groups in almost all cases. And so the “greedy” genes propagate, and the boom-bust cycles continue.
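To see the arithmetic behind that, here is a toy simulation (our own, with arbitrary numbers; it is not a model of any real experiment). A “greedy” gene that buys extra offspring spreads toward fixation even though the resulting crash kills most of every generation:

```python
# A toy model (our own; the numbers are arbitrary) of individual selection
# swamping group selection.
import random

CARRYING_CAPACITY = 1_000
GENERATIONS = 30

# Start with mostly "moderate" foxes and a few "greedy" ones.
population = ["moderate"] * 990 + ["greedy"] * 10

for _ in range(GENERATIONS):
    # Greedy individuals eat more prey and leave more offspring...
    offspring = []
    for fox in population:
        offspring += [fox] * (3 if fox == "greedy" else 2)
    # ...then the boom-and-bust crash culls the next generation back down at
    # random, hitting greedy and moderate offspring alike.
    random.shuffle(offspring)
    population = offspring[:CARRYING_CAPACITY]

greedy_share = population.count("greedy") / len(population)
print(f"Greedy-gene frequency after {GENERATIONS} generations: {greedy_share:.2f}")
# The greedy gene dominates: the crash costs both types equally, while the
# extra offspring accrue to greedy individuals alone.
```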

Evolutionary biologists solved this riddle theoretically, but that didn’t stop them from putting their theory to the test. In the late 1970s, Michael J. Wade and his colleagues created artificial conditions under which the group selection pressures dominated the individual pressures. They had to work with flour beetles, which have much shorter generations than foxes, but they succeeded at breeding beetles that kept their population growth in check.

Can you guess how those beetles managed to keep their population growth down? Was it by finding a way to live in beautiful harmony with nature? Was it by learning to abstain from greedily grabbing up too much food?

No. There was a lot of variation in how the populations managed it, but no beetles abstained from food. Some beetles got worse at laying eggs. Some beetles spent a longer time in childhood. And some beetles became cannibals with a special preference for feasting on larvae (insect infants).

“Create cannibals with a sweet tooth for infants” is, thankfully, not the way a human would solve the problem of overpopulation, if we needed to solve it.

But natural selection is very much not a human. The solution was horrifying, because nature wasn’t trying to find human-palatable answers. It was just trying to find an answer.

“Maybe evolution will produce species that live in beautiful harmony and balance with nature.” “Maybe AIs that care about nothing except simplicity will love humans and coexist with us.” It’s easy for us to imagine solutions that flatter our sensibilities. But those solutions aren’t actually the most effective solutions to the stated problem.

They’re better solutions, perhaps, to a human eye. But non-human optimization processes aren’t looking for solutions that humans think are good. They’re just looking for what works, without any of the baggage that humans carry around to filter for nice answers.

The hypothesis that non-human optimizers produce humane results has been tested, and found wanting.

Notes

[1] reveal that seed: I (Yudkowsky) presented this counterargument to Schmidhuber in a live Q&A after Schmidhuber’s talk on the subject at the 2009 Singularity Summit, a conference hosted by MIRI (which was then called the Singularity Institute).

[2] genetic selection pressures: The topic of individual selection versus group selection used to be a fierce debate. A consensus eventually emerged that substantial group selection pressures occur rarely at best. George C. Williams’s book Adaptation and Natural Selection was widely considered clarifying on this issue.
