How long would it take to solve the ASI alignment problem?

The difficulty isn’t just the lack of time; it’s the lethality of mistakes.

By 500 CE, the global community had converged on the theory that the Sun went around the Earth. When Copernicus proposed the competing theory in the sixteenth century, it was considered and largely rejected. It wasn’t until Galileo built a telescope and saw Jupiter’s moons (celestial bodies that go around Jupiter instead of Earth) that the budding scientific community was spurred to the conclusion that the Earth goes around the Sun.

Humanity came to the correct theory of orbital mechanics in time. But before that, it came to a false consensus. And it held tenaciously to that false consensus until reality started beating Galileo over the head with the fact that the Earth is not at the center of everything.

The usual process by which the scientific community converges on the truth involves steps where the community is wrong and reality beats it over the head with evidence until it updates its models.

The trouble with ASI alignment is not just that it is a tricky research program. It’s also that, in this field, what it looks like for reality to beat humanity over the head with the fact that its first favorite theory was flawed is for an unfriendly ASI to consume the planet. There would be no survivors to converge on a better theory of ASI alignment.

If humanity had a hundred years and unlimited retries, we probably wouldn’t have much trouble sorting out the ASI alignment problem.

But even if we had three hundred years to develop a theory of intelligence, and of how AIs change as they get smarter, and of how to point them in an ultimately stable way…well, without the ability to actually try and see what happens when the AI gets radically smarter a few times, we’d very likely converge on the wrong answer before that vital evidence came in. Humanity has a tendency to converge on that wrong sort of answer.
