Are you saying we need provably safe AI?
No.
We aren’t advocating that humanity wait for a literal proof that some artificial superintelligence will be good, or anything like that. Such a proof is probably not possible even in principle, never mind in practice. As Einstein said in his 1921 lecture, “Geometry and Experience”: “As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.”
Any supposed proof about how an AI will behave in the real world is not guaranteed to govern the AI’s actual behavior, because we might be wrong about how the real world works.
That’s already true of computers today. For example, you might think that if someone has a literal mathematical proof that, according to the theoretical behavior of transistors and the circuit diagram of a computer, it’s impossible for a computer program to change the memory in cell #2, then no computer program can change the memory in cell #2. But the “rowhammer attack” involves rapidly and repeatedly accessing the memory cells #1 and #3 on either side of the protected memory cell, in a way that turns out to electromagnetically perturb cell #2 in the middle, changing a piece of computer memory without ever writing to it directly. Real physical transistors are not mathematically perfect transistors, and proofs that look comforting in theory don’t always matter much in practice.
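To make the mechanism concrete, here is a minimal sketch of the rowhammer access pattern in C. It is an illustration, not a working exploit: the buffer offsets standing in for cells #1, #2, and #3 are hypothetical, and a real attack would need addresses that map to physically adjacent DRAM rows, so this code is not expected to flip any bits. It uses the x86 _mm_clflush intrinsic to evict the two “aggressor” addresses from the CPU cache, so that each pass through the loop forces fresh activations of the underlying DRAM rows.

    /*
     * Illustrative sketch of the rowhammer access pattern; NOT a working
     * exploit. A real attack needs two addresses that map to the DRAM rows
     * physically adjacent to a victim row. Here we just pick arbitrary
     * offsets in an ordinary buffer, so no bit flip is expected.
     * x86 only (uses the SSE2 _mm_clflush intrinsic).
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <emmintrin.h>  /* _mm_clflush */

    #define HAMMER_ITERATIONS 1000000
    #define BUF_SIZE (1 << 20)

    int main(void) {
        static uint8_t buf[BUF_SIZE];
        memset(buf, 0xFF, sizeof buf);

        /* Hypothetical stand-ins for "cell #1", "cell #2", "cell #3".
         * In a real attack, aggressor1 and aggressor2 would be chosen to
         * sit in the DRAM rows on either side of the victim. */
        volatile uint8_t *aggressor1 = &buf[0 * 8192];
        volatile uint8_t *victim     = &buf[1 * 8192];
        volatile uint8_t *aggressor2 = &buf[2 * 8192];

        uint8_t before = *victim;

        for (long i = 0; i < HAMMER_ITERATIONS; i++) {
            /* Read both aggressors, then flush them from the CPU cache so
             * the next iteration hits DRAM again instead of the cache. */
            (void)*aggressor1;
            (void)*aggressor2;
            _mm_clflush((const void *)aggressor1);
            _mm_clflush((const void *)aggressor2);
        }

        /* The victim was never written; only its neighbors were read. */
        printf("victim before: 0x%02x, after: 0x%02x\n", before, *victim);
        return 0;
    }

The notable feature is the final check: the loop never writes to the victim address, yet on vulnerable DRAM the repeated activation of the neighboring rows alone can change its contents.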
We aren’t demanding mathematical proof that things will go well. It’s not possible to meet such a standard in real life, and even if it were, it probably wouldn’t be worth the cost. We approve of society taking justified risks. The argument we’re making is not that there’s some tiny amount of risk that’s hard to dispel; it’s that there’s an extreme danger bearing down on us.
Growing an artificial superintelligence animated by drives that relate only tangentially to its operator’s intentions is the sort of thing that goes wrong by default. This is not a case where there’s some small chance of things going wrong and we should pay attention out of an abundance of caution. The book isn’t titled If Anyone Builds It, There’s A Tiny Chance We All Die, But Even A Tiny Chance Is Worth Mitigating. If we rush ahead at this level of knowledge and ability, we will predictably all die, because we’re just that far off from being able to create vastly superhuman AIs that are friendly.
If AI were analogous to automobiles, we wouldn’t be saying, “This car has faulty seatbelts and airbags. Let’s pull over out of an abundance of caution.”
We’d be saying, “This car is careening toward a cliff. Stop.”
It’s not about “safety proofs.” It’s not a “tail risk.” Scientists are not ready to face this challenge. We’d just die.