Why is Galvanic depicted as being insufficiently careful?

In part because it’s realistic.

We expect real companies to make even more blunders than Galvanic does. That would fit the track record of modern AI companies, as spelled out in the endnotes for Part II of the book.

In real life, we expect corporate blunders to show up sooner, to be more numerous, and to be, in some sense, stupider. Modern AI companies are already taking AIs that exhibit plenty of warning signs and scaling them up massively, despite not knowing where the critical thresholds lie or whether they’re about to cross one. They aren’t being paranoid about it today. Why should we expect them to suddenly start tomorrow?

(Recall how, in the past, people assured us that nobody would be so dumb as to hook a smart AI up to the internet. It’s easy to say that corporate behavior will change in the future. But that prediction doesn’t match the facts.)

In part because it’s easier to write.

As we spell out in an aside in Chapter 7, we could have told a story in which everyone is much more paranoid and careful, and a much smarter AI manages to escape much later in the game. But such a story would not only be less realistic, given the observed behavior of AI corporations to date; it would also be harder to write, since it would involve even smarter and more capable AIs even further out in the future.*

In part because it’s going to happen at some point, unless humanity stops.

Even if Galvanic (or some government actor) managed to hold the reins for longer before making a slip-up, it wouldn’t matter in the long run. As discussed in Chapter 4, modern AI techniques do not yield AIs that pursue the ends their inventors intend.

So long as nobody knows how to create a superintelligence that actually, robustly pursues some wonderful future as opposed to a bunch of weird stuff, it will continue to be a fact that subverting humans would allow the AI to get more of what it wants. The issue isn’t that the AI has some petulant temperament that can be ironed out of it. The issue is that it’s simply true that the AI’s preferences are more likely to be satisfied if it takes over; and once it’s smart enough, it will recognize that fact.

If humanity keeps making smarter and smarter AIs without being able to align them, and if humanity keeps giving them the power to affect the world, then the resulting AIs will eventually figure out how to affect the world in ways that serve their ends rather than ours. As we say elsewhere, there is no such thing as hands that can only be wielded for good purposes.

We’ll have more to say about this in Chapters 10 and 11, where we discuss the basic reasons why solving the alignment problem is hard, and why humanity is not on track to succeed.

But: This is the proper point of intervention. The story must be stopped before it really has a chance to begin.

You might object that it’s reckless and crazy for any corporation to make a smarter AI if that AI has some chance of outsmarting them and escaping, and if they’re not sure that the AI will act as they intend.

We would agree. AI companies shouldn’t behave that way. And the world shouldn’t let them behave that way.

The carelessness of Galvanic, and of humanity at large, is one of the weakest points in the story. Suppose that Galvanic had noticed that Sable was frequently scheming to escape control and was reaching unprecedented levels of intelligence. Galvanic could have simply not wired together so many GPUs. They could have held back until they had a strong and mature science of AI alignment, even if that took many years.

AI companies that were sufficiently cautious, sufficiently worried about their AIs going off the rails, would be much more paranoid than Galvanic. Companies that were paranoid enough would see the warning signs and shut Sable down immediately.

Then maybe they would try three other clever plans, and see that there were still warning signs.

And if they were paranoid enough to avoid killing everyone on Earth with their own hands, they would at that point back all the way off, rather than continuing to try cleverer and cleverer ideas until the warning signs finally stopped showing up.†

If an AI company were so careful, so paranoid, that it would back off in the face of the first few warnings, then yes, it could avoid killing us all with its own hands.

If it were also brave enough to loudly advocate that all AI companies, itself included, be shut down in favor of humanity finding some other, less suicidal technological pathway, then that AI company would have a chance of making the world better on net, rather than worse.

The moment in the story where Galvanic keeps going despite the warning signs is, in a sense, the point of no return. Once a superhumanly smart AI with strange and alien preferences escapes, it’s too late.

* See also the reason we wrote a story in which Sable stays relatively unintelligent for as long as possible.

† We’ll talk more about why this problem is hard, and why we don’t expect the clever ideas in this scenario to work, in Chapters 10 and 11.

Notes

[1] warning signs: For an enumeration of warning signs, see our answer to “Aren’t developers regularly making their AIs nice and safe and obedient?”
