Why did you pick this setup?

Because it’s plausible and easy to write.

Every detail in a story about the future is an opportunity for that story to be wrong. We can’t tell you exactly what technological breakthroughs will happen in what order, any more than we can tell you the exact weather pattern a month from now.

Stories like this aren’t meant to be an exact window into the future. They’re meant to provide an illustration of how the future could go, in a way that ties together all the abstract arguments we made in Part I of the book. Some people find that the danger feels a lot more real when they vividly imagine a particular path the future could take that ends in ruin.

Even more convincing might be ten stories, or a hundred stories, that show how many different pathways lead to ruin, and how the pathways that lead to a thriving future are narrow and fragile.

That’s what it means for an aspect of the future to be an easy call: When almost all pathways have the same endpoint, that endpoint is predictable. But we did not have the time or space to write ten stories, never mind a hundred.

For the story we chose to tell, we stuck to a scenario that starts as soon as possible. This is not because we think a situation like this will definitely arise soon (we are uncertain), but rather because a story set close to the present is much easier to write. If we'd set it further in the future and made up many more futuristic details about what had happened between now and then, the story would have been even less plausible. And those details would just have been distracting.

Even if we were somehow able to foresee the exact path that the future would take, it might not be the best scenario for understanding the general dynamics at play.

We expect the true future to be deeply strange, full of messy, contingent details, each of which would strain credulity if placed in a story. A story written that way would be confusing and hard to follow, full of unexplained and unnecessary details, thanks to reality's indifference to narrative cohesion. It would also feel less plausible, because many of the details would seem weird.

For a taste of how that might feel, imagine going back in time 100 years and trying to describe the daily lives and big problems of the modern world. Most people in 1925 had never listened to the radio, driven a car, or seen a refrigerator. To describe social media, globalization, and obesity, one wouldn't just need to explain a rich web of technologies; one would need to radically change the listener's worldview. So no: the story we chose to tell is more plausible, and thus less realistic.

There are many other ways the future could go.

Here are just a few alternative possibilities for how a story like this one could start:

  • There’s some sort of breakthrough in lifelong learning, or long-term memory, or learning more efficiently from data, that yields AIs qualitatively more generally intelligent than any that came before (in the same way that LLMs are qualitatively more generally intelligent than AlphaZero).
  • Large language models seem to “hit a wall,” AI progress stalls for years, and people say that the hype bubble has popped. But researchers keep tinkering over the following decade, until finally some algorithmic breakthrough is found and the AIs operate qualitatively better than they ever did before.
  • There’s never any sort of qualitative breakthrough. Progress accumulates slowly and gradually, and AI gets more and more deeply integrated with more and more of the economy, and can handle longer and longer periods of autonomous operation. The AIs often pursue ends that are not quite what anyone intended or asked for, but humanity develops hacks and patches and workarounds. And it’s mostly fine, until, on some Tuesday that starts out like any other, the world crosses the threshold past which coordinated AIs would succeed at cutting humanity out of the loop if they tried.

Any given guess about the exact path the future takes is likely to be wrong. It’s nevertheless useful to provide stories that show how it all could hang together.

When the future is uncertain, but all paths lead to the same endpoint, it can be difficult to tell a story that feels compelling. For any given story we could tell, it would be easy to point out a bunch of details that make it implausible. In the scenario we wrote, we tried to emphasize that Sable has many options available to it, and that the story arbitrarily follows one route among many that all lead to the same endpoint.

If you’re unpersuaded by this particular story, we encourage you to write out your own similarly detailed story of how everything goes well. In our experience, optimistic stories tend to rely on the AI being unrealistically easy to align (contra the arguments we make in Chapter 4), or unrealistically powerless (contra the arguments we make in Chapter 6). The arguments in Part I are what ultimately carry the case, not the story details.
