Making Sense of the Death Race

A natural question we expect from many readers is:

You say that if anyone builds ASI, everyone dies. But then why is anyone trying to build it? If you’re right, these people aren’t even following their own incentives, ultimately. If everyone dies, they die too.

A cynical, game-theoretic rejoinder might go like this:

Why, it is rational given their incentives. If they don’t build it, they assume someone else will. And they might as well get rich before they die.

Maybe that answer is enough, for a cynic.

Simple game-theoretic explanations like this often misunderstand or oversimplify real human psychology, but this explanation may also contain a grain of truth. An engineer may think that probably everyone will die from ASI, but that their own actions don’t affect that probability much. Meanwhile, they get to have insane amounts of money, cool toys, and powwows with big, important people looking at them respectfully. Maybe they’ll become the god-kings of Earth if ASI doesn’t kill everyone, but only if their company wins the race to build ASI…

From the perspective of an OpenAI researcher who recognizes the danger: If they don’t work for OpenAI, probably OpenAI destroys the world anyway. (Even if OpenAI shut down, Google would destroy the world anyway.) But if they do work for OpenAI, they get six-to-seven-figure salaries, and if they don’t die, perhaps they’ll accrue extra power and fame by being on the winning team. So each individual’s personal game-theoretic incentives push them toward collectively destroying the world.

Our view is that this sort of explanation is somewhat overdoing things, and we mention it mainly because there’s a sort of person who believes (much more than we do) that the world must run on explanations like that one. We also feel a need to mention it because some people at AI labs explicitly say that a race to the bottom is inevitable, so they might as well add fuel to the fire and have fun themselves.

After previously warning that AI “is far more dangerous than nukes,” Elon Musk decided to start an AI company and enter the race himself, stating in June of 2025:

Part of what I’ve been fighting — and what has slowed me down a little — is that I don’t want to make Terminator real. Until recent years, I’ve been dragging my feet on AI and humanoid robotics.

Then I sort of came to the realization that it’s happening whether I do it or not. So you can either be a spectator or a participant. I’d rather be a participant.

And:

And will this be bad or good for humanity? Um, it’s like, I think it’ll be good? Most likely it’ll be good? But I’ve somewhat reconciled myself to the fact that even if it wasn’t gonna be good, I’d at least like to be alive to see it happen.

So this is clearly part of the story.

But we don’t think this is the biggest factor explaining the behavior of most of the labs. We don’t think this is the only thing going on in Musk’s case, and we don’t think it’s representative of every tech CEO or scientist racing to the precipice. Humans are a bit more complicated than that.

The Banality of Self-Destruction

What, then, is the main thing that’s going on? How could engineers pursue some dangerous technology, even to their own deaths?

The simple fact is that history shows it’s no anomaly at all for mad scientists to kill themselves by mistake.

Max Valier was an Austrian rocketry pioneer who invented a working rocket car, rocket train, and rocket plane, all by 1929, catching the attention of the world. He wrote of exploring the moon and Mars and gave hundreds of presentations and demonstrations in front of thrilled audiences. One of his experimental rocket engines exploded in 1930, killing him. His apprentice developed better safety precautions.

Ronald Fisher was an eminent statistician, one of the founders of modern statistics. His findings were used to argue before Congress in the 1960s that the evidence did not necessarily show that cigarettes cause lung cancer, because correlation did not imply causation; there could always be some gene that both made people like the taste of tobacco and also predisposed them to lung cancer.

Did Fisher know his statistics were bullshit, on some level? Maybe. But Fisher was a smoker himself. He died of colon cancer, which long-term smokers get 39 percent more often than non-smokers. Was Fisher killed by his own mistakes? All we know is that there is a statistically decent chance he was, which seems almost fitting.

Isaac Newton, the brilliant scientist who developed laws of motion and gravity and who laid many of the early foundations for science itself, spent decades of his life on fruitless alchemical research and was driven to sickness and partial insanity by mercury poisoning.

And poor Thomas Midgley, Jr., discussed in the parable for Chapter 12, certainly gave himself quite a bit of lead poisoning with the same lead he insisted was safe. As you can see, it’s just not all that rare for enthusiastic engineers to harm themselves with their own inventions, either through recklessness or delusion or both.

Shrugging at the Apocalypse

Fisher, Newton, and Midgley deluded themselves into thinking that something dangerous was safe. That’s a perfectly normal way for scientists to end up doing something self-destructive. Unfortunately, the story with AI labs isn’t quite so simple.

Not all AI company CEOs deny that smarter-than-human AI is a threat. Many explicitly acknowledge the danger and talk about reconciling themselves to it. Corporate executives at many of the frontier AI labs are on the record saying the technology they’re developing has a substantial chance of killing every human alive.

Shortly before co-founding OpenAI, Sam Altman wrote: “Development of superhuman machine intelligence is probably the greatest threat to the continued existence of humanity.”

Ilya Sutskever, who recently founded “Safe Superintelligence Inc.” after parting ways with OpenAI, said in a Guardian interview:

The beliefs and desires of the first AGIs will be extremely important. And so it’s important to program them correctly. I think that if this is not done, then the nature of evolution, of natural selection, favors those systems prioritizing their own survival above all else. It’s not that it’s going to actively hate humans and want to harm them. But it is going to be too powerful.

Google DeepMind co-founder and scientist Shane Legg said in an interview that his probability of human extinction “within a year of something like human-level AI” was “Maybe 5%, maybe 50%.”

The actions of the labs, however, seem remarkably out of step with these extreme-sounding statements.

In a few cases, scientists and CEOs have explicitly said that creating AI is a moral imperative of such a high degree that it’s perfectly acceptable to wipe out humanity as a side effect. Google co-founder Larry Page had a falling out with Elon Musk over whether human extinction was an acceptable cost of doing business in AI:

Humans would eventually merge with artificially intelligent machines, [Larry Page] said. One day there would be many kinds of intelligence competing for resources, and the best would win.

If that happens, Mr. Musk said, we’re doomed. The machines will destroy humanity.

With a rasp of frustration, Mr. Page insisted his utopia should be pursued. Finally he called Mr. Musk a “specieist,” a person who favors humans over the digital life-forms of the future.

And Richard Sutton, a pioneer of reinforcement learning in AI, has said:

What if everything fails? The AIs do not cooperate with us, and they take over, they kill us all. […] I just want you to think for a moment about this. I mean, is it so bad? Is it so bad that humans are not the final form of intelligent life in the universe? You know, there have been many predecessors to us, when we succeeded them. And it’s really kind of arrogant to think that our form should be the form that lives ever after.*

Even more common, however, are scientists and CEOs who don’t think it would be a good thing for AI to destroy humanity, but who nonetheless seem to treat the extraordinary threat AI poses as shrug-worthy, as something other than an incredible emergency.

In a recent interview, Anthropic CEO Dario Amodei commented:

My chance that something goes quite catastrophically wrong on the scale of human civilization might be somewhere between 10 and 25 percent. […] What that means is that there’s a 75 to 90 percent chance that this technology is developed and everything goes fine!

This looks to us like a radical case of scope neglect, with all the hallmarks of a dysfunctional engineering culture. We can compare this way of thinking to, e.g., the standards structural engineers hold themselves to.

Bridge engineers generally aim at building bridges in such a way that the probability of serious structural failure over a fifty-year timespan is less than 1 in 100,000. Engineers in typical mature, healthy technical disciplines see it as their responsibility to keep risk to an exceptionally low level.

If a bridge’s chance of killing a single person were forecasted to be 10 to 25 percent, any sane structural engineer in the world would consider that beyond unacceptable, closer to a homicide than to normal engineering practice. Governments would shut the bridge down to traffic immediately.
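As a rough back-of-envelope comparison using only the figures above (a sketch rather than a precise apples-to-apples calculation, since the two risks are defined over different things):

$$\frac{0.10}{1/100{,}000} = 10{,}000 \qquad\qquad \frac{0.25}{1/100{,}000} = 25{,}000$$

On these numbers, the risk Amodei describes is on the order of ten thousand to twenty-five thousand times the threshold a bridge engineer treats as the upper limit for an entire fifty-year lifespan, and the bridge threshold applies to a single structure, not to civilization as a whole.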

AI researchers, in contrast, are accustomed to gathering around water coolers and trading “p(doom)” numbers — their subjective guess at how likely AI is to cause a catastrophe as serious as human extinction. These probabilities tend to be in the double digits. The former head of OpenAI’s superintelligence alignment team, for example, said that his “p(doom)” falls in the “more than 10 percent and less than 90 percent” range.

These numbers are ultimately just researchers’ guesses. Maybe they’re nonsense, maybe they’re not. Regardless, it’s remarkable how culturally normal it is, in the field of AI, to expect your work to have a substantial chance of causing the deaths of enormous numbers of people.

The idea of applying odds like that to the survival of the entire human species, and proceeding with the work anyway, would genuinely be hard for most civil engineers to wrap their heads around. The situation is extreme enough that we’ve encountered many people who doubt these scientists and CEOs could possibly be serious in their risk evaluations. Yet the arguments in If Anyone Builds It, Everyone Dies suggest that AI CEOs are, if anything, lowballing the danger.

Researchers at these companies are acclimated to risk levels that would be shockingly absurd by the standards of a bridge engineer. It’s difficult to understand, otherwise, how a CEO like Amodei can smile while calmly reassuring viewers that he thinks the odds of AI research causing civilization-level catastrophes are “between 10 and 25 percent.”

Living in Dreamland

One part of the puzzle, as discussed above, seems to be a cultural normalization of extreme risk.

Another part is a deadly stew of optimism bias and attachment to bright, hopeful ideas — the kind of error cognitive psychologists dub “the planning fallacy.”

It’s not all that surprising for the CEO of a bold new startup to overestimate their chances of success. That sort of person is more likely to take a stab at a problem in the first place.

The difference with AI isn’t that there are especially reckless people at the helm. It’s that the consequences of failure are much more dire than usual.

It’s common wisdom that you can’t trust a contractor when they say there’s only a 20 percent chance their giant bridge-building project will run behind schedule or experience cost overruns. That’s not how complex projects work in real life. There are going to be obstacles and surprises.

Maybe a veteran contractor backed up by years of experience and statistics could tell you that one in five of their bridge projects experiences some sort of overrun, and you might be able to trust that. But imagine instead that a bridge contractor, wanting to reassure you, said: “We don’t see any reason why this project might get difficult. This is our very first project, yes, but we think everything’s going to go fine. All of those engineers sending you serious letters about specific problems with setting up the retaining walls and digging in this particular area — they’re just negative Nancies, and you should ignore them. Sure, there’s always some chance of an issue; but we’re realistic and humble first-time bridge builders. We think there’s maybe a 20 percent chance that this project runs into obstacles and surprises, at worst.”

In a case like that, numbers like “20 percent” sound to us like the sort of thing someone says when they can’t deny that there’s some risk, but they don’t want to worry people. They don’t sound like estimates that are grounded in reality.

Aligning a superintelligence on the first try looks much more complicated than constructing a bridge, which is something humanity has done thousands of times before.

Even in a mature and technically grounded field like bridge-building, the kind of talk that we see from AI labs would be a bad sign, suggesting that those “20 percent chance this goes poorly” estimates are grossly optimistic. In a field without that grounding, where exciting ideas are free to proliferate without ever coming into contact with harsh realities, that kind of talk is a sign that nobody is anywhere near close to success.

And that kind of talk is utterly ubiquitous in AI among the subset of researchers and executives who are even willing to broach the topic of what happens if they succeed in their endeavors.

AI corporate leaders can’t spell out a plan for success that is even mildly detailed — a plan that addresses the key technical hurdles and difficulties that have been known in the field for over a decade.

Instead, corporate CEOs tend to be enamored with some high-level idea for why the problem isn’t going to be any trouble for them at all — an exciting vision that’s meant to trivialize all the engineering problems, like the visions we discussed in Chapter 11.

This, too, is a common pattern among human engineers. Unwarranted optimism about a pet solution (that won’t actually work) is something you see all the time, even among people who are otherwise geniuses.

Linus Pauling, one of the founders of molecular biology and a Nobel laureate in two different fields, advocated vitamin C megadosing as a cure for everything from cancer to heart disease; his insistence on this approach in the face of contrary evidence led to the creation of an entire industry of fake medicine.

Electric entrepreneur Thomas Edison, wanting to discredit his competitor’s alternating-current wiring in favor of Edison’s own direct-current designs, decided it would be a good PR move to pay an engineer to electrocute dogs. This tactic, shockingly, did not endear him to the public, yet Edison continued the practice even after a barrage of outrage.

Napoleon Bonaparte, a military genius by most accounts, precipitated his own downfall with a disastrous invasion of Russia. His mistake was not a lack of preparation, as he studied the region’s geography and spent nearly two years on the logistics of the campaign. His strategy required forcing the Russians into a decisive battle before his thirty days of supplies ran out. The Russians did not cooperate, the offensive stalled, and Napoleon lost half a million soldiers, along with most of his cavalry and artillery.

History is full of smart, powerful people doing unreasonable things up to and even past the brink of disaster. Beautiful-sounding ideas can be irresistible when they’re hard to test — or when you’ve found a way to convince yourself that you can ignore the test results in front of your eyes.

Feeling the ASI

To recap: People often fall into empty optimism about how easy a problem is going to be; people can acclimate to horrific risks; and people can become enamored with lovely-sounding but hopeless ideas, especially when they’re working in a young and immature field.

That’s more than enough to explain the reckless charge. But based on our experience, we would guess that it still isn’t the whole story.

Another plausible piece of the puzzle is that the engineers and CEOs don’t really quite believe what they’re saying. Not in a deep way. They might understand the arguments and find them compelling in the abstract, but that isn’t the same thing as feeling the belief.

What people say out loud in public, and what they tell themselves in the privacy of their own thoughts, and what their brains really actually anticipate happening to them, can often come unglued. Those three different threads of belief don’t have to all agree.

Back in 2015, when some of the big movers in the present-day disaster were just getting started, we suspect that talented executives could get some attention — and a few tens of millions of dollars in funding — by saying that AI was a world-ending issue, to funders who perhaps believed more sincerely that AI might be a world-ending issue.§

But, we suspect, many of the people saying those things didn’t really absorb and anticipate any particular detailed model of the world ending. They probably failed to viscerally imagine that they themselves might bring the world to ruin by pushing things forward or making a mistake. They didn’t imagine the sound of every human on the planet exhaling their last breath. They didn’t feel the feelings that would normally go with killing two billion children.

That kind of thing had never happened to them, and it had never happened to anyone they knew.

The world had not even seen ChatGPT, let alone a superintelligence. It wasn’t the sort of thing their friends and family and neighbors believed, not something they believed the way one believes in looking for traffic before crossing a street.

It was just an exciting-sounding story, too huge to properly grasp.

And yet it was also the sort of thing where saying it out loud could get you lots of money and respect.

As Yudkowsky (2006) notes:

In addition to standard biases, I have personally observed what look like harmful modes of thinking specific to existential risks. The Spanish flu of 1918 killed 25–50 million people. World War II killed 60 million people. 10⁷ is the order of the largest catastrophes in humanity’s written history. Substantially larger numbers, such as 500 million deaths, and especially qualitatively different scenarios such as the extinction of the entire human species, seem to trigger a different mode of thinking — enter into a “separate magisterium.” People who would never dream of hurting a child hear of an existential risk, and say, “Well, maybe the human species doesn’t really deserve to survive.”

There is a saying in heuristics and biases that people do not evaluate events, but descriptions of events — what is called non-extensional reasoning. The extension of humanity’s extinction includes the death of yourself, of your friends, of your family, of your loved ones, of your city, of your country, of your political fellows. Yet people who would take great offense at a proposal to wipe the country of Britain from the map, to kill every member of the Democratic Party in the U.S., to turn the city of Paris to glass — who would feel still greater horror on hearing the doctor say that their child had cancer — these people will discuss the extinction of humanity with perfect calm.

What could somebody actually be thinking when they say — before starting what would become the world’s foremost AI company — “AI will probably most likely lead to the end of the world, but in the meantime, there will be great companies”? Are they really, actually thinking about their friends being dead, their friends’ kids being dead, they themselves being dead, all of human history and all the museums turning to dust? Are they thinking of that really happening — all of it as mundane and tragic as a relative they actually saw die of cancer, except that it’s happening to everyone?

We suspect not.

To us, it seems like that’s not the most plausible guess at the internal psychological state of someone emitting such a sentence.

There’s what Bryan Caplan termed a “missing mood” in it. There’s no grieving. There’s no horror. There’s no desperate drive to do something about it, in the statement that AI will most likely lead to the end of the world, but in the meantime, there will be great companies.

For at least some of these CEOs and researchers, our guess is more like: They’ve heard a bunch of arguments about ASI maybe posing some danger, and they worry they’d look stupid in front of at least some of their friends if they blew that off entirely. If they say instead that AI will end the world, they’ll be seen as treating AI as dangerous and a big deal, and therefore sound visionary in certain circles. By adding a quip about “In the meantime, there will be great companies,” they get to send a message about how hip and unworried they are in the face of danger.

It’s not the sort of thing you say if you’re hearing the words coming out of your mouth, and believing them.

What Kind of Person Does It Take?

Another part of the story, perhaps, is that the people running the leading AI labs are the kinds of people who were able to convince themselves that building a superintelligence would be okay, despite (in almost all cases) having seen the arguments that this is lethal. (We know because we spoke to many of them beforehand.)

To understand why somebody chooses an option, it also helps to understand what their alternatives were — to understand what menu of options they were choosing from.

What happened if somebody in 2015 actually believed, and then said publicly, that they legitimately expected ASI to destroy the world? What if, instead of “but in the meantime, there will be great companies,” the heads of AI labs were the sort to break the mood and say, “and that is wildly unacceptable”?

We can tell you, because we tried out that approach ourselves. The answer is that they would be met with rather a dearth of sympathy.

Nobody in 2015 had seen ChatGPT. Nobody had seen the computers actually start talking and (to all appearances) start thinking. It was all hypothetical and dismissible.

These days, superintelligence and the threat of near-term extinction are mainstream topics, at least in tech circles. But back in 2015, if you talked about this seriously, people responded with the sort of puzzled look that many humans fear worse than death.

There were people who worried, even in 2015, that aligning superintelligence might actually be difficult, in the way that rocket launches are difficult. None of them founded OpenAI.

In recent years, with the emergence of ChatGPT and other LLMs, some people — including parents who want their children to live to see adulthood — have asked engineers at these AI companies why they are doing this. And those AI researchers thought quickly, and responded, “Oh, because — because if we don’t do it, China will do it first! And that will be even worse!”

But that isn’t what they said when OpenAI began. And it makes little sense given the posture that China has actually taken publicly, as of mid-2025. You would think that if someone genuinely believed that both of these outcomes were likely to be horrible for the world, they would at least raise the topic of drafting an international treaty, or of finding some other way to address the national security threat that didn’t involve a suicide race.

But the “China” rejoinder has the right feel. It gets the vibes right. It’s the sort of reason that might plausibly justify what they’re doing, separate from whether it’s their actual motivation or the thing that originally caused them to enter this field.

(Or so we guess.)

The people who genuinely understood superintelligence and the threat it poses simply didn’t start AI companies. The people who did are those who found some way to convince themselves that everything would be fine.

Normal Humans, Unusual Tech

We have spelled out the plausible psychology as we see it. But frankly, it doesn’t seem like all of these explanations are necessary.

How could people possibly do a self-destructive thing that is hugely profitable in the short term, that brings them tremendous status and attention and acclaim, that comes with the promise of untold riches and power, but which will eventually hurt them for obscure and complicated reasons they could easily find some excuse not to believe? That is a historically strange question. Behavior like that shows up all the time in history books.

At the end of the day, it doesn’t matter how the AI executives or researchers excuse their actions, and it isn’t necessary to understand which exact twists and turns each of them took to arrive at their current beliefs. It is not extraordinary for people with wealth or ambition to engage in reckless pursuits, and it is not extraordinary for subordinates to follow orders. The harms are hidden in the future, which feels abstract and easy to ignore.

This is all normal human behavior. If it carries on this way, it’ll end in the way these things often do, but with no one left behind this time to learn and try again.


* People like Sutton and Page seem to be operating under the illusion that greater intelligence leads to greater goodness, which we argued elsewhere is not the case. And while we authors happen to agree with Sutton and Page that it would be a tragedy to never build smarter-than-human AI, we think that racing to build superintelligence is likely to be completely catastrophic both for human life and for the long-term future more broadly, even from an inclusive, cosmopolitan, non-speciesist perspective.

 It wouldn’t be the first time a field acclimated to needlessly high risks. Anesthesiologists in the 1980s reduced their death rates by a factor of one hundred by adopting a simple set of monitoring standards.

Anesthesiologists appear to have spent decades causing hundreds of times as many deaths as they needed to, for literally no reason other than that they were thinking of their death rate as already low (by comparing it to, e.g., rates of surgical complications). They didn’t realize they should be shooting for a lower rate, as Hyman and Silver report:

By the 1950s, death rates ranged between 1 and 10 per 10,000 encounters. Anesthesia mortality stabilized at this rate for more than two decades.

[…W]e should consider why anesthesia mortality stabilized at a rate more than one hundred times higher than its current level for more than two decades. The problem was not lack of information. To the contrary, anesthesia safety was studied extensively during the period. A better hypothesis is that anesthetists grew accustomed to a mortality rate that was exemplary by health care standards, but that was still higher than it should have been. From a psychological perspective, this low frequency encouraged anesthetists to treat each bad outcome as a tragic but unforeseen and unpreventable event. Indeed, anesthetists likely viewed each individual bad outcome as the manifestation of an irreducible baseline rate of medical mishap.

 Structural engineers base their risk estimates on precise calculations and measurements, whereas “p(doom)” numbers are based mostly on AI researchers’ intuition. But this doesn’t inspire greater confidence in AI researchers’ engineering practices. If anything, it makes the situation worse.

A less robust, more subjective estimate can systematically err in the direction of “too pessimistic,” but it can also err in the direction of “too optimistic.” The fact that these numbers are less reliable doesn’t establish them as specifically biased toward pessimism. The fact that AI researchers can’t ground their risk estimates in anything more than hunches and qualitative arguments, even as they manage to grow smarter and smarter AIs year over year, is a further reason to be concerned.

The fact that AI researchers’ estimates are genuinely terrifying and completely unprecedented in any technical discipline doesn’t establish that they’re wrong in the direction we would like them to be wrong. Racing to build vastly smarter-than-human autonomous agents sounds like the kind of endeavor that is likely to have far greater than a 50 percent chance of causing a catastrophe. Before we even dive into the details, this sounds like the kind of project that is very likely to go wrong in one way or another, and the kind where going wrong is liable to have enormous consequences. And the details, as we’ve argued in Chapters 4, 5, and throughout the book, paint a grimmer picture than even this first-pass look would suggest.

§ See also our discussion of the people who were warning about an AI race to the bottom years before these companies formed.

Notes

[1] scope neglect: See, e.g., Kahneman et al.’s paper “Economic Preferences or Attitude Expressions?: An Analysis of Dollar Responses to Public Issues.”

[2] the planning fallacy: See, e.g., Kahneman and Tversky’s Intuitive Prediction: Biases and Corrective Procedures.
