AI Experts on Catastrophe Scenarios

In a 2022 survey of 738 researchers who had published at the academic AI conferences NeurIPS and ICML, 48 percent of respondents thought there was at least a 10 percent chance that AI’s outcome would be “extremely bad (e.g., human extinction).” Concerns about AI causing an unprecedented disaster are widespread in the field.

Below, we’ve collected comments from prominent AI scientists and engineers on catastrophic AI outcomes. Some of these scientists give their “p(doom)” — i.e., their probability of AI causing human extinction or similarly disastrous outcomes.*

From Geoffrey Hinton (2024), recipient of a Nobel Prize and a Turing Award for sparking the deep learning revolution in AI, speaking on his personal estimates:

I actually think the risk [of the existential threat] is more than 50 percent.

From Yoshua Bengio (2023), Turing Award recipient (with Hinton and Yann LeCun) and the most cited living scientist:

We don’t know how much time we have before it gets really dangerous. What I’ve been saying now for a few weeks is “Please give me arguments, convince me that we shouldn’t worry, because I’ll be so much happier.” And it hasn’t happened yet. […] I got around, like, 20 percent probability that it turns out catastrophic.

From Ilya Sutskever (2023), co-inventor of AlexNet, former chief scientist at OpenAI, and (with Hinton and Bengio) one of the three most highly cited scientists in AI:

[T]he vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction. While superintelligence seems far off now, we believe it could arrive this decade. […]

Currently, we don’t have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue. Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans’ ability to supervise AI. But humans won’t be able to reliably supervise AI systems much smarter than us, and so our current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs.

From Jan Leike (2023), alignment science co-lead at Anthropic and former co-lead of the superalignment team at OpenAI:

[interviewer: “I didn’t spend a lot of time trying to precisely pin down my personal p(doom). My guess is that it’s more than 10 percent and less than 90 percent.”]

[Leike:] That’s probably the range I would give too.

From Paul Christiano (2023), head of AI safety at the U.S. AI Safety Institute (based in NIST) and inventor of reinforcement learning from human feedback (RLHF):

Probability that most humans die within 10 years of building powerful AI (powerful enough to make human labor obsolete): 20% […]

Probability that humanity has somehow irreversibly messed up our future within 10 years of building powerful AI: 46%

From Stuart Russell (2025), Smith-Zadeh Chair in Engineering at UC Berkeley and co-author of the top undergraduate AI textbook, Artificial Intelligence: A Modern Approach:

The “AGI race” between companies and between nations is somewhat similar [to the Cold War race to build larger nuclear bombs], except worse: Even the CEOs who are engaging in the race have stated that whoever wins has a significant probability of causing human extinction in the process, because we have no idea how to control systems more intelligent than ourselves. In other words, the AGI race is a race towards the edge of a cliff.

From Victoria Krakovna (2023), research scientist at Google DeepMind and co-founder of the Future of Life Institute:

[interviewer: “This is not a very pleasant thing to think about, but what would you consider is the probability of Victoria Krakovna dying from AI before 2100?”]

[Krakovna:] I mean, 2100 is very far away, especially given how quickly the technology’s developing right now. I mean, off the top of my head, I would say like 20 percent or something.

From Shane Legg (2011), co-founder and chief AGI scientist at Google DeepMind:

[interviewer: “What probability do you assign to the possibility of negative/extremely negative consequences as a result of badly done AI? […] Where ‘negative’ = human extinction; ‘extremely negative’ = humans suffer”]

[Legg:] [W]ithin a year of something like human level AI[…] I don’t know. Maybe 5%, maybe 50%. I don’t think anybody has a good estimate of this. If by suffering you mean prolonged suffering, then I think this is quite unlikely. If a super intelligent machine (or any kind of super intelligent agent) decided to get rid of us, I think it would do so pretty efficiently.

From Emad Mostaque (2024), founder of Stability AI, the company behind Stable Diffusion:

My P(doom) is 50%. Given an undefined time period the probability of systems that are more capable than humans and likely end up running all our critical infrastructure wiping us all out is a coin toss, especially given the approach we are taking right now.

From Daniel Kokotajlo (2023), AI governance specialist, OpenAI whistleblower, and executive director of the AI Futures Project:

I think AI doom is 70% likely and I think people who think it is less than, say, 20% are being very unreasonable[.]

From Dan Hendrycks (2023), machine learning researcher and director of the Center for AI Safety:

[M]y p(doom) > 80%, but it has been lower in the past. Two years ago it was ~20%.

All of the above researchers signed the Statement on AI Risk we opened the book with, which says:

Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.

Other prominent researchers who signed the statement included: ChatGPT architect John Schulman; former Google director of research Peter Norvig; Microsoft chief scientific officer Eric Horvitz; AlphaGo research lead David Silver; AutoML pioneer Frank Hutter; reinforcement learning pioneer Andrew Barto; GANs inventor Ian Goodfellow; former Baidu president Ya-Qin Zhang; public-key cryptography inventor Martin Hellman; and Vision Transformer research lead Alexey Dosovitskiy. The list goes on, with further signatories including: Dawn Song, Jascha Sohl-Dickstein, David McAllester, Chris Olah, Been Kim, Philip Torr, and hundreds of others.

* We have concerns with the practice of trying to assign a “p(doom).” Assigning a single probability — as opposed to multiple probabilities that each assume a different response society could choose — strikes us as defeatist. There’s a world of difference between somebody who has high p(doom) because they think the world mostly can’t prevent catastrophe, versus somebody who has high p(doom) because they think the world can prevent catastrophe but won’t.

If it turns out that most people have a high p(doom) for the latter reason, but everyone assumes it’s for the former reason, then people’s statements of high p(doom) could serve as a self-fulfilling prophecy, putting us on track for a disaster that was completely preventable.

We also have the impression that many people in Silicon Valley trade “p(doom)” numbers a bit like baseball cards, in a way that often seems divorced from reality. If you’re paying attention, then even a probability as low as 5 percent of killing every human being on the planet should be an obvious cause for extreme alarm. It’s far beyond the threat level you would need to justify shutting down the entire field of AI immediately. People seem to lose sight of this reality surprisingly quickly, once they get into the habit of ghoulishly trading p(doom) numbers at parties as though the numbers were a fun science fiction story and not a statement about what’s actually going to happen to all of us.

This isn’t to say that people’s p(doom) numbers are anywhere close to reality. But at the very least, you should read these numbers as experts throughout the field reporting that we’re facing a genuine emergency.

Contrary to what Hinton says elsewhere in the video from which his quote above is taken, Yudkowsky’s confidence regarding the dangers is not “99.999” percent; five nines would constitute an insane degree of confidence.
