Will AI cross critical thresholds and take off?
Probably.
Modern AI progress looks incremental, from some points of view.* For instance, an organization called METR has been tracking the ability of AIs to complete long tasks, and that ability has been improving along a roughly exponential curve over the last few years. One could argue that this is comfortingly incremental.† Does that mean that AI progress will be nice and slow and predictable?
Not necessarily. Just because some quantity goes up slowly or smoothly or incrementally doesn’t mean that the results are always tame. Nuclear fission happens on a continuum, but there’s a pretty big difference between a chain reaction that produces, on average, less than one new neutron per neutron (in which case the reaction peters out) and one that produces more than one new neutron per neutron (in which case the reaction runs away).
But there’s not a sharp difference in the underlying mechanics between the two types of nuclear reactions. You add a little more uranium and the “neutron multiplication factor” moves smoothly from just below one to just above one. Supercritical reactions aren’t caused by neutrons that hit the uranium atoms so hard that they create superneutrons. A little more of the same underlying stuff causes a big macroscopic change. This is called a “threshold effect.”
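To see the threshold in miniature, here is a toy calculation (the numbers are illustrative, and this is nothing like a real physics simulation): suppose each neutron produces k new neutrons on average, and watch what happens to a population of a thousand neutrons over a hundred generations as k creeps past one.

```python
# Toy illustration, not a physics simulation: if each neutron yields k new
# neutrons on average, the expected population after n generations is
# start * k**n. A small, smooth change in k flips the qualitative outcome.

def neutrons_after(k, generations=100, start=1000.0):
    """Expected neutron count if each neutron yields k successors on average."""
    return start * k ** generations

for k in (0.95, 0.99, 1.01, 1.05):
    print(f"k = {k:.2f}: ~{neutrons_after(k):,.0f} neutrons after 100 generations")

# k = 0.95: ~6 neutrons (the reaction peters out)
# k = 0.99: ~366 neutrons
# k = 1.01: ~2,705 neutrons
# k = 1.05: ~131,501 neutrons (runaway growth)
```

Nudging k from 0.95 to 1.05 is a small, smooth change in the inputs, but it is the difference between the reaction dying out and the reaction exploding.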
The case of humans versus chimpanzees looks like evidence that there’s at least one threshold effect in play when it comes to intelligence. Humans aren’t all that anatomically different from other animals. A human brain and a chimpanzee brain look very similar on the inside; we’ve both got a visual cortex and an amygdala and a hippocampus. Humans don’t have a special extra “engineering” module that explains why we can go to the moon and they can’t.
There are some wiring differences, and we have a more developed prefrontal cortex than other primates. But at the level of gross anatomy, the main difference is that our brains are three or four times larger. We’re basically running a larger and slightly upgraded version of the same hardware.
And the changes weren’t sudden, in our lineage. Our ancestors’ brains just kept getting slightly bigger and slightly better, one step at a time. That was enough for a giant qualitative gap to open up quite quickly (on the timescales of evolution).
If it can happen with humans, it can probably happen with AIs too.
We don’t know how far AIs are from the thresholds.
If we knew exactly what happened in humans that allowed us to cross the threshold to general intelligence, we might know what to look out for to know that some critical threshold was nearby. But as we’ll discuss in Chapter 2, we don’t have that level of understanding of intelligence. So we’re flying blind, with no idea where the thresholds are or how close we are to them.
Recent advances in AI have yielded better abilities at solving math problems and playing chess, but they haven’t been enough to get AIs “all the way.” Maybe all it takes is a model that’s another three or four times larger, like the difference between chimpanzee brains and human brains. Or maybe not! Maybe an entirely different architecture and a decade of scientific advancement will be required, much as modern chatbots come from a novel architecture that was invented in 2017 (and that didn’t mature until 2022).
What changes in human brains caused us to cross a critical threshold? Perhaps it was our ability to communicate. Perhaps it was our ability to grasp abstract concepts in ways that enabled communication to be so valuable. Perhaps we’re thinking in the wrong terms entirely, and the key change was something weird that isn’t on our radar today. Perhaps it was a big mixture of factors, where each one of them needed to be mature enough that they could all combine into the sort of intelligence that can put humans on the moon.
We don’t know. And because we don’t know, we can’t look at a modern AI and know how close or far it is from that same critical threshold.
The dawn of science and industry radically changed human civilization. The dawn of language may have been similarly consequential for our ancestors. But if so, there’s no guarantee that either of those capabilities will act like a “critical threshold” for AI, because unlike humans, AIs had some amount of knowledge of language and science and industry from the get-go.
Or perhaps the critical threshold for humanity was a mix of many factors, where each and every one of them needed to be “good enough” for the whole system to come together. AIs could lag in some capabilities that hominids were better at, like long-term memory, while still exhibiting an important jump in practical ability once the last piece clicks into place.
Even if none of those analogies between AIs and humans turn out to hold, there will likely be other dynamics that make AI progress choppy and hard to predict.
Maybe deficits in long-term memory and continuous learning are holding AIs back in a manner that never hindered humans. Maybe once those issues are fixed, something will “click” and the AI will seem to obtain some “spark” of intelligence.
Or (as discussed in the book) consider the point where AIs can build smarter AIs, which themselves build even smarter AIs, in a feedback loop. Feedback loops are a common cause of threshold effects.
For all we know, there are a dozen different factors that could serve as the “missing piece,” such that once an AI lab fits that last piece into place, its AI really starts to take off and separate from the pack, much as humanity separated from the rest of the animals. The critical moments might come at us fast. We don’t necessarily have all that much time to prepare.
Takeoff speed doesn’t affect the outcome, but the possibility of fast takeoff means we must act soon.
Thresholds don’t matter all that much, in the end, to the argument that if anyone builds artificial superintelligence then everyone dies. Our arguments don’t require that some AI figures out how to recursively self-improve and then becomes superintelligent with unprecedented speed. That could happen, and we think it’s decently likely that it will happen, but it doesn’t matter to the claim that AI is on track to kill us all.
All that our arguments require is that AIs will keep on getting better and better at predicting and steering the world, until they surpass us. It doesn’t matter much whether that happens quickly or slowly.
The relevance of threshold effects is that they increase the importance of humanity reacting to the threat soon. We don’t have the luxury of waiting until the AI is a little better than every human at every mental task, because by that point, there might not be very much time left at all. That would be like looking at early hominids making fire, yawning, and saying, “Wake me up when they’re halfway to the moon.”
It took hominids millions of years to travel halfway to the moon, and two days to complete the rest of the journey. When there might be thresholds involved, you have to pay attention before things get visibly out of hand, because by that point, it may well be too late.
* From other points of view, it looks rather jumpy. AlphaGo beating Lee Sedol at Go was something of a shock to the world, even though researchers can, after the fact, plot graphs showing how various AI methods were improving in the background all the while. So too with the LLM revolution: researchers can plot graphs showing that the transformer architecture wasn’t that big of a boost compared to competing architectures, but the practical upshot was that AIs got qualitatively more useful. But we’ll set that viewpoint aside for now.
† Exponential growth is not exactly comforting, in this case. If bacteria in a petri dish double every hour, it will take a day or two before the colony is visible to the naked eye, and only a few more hours after that before it coats the whole dish. By the time you’re noticing the phenomenon at all, most of your time is already gone. As the saying goes, there are only two ways to react to exponential change: too early or too late. But regardless, the curve is at least fairly smooth and predictable.
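To spell out the arithmetic behind that footnote (the doubling time comes from the footnote; the cell counts are round numbers picked purely for illustration):

```python
import math

# Back-of-the-envelope version of the petri-dish footnote. The cell counts
# below are made-up round numbers chosen for illustration, not microbiology.
DOUBLING_TIME_HOURS = 1
VISIBLE_AT = 1e7       # assumed: colony becomes visible to the naked eye
FULL_DISH_AT = 1e10    # assumed: colony coats the whole dish

hours_to_visible = math.log2(VISIBLE_AT) * DOUBLING_TIME_HOURS   # ~23 hours
hours_to_full = math.log2(FULL_DISH_AT) * DOUBLING_TIME_HOURS    # ~33 hours

print(f"~{hours_to_visible:.0f} hours before anyone can see the colony")
print(f"~{hours_to_full - hours_to_visible:.0f} more hours until it coats the dish")
```

On those assumptions, roughly two-thirds of the available time passes before the colony is visible at all, and the final thousandfold of growth happens in the last several hours.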
Notes
[1] three or four times larger: It doesn’t take all that long for AIs to grow by a factor of three or four. On its full official release, GPT-2 had about 1.5 billion parameters. GPT-3 had 175 billion parameters. The official parameter count for GPT-4 has not, to our knowledge, been released, but it’s unlikely to be smaller than its predecessor’s; an unofficial estimate placed it at about 1.8 trillion parameters. In other words: AI got roughly a thousand times larger over a span of four years.
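For readers who want the multiplication spelled out, here is the arithmetic implied by those figures (keeping in mind that the GPT-4 number is an unofficial estimate, so the final ratio is order-of-magnitude at best):

```python
# Rough scaling arithmetic from the note above. GPT-4's parameter count is an
# unofficial estimate, so treat the final ratio as order-of-magnitude only.
gpt2_params = 1.5e9    # ~1.5 billion (full GPT-2 release, 2019)
gpt3_params = 175e9    # 175 billion (2020)
gpt4_params = 1.8e12   # ~1.8 trillion (unofficial estimate, 2023)

print(f"GPT-2 -> GPT-3: ~{gpt3_params / gpt2_params:.0f}x")   # ~117x
print(f"GPT-3 -> GPT-4: ~{gpt4_params / gpt3_params:.0f}x")   # ~10x
print(f"GPT-2 -> GPT-4: ~{gpt4_params / gpt2_params:.0f}x")   # ~1200x
```

That last ratio is where the “roughly a thousand times larger over a span of four years” figure comes from.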