In the spring of 2023, Geoffrey Hinton addressed a crowd at MIT’s EmTech Digital conference from his home, over the kind of awkward remote setup that had become commonplace since the pandemic. He was 75 years old, had just left Google, and his remarks silenced the room. “It may keep us around for a while to keep the power stations running,” he said of artificial intelligence systems. “But after that, maybe not.”
Hinton had spent more than fifty years developing the mathematical underpinnings of contemporary artificial intelligence: neural networks, deep learning, the architectures that eventually evolved into GPT, Gemini, Claude, and every other well-known large language model. When someone with that background says he does not know how to stop what he helped create, and is not sure a solution exists, it is worth paying attention. Not because doom is inevitable (it isn’t), but because the specific problem he and other theoretical computer scientists have raised keeps becoming more pertinent as these systems get bigger.
| Field | Details |
|---|---|
| Topic | Theoretical computer scientists warning that AI development has outpaced scientific understanding |
| Key Figures | Geoffrey Hinton (“Godfather of AI,” former Google, Turing Award 2018), Yoshua Bengio (University of Montreal, Turing Award 2018), Yann LeCun (Meta) |
| Central Warning | Engineering capacity is outpacing scientific understanding of how AI systems actually work |
| Core Problem 1 | “Black Box” opacity — researchers cannot explain how models reach specific conclusions |
| Core Problem 2 | Statistical pattern matching vs. genuine understanding — LLMs simulate reasoning without comprehension |
| Core Problem 3 | “Clever Hans Effect” — AI reaches correct conclusions for incorrect reasons |
| Core Problem 4 | Hallucinations, emergent behaviors, and inability to predict model behavior at scale |
| Safety Assessment | Future of Life Institute (2025): No major AI firm scored above a “D” for existential safety planning |
| Regulatory Context | EU AI Act tightening; White House framework proposed; state-level patchwork laws in the U.S. |
| Hinton Quote | “I wish I had a nice simple solution I could push, but I don’t. I’m not sure there is a solution.” |
| Key Concern | AI agents increasingly controlling real-world infrastructure with behavior that cannot be reliably predicted |

Strip away the apocalyptic framing that frequently permeates public coverage, and the core concern is that engineering has outrun science. The AI industry is building and deploying systems at remarkable scale without a solid, testable theory of how those systems actually work. The models produce outputs, and the outputs are often remarkably good. Researchers can see what goes into the system and what comes out, but the billions of parameters in between, the actual mechanism of whatever the system is doing, remain opaque. Not slightly ambiguous. Genuinely, stubbornly opaque.
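To make that opacity concrete, here is a minimal, hypothetical sketch in Python. The weights below are random placeholders standing in for a trained model; the point is that every parameter and every intermediate activation is fully visible, and none of it explains the decision.

```python
import numpy as np

# A toy two-layer network standing in for a trained model. A real LLM has
# billions of these parameters rather than dozens; the values here are
# random placeholders for illustration only.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 1)), rng.normal(size=1)

x = np.array([0.2, -1.3, 0.7, 0.05])     # input: fully visible
h = np.tanh(x @ W1 + b1)                  # hidden activations: fully visible
y = 1 / (1 + np.exp(-(h @ W2 + b2)))      # output: fully visible

print("hidden activations:", np.round(h, 3))
print("output:", np.round(y, 3))
# Every number in W1, W2, and h can be inspected, yet none of them answers
# "why did the model produce this output?" That gap, multiplied by billions
# of parameters, is the black-box problem.
```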
This matters not only in theory but in ways already showing up in practice. Medical AI researchers discovered a system that had learned to identify disease markers in hospital scans. By the numbers, it was accurate. But when they examined its decision-making, they found it was relying in part on the particular font used by the hospital’s label printer, rather than on the biological tissue patterns that signal illness. The training data contained a spurious correlation, and the model had seized on it and baked it into its decisions. Until someone looked closely, nobody knew. The research community calls this the Clever Hans effect, after the famous German horse that appeared to solve arithmetic problems but was actually reading subtle cues from its handler. AI systems frequently do something similar, arriving at the correct answer for reasons that have nothing to do with understanding the question.
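The failure mode is easy to reproduce in miniature. The sketch below is hypothetical (it is not the medical study described above): a classifier is trained on data where a “label font” feature happens to correlate perfectly with the diagnosis, and it quietly learns the shortcut instead of the weak biological signal.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical Clever Hans demo. "biology" is the real but noisy signal;
# "font" is an accident of data collection that perfectly tracks the label
# in the training hospital.
rng = np.random.default_rng(42)
n = 1000
y = rng.integers(0, 2, size=n)
biology = y + rng.normal(scale=2.0, size=n)   # weak genuine signal
font = y.astype(float)                        # spurious shortcut
X_train = np.column_stack([biology, font])

model = LogisticRegression().fit(X_train, y)
print("train accuracy:", model.score(X_train, y))       # looks excellent

# A second hospital prints every label in the same font, so the shortcut
# vanishes and accuracy collapses.
X_deploy = np.column_stack([biology, np.zeros(n)])
print("deployment accuracy:", model.score(X_deploy, y))
print("weights [biology, font]:", model.coef_[0])       # the font dominates
```

Nothing in the training metrics flags the problem; it surfaces only when the spurious correlation breaks, which is exactly why the hospital case went unnoticed until someone audited the model.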
Yoshua Bengio, who shared the 2018 Turing Award with Hinton and is perhaps the more measured of the two, has made a related but distinct argument. He worries less about existential catastrophe in the far future than about how these systems can be weaponized in the near term: disinformation campaigns, cyberattacks, mass manipulation. “You can have a conversation with these systems and think that you’re interacting with a human,” Bengio said in 2023. “They’re difficult to spot.” The systems have improved considerably since then. The problem he described has gotten harder, not easier.
The regulatory response to this debate has, as of 2026, been strikingly reactive. The EU AI Act represents the most significant attempt to impose structural requirements on AI development, pushing toward “white box” or interpretable models: systems whose decision-making can actually be audited. In 2025, the Future of Life Institute evaluated the top AI developers and found that none scored above a D on existential safety planning. For a sector wiring its technology into critical infrastructure, financial systems, and healthcare, that is not a comforting grade.
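For a rough sense of what “interpretable” means in practice, here is a minimal Python sketch (using scikit-learn and a standard public dataset, chosen purely for illustration; it is not drawn from the Act itself). A small decision tree’s entire decision logic can be printed and audited line by line, which is exactly what a billion-parameter network does not offer.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

# A "white box" in miniature: every decision path of this tree is legible.
data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# Prints human-readable if/then rules for every path through the tree.
print(export_text(tree, feature_names=list(data.feature_names)))
```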
Having watched this particular debate for the past few years, my impression is that the people most eager to move fast tend to write off those raising these concerns as alarmists. The counterargument has some merit: worrying about systems that do not yet exist can divert attention from real, immediate risks. But the two concerns are not in conflict. You can worry about algorithmic bias in hiring systems and still object to deploying a trillion-parameter model in a hospital; the issues don’t cancel each other out. What stands out is the consistency with which the scientists who built these tools say the industry is outpacing its own understanding. That is not a niche academic complaint. It is the kind of thing that tends to matter enormously when something goes wrong.
