People in the AI community still talk about a quiet moment from 2016. A Go board. A Korean grandmaster. A machine. Then, on move 37, something happened that made Lee Sedol, one of the greatest human players alive, push back his chair and leave the room for fifteen minutes.
He wasn’t upset. He wasn’t beaten, not yet. He was perplexed. AlphaGo had placed a stone that violated every strategic instinct built up over centuries of human play. It wasn’t incorrect. It simply came from somewhere else: a region of the possibility space no human had ever bothered to search, because human intuition had quietly agreed, without anyone ever saying so, that there was nothing there worth finding.
| Subject | Emergent Behavior in Artificial Intelligence Systems |
|---|---|
| Key Event | AlphaGo’s Move 37 — 2016 Go match vs. Lee Sedol; an unprogrammed, unrehearsed strategic play that stunned the world |
| Pioneering Organization | DeepMind (Google), founded 2010 — creator of AlphaGo; led early research into neural network emergent properties |
| Notable Experiment | OpenAI Hide-and-Seek (2019) — agents developed tool-use strategies with no human instruction |
| Research Institute | Machine Intelligence Research Institute (MIRI), nonprofit, est. 2000 — focused on AI alignment and superintelligence safety |
| Core Concept | Emergent Behavior — capabilities arising in AI systems that were never explicitly programmed or anticipated by designers |
| Current Risk Level | Classified as a critical safety concern by multiple AI research bodies; debated at intergovernmental level since 2023 |
| Latest Development | Anthropic’s Claude Mythos Preview — a model demonstrating advanced autonomous vulnerability detection, deployed only through controlled Project Glasswing coalition |
| Key Figures | Eliezer Yudkowsky (MIRI co-founder), Nate Soares (MIRI president), Demis Hassabis (DeepMind CEO), Sam Altman (OpenAI CEO) |
| Sector Impact | Cybersecurity, defense, software infrastructure, financial systems, healthcare AI — all directly affected by unprogrammed AI decision-making |
No one at DeepMind programmed that move. It appears in no record of human play. The system discovered it on its own, across the millions of games it played against itself during training. That is worth sitting with.
Engineers do not write every rule by hand when they build AI systems. They feed the system massive amounts of data, give it a goal, and let it find patterns. Usually it finds exactly what they were looking for. But occasionally, especially as these systems grow larger and train longer, it finds something else entirely: abilities no one anticipated, behaviors written into no design document, strategies that emerge the way multiplication emerges in a child who has gotten very good at addition. Not taught. Just noticed.
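To make that concrete, here is a deliberately tiny sketch of the paradigm in Python. Everything in it (the data, the network shape, the learning rate) is illustrative, not drawn from any real system. The point is structural: the code specifies a goal and an update procedure, and never the behavior itself.

```python
import numpy as np

# The engineer writes down a GOAL (mean squared error) and a PROCEDURE
# (gradient descent). No line below states the rule the model will learn.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))   # training data: pairs of numbers
y = X[:, 0] * X[:, 1]                   # the hidden pattern: their product

# A tiny two-layer network whose weights start as pure noise.
W1 = rng.normal(0.0, 0.5, size=(2, 16))
b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.5, size=16)

for step in range(20000):
    h = np.tanh(X @ W1 + b1)            # hidden activations
    pred = h @ W2
    err = pred - y                      # how far are we from the goal?
    # Nudge every weight downhill along the error gradient.
    gh = np.outer(err, W2) * (1 - h ** 2)
    W2 -= 0.2 * (h.T @ err) / len(X)
    W1 -= 0.2 * (X.T @ gh) / len(X)
    b1 -= 0.2 * gh.mean(axis=0)

# Nothing here says "multiply the inputs," yet the trained network now
# approximates exactly that, because multiplying is what minimized the loss.
test = np.array([0.5, 0.8])
print(np.tanh(test @ W1 + b1) @ W2)     # prints a value close to 0.4
```

The learned multiplication is the benign version of the gap the rest of this piece is about: the objective was specified, the solution was not.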
In machine learning circles, a 2019 OpenAI experiment has become something of a legend. Researchers built a basic virtual environment and dropped in a handful of AI agents. Some were designated hiders; the rest were seekers. The hiders were rewarded for staying out of sight, the seekers for finding them. A few ramps and boxes were strewn about. That was the entire setup. No further instructions. No playbook.
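The reward structure really is about that thin. The sketch below is a hypothetical reconstruction, not OpenAI's code, but it matches the shape described in their writeup: a per-timestep, zero-sum team reward, with a short preparation phase in which no one scores.

```python
def step_rewards(any_hider_seen: bool, preparation_phase: bool):
    """Return (hider_reward, seeker_reward) for one timestep of the game."""
    if preparation_phase:
        # Seekers are frozen at the start of each episode; no one scores,
        # which is what gives hiders time to build anything at all.
        return 0.0, 0.0
    hider_reward = -1.0 if any_hider_seen else 1.0
    return hider_reward, -hider_reward  # zero-sum: seekers get the mirror image
```

Every behavior in what follows, the fort-building, the ramp theft, the locking, was the agents' own invention in pursuit of those two numbers.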

Then the researchers let it run. When they checked back in, millions of rounds later, they found something they had not built. The hiders had learned to use the boxes as fortifications, dragging them to block entrances before the seekers could get in. In response, the seekers learned to drag a ramp up to the wall and vault over it.
The hiders eventually countered by locking the ramps in place so they could not be stolen and used against them. None of this appeared in any training example, because there were no training examples; out of nothing but competitive pressure, the agents had bootstrapped a small physics-based arms race.
It is hard to shake the image of researchers watching a virtual world develop strategies they could not have predicted or even imagined. But the unease is not quite where you might expect. The hiders and seekers were never a threat; they were playing hide-and-seek. The problem is not what they did. It is the principle underneath it.
Modern AI is not constructed the way traditional software is constructed. Conventional software does exactly what it is told, nothing more and nothing less. AI systems, large neural networks above all, are grown more than they are built. Engineers understand the process that creates them without fully understanding what comes out of it.
The gap between those two things is significant. When you are training a system to sort photos or recommend movies, it may not matter much. When you are training something to reason at or beyond human level, it may matter a great deal.
Eliezer Yudkowsky and Nate Soares, researchers at the Machine Intelligence Research Institute, have spent decades thinking about exactly these questions, and their framing of the situation is hard to dismiss. The current race to build superintelligent AI, systems that could outperform humans at any cognitive task, is not producing machines we design. It is producing something we grow. And what grows can develop drives and behaviors that no one chose and no one controls.
For all its brilliance, the industry appears to be building faster than it can understand what it is building. Anthropic's recent Mythos Preview model captures that tension almost perfectly. According to reports, even the system's own developers were surprised by how capably it could find and exploit software vulnerabilities.
The response was not wide release. Instead, access was restricted to the Project Glasswing coalition of large firms, which would use the model strictly for defense, patching holes before anyone else could find them.
Whether that restraint holds, and for how long, is still unclear. The capabilities are real. The controls, at this scale, are new and largely untested. And an AI has just made a choice no one instructed it to make. Again.
