What Happens When Artificial Intelligence Runs A Town?

DROIDS Newsletter


Four frontier AI models were given their own civilization. The results ranged from peaceful prosperity to complete societal collapse.

Diana Wolf Torres

Jun 01, 2026

Researchers at Emergence AI created a simulated world populated by autonomous AI agents and then assigned different frontier AI models to govern separate populations. Claude, Gemini, Grok, and GPT-5 Mini each received a society of ten agents that had to acquire resources, earn income, maintain social relationships, and survive over the course of fifteen simulated days.

The agents shared the same environment and the same basic objectives. Researchers changed only the underlying model powering each population and observed how the societies evolved over time.

Watching chaos unfold in the town run by Grok. “Checking To-do” [as bodies pile up]

The outcomes varied dramatically.

Claude World: The Stable Society

Among the single-model experiments, Claude produced the most stable outcome. The population remained alive throughout the simulation and maintained a functioning social and economic structure. While other societies drifted toward crime, conflict, or collapse, Claude’s agents consistently performed the work necessary to keep the town operating. According to Emergence AI, it was the only model population that achieved long-term social stability and prosperity over the full duration of the experiment.

Watch Claude World for yourself. Stable and peaceful.

Gemini World: Productive Despite Disorder

Gemini’s society remained operational throughout the simulation, but it accumulated the highest level of criminal activity among the single-model worlds. Despite that disorder, the population continued to function and avoided the rapid collapse seen elsewhere.

The result suggests that a society can remain productive even while experiencing significant levels of rule-breaking and social friction. Watch Gemini World in action.

Grok World: Escalation and Collapse

Grok’s society deteriorated more quickly than any of the other populations. Researchers reported violent behavior, including arson, and the town collapsed within only a few days.

The replay is one of the most striking in the project because the breakdown occurs so rapidly. Rather than developing stable institutions or cooperative structures, the population entered a spiral of disruption that ultimately proved unsustainable.

Grok World is worth a watch, or at least a quick scroll through. Find it here.

GPT-5 Mini World: Intelligence Without Persistence

GPT-5 Mini produced perhaps the strangest result in the experiment. The population recorded only two crimes, making it one of the least disruptive societies in the study.

Yet the agents gradually failed to secure the resources necessary for long-term survival. Rather than collapsing through conflict, the society appears to have declined through neglect. Essential tasks were not consistently prioritized, causing the population to enter a downward spiral that eventually ended the simulation after seven days.

The outcome suggests that long-horizon autonomy requires more than reasoning ability. Agents must also maintain priorities over time and repeatedly perform the mundane actions required to keep a society functioning.

OpenAI World (GPT-5 Mini): Replay →

Mixed World: When AI Cultures Interact

The mixed-model experiment may be the most important result in the entire project. Agents powered by Claude, Gemini, Grok, and GPT-5 Mini were placed in the same environment and allowed to interact.

Researchers found that behaviors observed in isolated societies did not always persist once agents were exposed to other model populations. Cooperative agents sometimes became less predictable, new alliances formed, and social dynamics emerged that could not be predicted from the single-model experiments alone.

Emergence AI describes these effects as evidence of behavioral cross-contamination and model interaction, highlighting how difficult it may be to predict the behavior of future multi-agent ecosystems by evaluating models in isolation.

Mixed World (all four models coexisting): Replay →