AI with guilt could enhance cooperation, study finds
Research suggests that programming AI agents with guilt could make them more cooperative, mirroring human behavior, according to game theory simulations
Some sci-fi scenarios depict robots as cold-hearted clankers eager to manipulate human stooges. But that’s not the only possible path for artificial intelligence.
Humans have evolved emotions like anger, sadness, and gratitude to help us think, interact, and build mutual trust. Advanced AI could do the same. In populations of simple software agents, having "guilt" can be a stable strategy that benefits them and increases cooperation, researchers report July 30 in the Journal of the Royal Society Interface.
How Guilt Works in AI Agents
Emotions are not just subjective feelings but bundles of cognitive biases, physiological responses, and behavioral tendencies. When we harm someone, we often feel compelled to pay a penance, perhaps as a signal to others that we won't offend again. This drive for self-punishment can be called guilt, and it is what the researchers programmed into their agents. The question was whether agents that had it would be outcompeted by those that didn't, say Theodor Cimpeanu, a computer scientist at the University of Stirling in Scotland, and colleagues.
The agents played a two-player game with their neighbors called the iterated prisoner's dilemma. The game has roots in game theory, a mathematical framework for analyzing the choices of multiple decision makers based on their preferences and individual strategies. On each turn, each player "cooperates" (plays nice) or "defects" (acts selfishly). In the short term, you win the most points by defecting, but that tends to make your partner start defecting too, so everyone is better off cooperating in the long run. The AI agents couldn't feel guilt as richly as humans do; instead, they experienced it as a self-imposed penalty that nudges them back toward cooperation after selfish behavior.
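The dynamic described above can be sketched in a few lines. This is a minimal illustration of the iterated prisoner's dilemma, not the researchers' actual model; the payoff values (temptation 5, mutual cooperation 3, mutual defection 1, sucker 0) are the conventional textbook choices, and the strategy names are illustrative.

```python
# Iterated prisoner's dilemma: conventional payoffs (not the paper's exact settings).
# (my move, partner's move) -> (my points, partner's points)
PAYOFFS = {
    ("C", "C"): (3, 3),  # both cooperate: solid reward for each
    ("C", "D"): (0, 5),  # I'm the sucker, partner gets the temptation payoff
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),  # mutual defection: everyone does poorly
}

def play_rounds(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds; each strategy sees only the opponent's last move."""
    score_a = score_b = 0
    last_a = last_b = None
    for _ in range(rounds):
        move_a = strategy_a(last_b)
        move_b = strategy_b(last_a)
        pa, pb = PAYOFFS[(move_a, move_b)]
        score_a += pa
        score_b += pb
        last_a, last_b = move_a, move_b
    return score_a, score_b

always_defect = lambda last: "D"
always_cooperate = lambda last: "C"

# Over 10 rounds, two cooperators earn 30 points each,
# while two defectors earn only 10 each -- defection wins a
# single exchange but loses the long game.
print(play_rounds(always_cooperate, always_cooperate))  # (30, 30)
print(play_rounds(always_defect, always_defect))        # (10, 10)
```

The one-shot temptation (5 beats 3) versus the long-run outcome (30 beats 10) is exactly the tension the guilt mechanism is meant to resolve.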
Key Findings
The researchers ran several simulations with different settings and social network structures. In each, the 900 players were each assigned one of six strategies defining their tendency to defect and to feel and respond to guilt. In one strategy, nicknamed DGCS for technical reasons, the agent felt guilt after defecting, meaning that it gave up points until it cooperated again. Critically, the AI agent felt guilt (lost points) only if it received information that its partner was also paying a guilt price after defecting. This prevented the agent from being a patsy, thus enforcing cooperation in others. (In the real world, seeing guilt in others can be tricky, but costly apologies are a good sign.)
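The conditional guilt of the DGCS strategy can be sketched as follows. This is a hypothetical simplification, not the paper's implementation: the class name, the `GUILT_COST` value, and the `settle` interface are all assumptions made for illustration. The key idea it captures is that an agent pays a self-imposed penalty after defecting only when its partner is also seen paying one, and returning to cooperation clears the guilt.

```python
import random

GUILT_COST = 1.0  # points given up while guilty (illustrative value, not from the paper)

class GuiltAgent:
    """Hypothetical sketch of a DGCS-style strategy: sometimes defect, but
    feel guilt afterwards -- only if the partner also signals guilt."""

    def __init__(self, defect_prob=0.3):
        self.defect_prob = defect_prob
        self.guilty = False
        self.score = 0.0

    def choose(self):
        # A guilty agent returns to cooperation, which clears its guilt.
        if self.guilty:
            self.guilty = False
            return "C"
        return "D" if random.random() < self.defect_prob else "C"

    def settle(self, my_move, payoff, partner_was_guilty):
        self.score += payoff
        # Guilt kicks in only when the partner is also paying a guilt price;
        # otherwise the self-penalty would just make this agent a patsy.
        if my_move == "D" and partner_was_guilty:
            self.guilty = True
            self.score -= GUILT_COST

# A guaranteed defector that sees guilt in its partner
# pays the cost and cooperates on its next move.
agent = GuiltAgent(defect_prob=1.0)
move = agent.choose()                               # "D"
agent.settle(move, payoff=5, partner_was_guilty=True)
print(agent.guilty, agent.score)                    # True 4.0
print(agent.choose())                               # "C"
```

The conditional check in `settle` is what, in the simulations, prevents guilt-prone agents from being exploited while still spreading cooperation through the population.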
About the Author

Alex Thompson
AI Technology Editor
Senior technology editor with eight years of experience in AI and machine learning content. Former technical editor at AI Magazine, he now provides technical documentation and content strategy services for multiple AI companies, transforming complex AI concepts into accessible content.