Planet Money

Don't hate the replicator, hate the game

February 27, 2026

Key Takeaways

  • Economist Abel Brodeur created the 'Replication Games,' a crowdsourced surveillance system designed to combat the 'replication crisis' in social sciences by incentivizing researchers to ensure their findings are reproducible. 
  • The incentive structure in academia, which prioritizes statistically significant and novel results (often leading to p-hacking), is identified as the root cause of the replication crisis. 
  • The effectiveness of the Replication Games relies not on severe punishment for errors, but on increasing the perceived odds of enforcement, thereby changing researcher behavior and improving data cleanliness. 

Segments

Introduction to Replication Crisis
(00:00:20)
  • Key Takeaway: The ‘replication crisis’ refers to scientific studies that fail to yield the same results when the original analyses are re-run or the experiments are repeated.
  • Summary: Hosts Mary Childs and Alexi Horowitz-Ghazi introduce their mission to Montreal to meet Abel Brodeur. The core issue discussed is the replication crisis, where re-running established scientific research often fails to reproduce the original findings. This crisis impacts fields from psychology to economics.
Abel Brodeur and Replication Games
(00:00:57)
  • Key Takeaway: Abel Brodeur invented the ‘Replication Games,’ a hackathon-style event where teams attempt to reproduce recently published social science papers to check result validity.
  • Summary: Abel Brodeur, an economics professor, hosts the Replication Games to change norms through monitoring. Participants work all day to reproduce findings, aiming to keep social scientists honest. Brodeur believes that even a small chance of monitoring can significantly alter research behavior.
Origin of P-Hacking Concern
(00:05:26)
  • Key Takeaway: Brodeur’s personal experience in 2011 showed that academic incentives push researchers to contort their data analysis until they find a statistically significant result (conventionally, a p-value below 0.05, meaning less than a 5% chance of seeing such a result if there were no true effect).
  • Summary: While researching smoking bans in 2011, Brodeur initially found no effect, then tinkered with his data until he achieved a significant result, which he ultimately discarded as ‘dumb.’ The experience revealed that the pressure to publish significant findings encourages data manipulation like p-hacking. His subsequent research on published results confirmed a suspicious clustering of test statistics just past the 5% significance threshold.
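The mechanics behind that suspicious clustering can be sketched with a small simulation. This is a minimal illustration, not Brodeur's actual analysis: each hypothetical "paper" tries 20 specifications on pure noise and keeps the best p-value, so a large share clears the 5% bar even though no real effect exists. (Treating the specifications as fully independent draws overstates the problem somewhat; in real p-hacking the specifications share one dataset.)

```python
import random
import statistics
from math import erf, sqrt

def p_value(sample):
    """Two-sided p-value for H0: mean == 0, via a normal approximation."""
    n = len(sample)
    z = statistics.mean(sample) / (statistics.stdev(sample) / sqrt(n))
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))  # standard normal CDF at |z|
    return 2 * (1 - phi)

def p_hack(n_specs=20, n_obs=50):
    """Run n_specs 'specifications' on pure noise; report only the best one."""
    return min(
        p_value([random.gauss(0, 1) for _ in range(n_obs)])
        for _ in range(n_specs)
    )

random.seed(42)
# Simulate 200 hypothetical papers; count how many 'find' significance
# at the 5% level even though every dataset is pure noise.
hits = sum(p_hack() < 0.05 for _ in range(200))
print(f"{hits}/200 noise-only papers cleared the 5% threshold")
```

With a 5% false-positive rate per specification, roughly 1 − 0.95²⁰ ≈ 64% of these noise-only papers report a "significant" result, which is exactly the incentive problem the episode describes.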
Developing the Replication Institute
(00:11:13)
  • Key Takeaway: To scale the solution, Brodeur established the Institute for Replication, using a ‘Potemkin’ website and famous economists to legitimize requests for data and code.
  • Summary: Brodeur sought a large-scale solution to change research norms, recognizing that small-scale efforts would be ignored. He created the Institute for Replication to appear official, enabling him to solicit data and code from researchers who previously ignored his requests. This institutional facade was necessary to gain traction for his auditing project.
The First Replication Game in Oslo
(00:14:46)
  • Key Takeaway: The first Replication Game in Oslo unexpectedly drew 80 participants, leading to the discovery of a major data error in one paper involving duplicated entries.
  • Summary: An invitation for a small workshop in Oslo accidentally attracted 80 researchers, forcing Brodeur to organize the first Replication Game. During this event, one team discovered that a paper on inequality rested on improperly merged data, resulting in identical entries for many subjects. While initially alarming, most other papers passed the initial replication checks.
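The kind of merge error the Oslo team found is easy to detect once you look for it. A minimal sketch, using hypothetical records and column names (not the actual paper's data): after a merge on a key that was not unique in one source table, the same subject appears more than once, and a simple count exposes it.

```python
from collections import Counter

# Hypothetical post-merge records; subject 2 appears twice because the
# merge key was not unique in one of the source tables.
merged = [
    {"subject_id": 1, "income": 42000},
    {"subject_id": 2, "income": 38000},
    {"subject_id": 2, "income": 38000},  # duplicate created by the merge
    {"subject_id": 3, "income": 51000},
]

counts = Counter(row["subject_id"] for row in merged)
duplicates = {sid: n for sid, n in counts.items() if n > 1}
print("duplicated subject ids:", duplicates)  # {2: 2}
```

Checking that the number of rows matches the number of distinct keys after every merge is a cheap habit that would have caught this class of error before publication.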
Replication Phases and Team Work
(00:19:02)
  • Key Takeaway: Replication involves two phases: Phase One is an objective check that running the original code exactly reproduces the published results, while Phase Two involves subjective robustness checks on the authors’ analytical decisions.
  • Summary: The day-long hackathon requires teams to check code and data within seven hours. Phase One demands that replicators perfectly reproduce the published tables by running the original code. If successful, Phase Two involves robustness checks, analyzing the negative space around the paper’s conclusions to see if alternative, reasonable choices would alter the findings.
Findings from Montreal Teams
(00:26:49)
  • Key Takeaway: Teams uncovered issues including missing variables in a government trust paper and results collapsing when one cartel was excluded from a paper on Mexican cartels.
  • Summary: One team found that the government trust paper’s regression used variables that were not present in the provided raw data set. Another team discovered that removing a single cartel from the cartel-behavior paper caused its statistically significant results to disappear entirely. The authors of the cartel paper later argued that this robustness check misunderstood their focus on one specific cartel, Los Zetas.
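The cartel finding is an instance of a standard Phase Two robustness check: drop one group at a time and see whether the result survives. A minimal sketch with made-up numbers (the group labels and effect values are hypothetical, chosen so that one group dominates, as Los Zetas did in the paper):

```python
import statistics

# Hypothetical per-group effect estimates; group "D" stands in for a
# single dominant unit that drives the pooled result.
group_effects = {"A": 0.10, "B": -0.20, "C": 0.05, "D": 2.50}

pooled = statistics.mean(group_effects.values())
print(f"pooled effect: {pooled:+.3f}")

# Leave-one-out: recompute the effect with each group dropped in turn.
for left_out in group_effects:
    rest = [v for g, v in group_effects.items() if g != left_out]
    print(f"without {left_out}: {statistics.mean(rest):+.3f}")
```

If the estimate flips sign or loses significance only when one group is excluded, the honest reading is that the paper describes that group, not a general phenomenon, which is precisely the dispute between the replicators and the cartel paper's authors.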
Impact and Future of Replication
(00:34:33)
  • Key Takeaway: Brodeur views finding errors in published work as a failure of the existing peer review system, but the games are changing researcher behavior by increasing the perceived odds of being caught.
  • Summary: Brodeur considers it a failure that errors were not caught before publication by journal referees, noting the rate of failure is higher than commonly assumed. However, participants reported that the experience will change how they conduct future research. The goal is to shift norms by demonstrating that the probability of enforcement, not just the severity of punishment, drives compliance.