A PRIISM Seminar by Oxford's Charles Rahal
Join PRIISM and Dr. Charles Rahal to learn how to use pseudo-random number generators (PRNGs) responsibly in research.
Abstract
The overwhelming best practice in 'Open', 'Reproducible', and 'Responsible' research is to hide the variation caused by pseudo-random number generators (PRNGs) through the arbitrary use of a 'seed' or 'random state' in algorithmic pipelines. In doing so, however, researchers almost universally present scientific results with seeming certainty when they are in fact anything but: eliminating this variation is the opposite of what responsible analysts should be doing. PRNGs are pervasive in some research areas, occurring in a large proportion of quantitative and computational research designs, and the potential variation in the estimand or outcome of interest has hitherto been significantly under-appreciated. We undertake a series of simulations and high-profile replications of prior work across descriptive, inferential, and predictive research designs, explicating just how large the variation caused by the instantiation of PRNGs can be, and how widely it affects scientific discovery. To highlight the scope of the issue, our empirical examples span applied fields such as sociology, economics, medicine, criminology, scientometrics, segregation, and education research. We highlight the enormous and unrecognized potential to generate unfair societal outcomes, with up to 53.9% of parole decisions being reversed in an otherwise standard application of the canonical ProPublica dataset on recidivism, and show how seed variation affects the fairness of healthcare and personal finance allocations. We conclude with recommendations on how to effectively eliminate the risk that seed variability might lead to false positives and negatives in applied empirical work.
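As a rough illustration of the kind of seed variation the abstract describes, the sketch below repeats one simple analysis under many seeds and reports the spread of the resulting estimates instead of a single number. It is not the speaker's method or code: the synthetic data, the bootstrap-style estimator, and the names `bootstrap_mean` and `seed_sweep` are all assumptions made for this example.

```python
import random
import statistics


def bootstrap_mean(data, seed, n_draws=200):
    """Mean of one resample (with replacement) of `data`, drawn with a seeded PRNG."""
    rng = random.Random(seed)  # local generator, so the global state is untouched
    sample = [rng.choice(data) for _ in range(n_draws)]
    return statistics.fmean(sample)


def seed_sweep(data, seeds):
    """Run the same analysis once per seed and collect the estimates."""
    return [bootstrap_mean(data, s) for s in seeds]


# Hypothetical data, generated once and then held fixed, so the only
# source of variation in the sweep below is the choice of seed.
data_rng = random.Random(0)
data = [data_rng.gauss(0.0, 1.0) for _ in range(500)]

estimates = seed_sweep(data, range(100))
summary = {
    "min": min(estimates),
    "median": statistics.median(estimates),
    "max": max(estimates),
    "sd": statistics.stdev(estimates),
}
print(summary)
```

Fixing a single seed would return one of these 100 estimates and make it exactly repeatable, but it would hide the range that the sweep makes visible.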
Bio
Dr. Charles Rahal is an associate professor in Data Science and Informatics at the University of Oxford, and a former British Academy Postdoctoral Fellow. His work focuses on the Open Science movement and includes maintaining interactive projects like the GWAS Diversity Monitor and RobustiPy. In addition to holding positions as a Co-Investigator/Principal Investigator on various research projects, he serves as an Associate Editor-In-Chief for the Journal of Social Computing, consults for the Banco de la República, and holds an honorary visiting professorship at Peking University (2025–2030). He also leads the Metrics and Models lab and recently co-founded a data science consultancy company, Patterns and Proofs.
