Using a public data set of Major League Baseball salaries and on-field statistics, this paper runs regressions for all possible combinations of the selected control variables to generate statistically significant but spurious results, a practice known as “p-hacking.” This overt, deliberate, and systematic p-hacking leads to many counterintuitive results that can help students think carefully about variable selection, causality, and parsimony. In addition, this paper provides an R script that students can easily modify to fit data sets of their choosing.
James Herndon
Herndon, J. (2023). P-Hacking Made Easy. Available online at Journal of Economics Teaching, DOI: 10.58311/jeconteach/d09c7366b1439d6cd0f16a520fedbde8ce64af60
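
The exhaustive search described in the abstract, one regression per combination of control variables, can be illustrated with a minimal sketch. This is not the paper's R script: it is a standard-library Python illustration run on synthetic noise rather than the baseball data. The names `ols`, `invert`, `run_all_specs`, `target`, and `controls` are hypothetical, and the p-values use a normal approximation to the t distribution, which is only reasonable for fairly large samples.

```python
import itertools
import math
import random


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(m)
    aug = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(m)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]


def ols(y, columns):
    """OLS of y on an intercept plus `columns`; returns (coefficients, std. errors)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[r][a] * X[r][b] for r in range(n)) for b in range(k)] for a in range(k)]
    xty = [sum(X[r][a] * y[r] for r in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[r] - sum(X[r][a] * beta[a] for a in range(k)) for r in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(sigma2 * inv[a][a]) for a in range(k)]
    return beta, se


def run_all_specs(y, target, controls):
    """Regress y on `target` plus every subset of `controls` (2**k regressions)."""
    results = []
    names = list(controls)
    for size in range(len(names) + 1):
        for subset in itertools.combinations(names, size):
            beta, se = ols(y, [target] + [controls[name] for name in subset])
            z = beta[1] / se[1]
            # Two-sided p-value under a normal approximation (assumption).
            p = math.erfc(abs(z) / math.sqrt(2))
            results.append((subset, beta[1], p))
    return results


# Synthetic data: the outcome is pure noise, unrelated to any regressor.
random.seed(0)
n = 200
y = [random.gauss(0, 1) for _ in range(n)]
target = [random.gauss(0, 1) for _ in range(n)]
controls = {f"c{j}": [random.gauss(0, 1) for _ in range(n)] for j in range(4)}

results = run_all_specs(y, target, controls)
significant = [r for r in results if r[2] < 0.05]
print(f"{len(significant)} of {len(results)} specifications have p < 0.05")
```

Because the number of specifications doubles with each added control, even pure noise is likely to yield some "significant" coefficients once the candidate list is long enough. Reporting only those specifications is the selective step that makes the practice p-hacking.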