References

Alter, A. L., Oppenheimer, D. M., Epley, N., and Eyre, R. N. (2007). Overcoming intuition: Metacognitive difficulty activates analytic reasoning. Journal of Experimental Psychology: General, 136(4), 569–576. https://doi.org/10.1037/0096-3445.136.4.569
Anonymous. (2021). [98] Evidence of fraud in an influential field experiment about dishonesty. Data Colada. http://datacolada.org/98
Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100(3), 407–425. https://doi.org/10.1037/a0021524
Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A., Wagenmakers, E.-J., Berk, R., Bollen, K. A., Brembs, B., Brown, L., Camerer, C., Cesarini, D., Chambers, C. D., Clyde, M., Cook, T. D., De Boeck, P., Dienes, Z., Dreber, A., Easwaran, K., Efferson, C., … Johnson, V. E. (2018). Redefine statistical significance. Nature Human Behaviour, 2(1), 6–10. https://doi.org/10.1038/s41562-017-0189-z
Bertrand, M., and Mullainathan, S. (2004). Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review, 94(4), 991–1013. https://doi.org/10.1257/0002828042002561
Brownback, A., and Novotny, A. (2018). Social desirability bias and polling errors in the 2016 presidential election. Journal of Behavioral and Experimental Economics, 74, 38–56. https://doi.org/10.1016/j.socec.2018.03.001
Camerer, C. F., Dreber, A., Forsell, E., Ho, T.-H., Huber, J., Johannesson, M., Kirchler, M., Almenberg, J., Altmejd, A., Chan, T., Heikensten, E., Holzmeister, F., Imai, T., Isaksson, S., Nave, G., Pfeiffer, T., Razen, M., and Wu, H. (2016). Evaluating replicability of laboratory experiments in economics. Science, 351(6280), 1433–1436. https://doi.org/10.1126/science.aaf0918
Costa, D. L., and Kahn, M. E. (2013). Energy conservation ‘nudges’ and environmentalist ideology: Evidence from a randomized residential electricity field experiment. Journal of the European Economic Association, 11(3), 680–702. https://doi.org/10.1111/jeea.12011
Cressey, D. (2017). Tool for detecting publication bias goes under spotlight. Nature. https://doi.org/10.1038/nature.2017.21728
Deaton, A., and Cartwright, N. (2018). Understanding and misunderstanding randomized controlled trials. Social Science & Medicine, 210, 2–21. https://doi.org/10.1016/j.socscimed.2017.12.005
DellaVigna, S., Kim, W., and Linos, E. (2024). Bottlenecks for evidence adoption. Journal of Political Economy, 132(8), 2748–2789. https://doi.org/10.1086/729447
DellaVigna, S., and Linos, E. (2022). RCTs to Scale: Comprehensive Evidence From Two Nudge Units. Econometrica, 90(1), 81–116. https://doi.org/10.3982/ECTA18709
Dietvorst, B. J., Simmons, J. P., and Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114–126. https://doi.org/10.1037/xge0000033
Funnell, S. C., and Rogers, P. J. (2011). The essence of program theory. In Purposeful program theory: Effective use of theories of change and logic models. Jossey-Bass. https://www.wiley.com/en-au/Purposeful+Program+Theory%3A+Effective+Use+of+Theories+of+Change+and+Logic+Models-p-9780470478578
Gelman, A., and Carlin, J. (2014). Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspectives on Psychological Science, 9(6), 641–651. https://doi.org/10.1177/1745691614551642
Gelman, A., Hill, J., and Vehtari, A. (2020). Regression and Other Stories. Cambridge University Press. https://doi.org/10.1017/9781139161879
Gelman, A., and Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no fishing expedition or p-hacking and the research hypothesis was posited ahead of time. https://sites.stat.columbia.edu/gelman/research/unpublished/p_hacking.pdf
Glennerster, R., and Takavarasha, K. (2013a). Analysis. In Running randomized evaluations: A practical guide (pp. 324–385). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.12
Glennerster, R., and Takavarasha, K. (2013b). Asking the right questions. In Running randomized evaluations: A practical guide (pp. 66–97). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.7
Glennerster, R., and Takavarasha, K. (2013c). Outcomes and instruments. In Running randomized evaluations: A practical guide (pp. 180–240). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.9
Glennerster, R., and Takavarasha, K. (2013d). Randomizing. In Running randomized evaluations: A practical guide (pp. 98–179). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.8
Glennerster, R., and Takavarasha, K. (2013e). The experimental approach. In Running randomized evaluations: A practical guide (pp. 1–23). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.5
Glennerster, R., and Takavarasha, K. (2013f). Threats. In Running randomized evaluations: A practical guide (pp. 298–323). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.11
Glennerster, R., and Takavarasha, K. (2013g). Why randomize? In Running randomized evaluations: A practical guide (pp. 24–65). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.6
Harrison, G. W., and List, J. A. (2004). Field Experiments. Journal of Economic Literature, 42(4), 1009–1055. https://doi.org/10.1257/0022051043004577
Haynes, L., Service, O., Goldacre, B., and Torgerson, D. (2012). Test, Learn, Adapt: Developing Public Policy with Randomised Controlled Trials. https://www.gov.uk/government/publications/test-learn-adapt-developing-public-policy-with-randomised-controlled-trials
Henrich, J., Boyd, R., Bowles, S., Camerer, C., Fehr, E., Gintis, H., and McElreath, R. (2001). In Search of Homo Economicus: Behavioral Experiments in 15 Small-Scale Societies. American Economic Review, 91(2), 73–78. https://doi.org/10.1257/aer.91.2.73
Iyengar, S. S., and Lepper, M. R. (2000). When choice is demotivating: Can one desire too much of a good thing? Journal of Personality and Social Psychology, 79(6), 995–1006. https://doi.org/10.1037/0022-3514.79.6.995
Johnson, H., and Wang-Ly, N. (2020). Using tax refunds for debt repayment. Financial Health Network. https://finhealthnetwork.org/research/financial-health-solutions-using-tax-refunds-for-debt-repayment/
Jung, M., and Seiter, M. (2021). Towards a better understanding on mitigating algorithm aversion in forecasting: An experimental study. Journal of Management Control, 32(4), 495–516. https://doi.org/10.1007/s00187-021-00326-3
Kahneman, D., and Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–291. https://doi.org/10.2307/1914185
Kenneally, E., and Dittrich, D. (2012). The Menlo Report: Ethical Principles Guiding Information and Communication Technology Research.
Kirgios, E. L., Mandel, G. H., Park, Y., Milkman, K. L., Gromet, D. M., Kay, J. S., and Duckworth, A. L. (2020). Teaching temptation bundling to boost exercise: A field experiment. Organizational Behavior and Human Decision Processes, 161, 20–35. https://doi.org/10.1016/j.obhdp.2020.09.003
Klein, R. A., Vianello, M., Hasselman, F., Adams, B. G., Adams, R. B., Alper, S., Aveyard, M., Axt, J. R., Babalola, M. T., Bahník, Š., Batra, R., Berkics, M., Bernstein, M. J., Berry, D. R., Bialobrzeska, O., Binan, E. D., Bocian, K., Brandt, M. J., Busching, R., … Nosek, B. A. (2018). Many Labs 2: Investigating Variation in Replicability Across Samples and Settings. Advances in Methods and Practices in Psychological Science, 1(4), 443–490. https://doi.org/10.1177/2515245918810225
Kristal, A. S., Whillans, A. V., Bazerman, M. H., Gino, F., Shu, L. L., Mazar, N., and Ariely, D. (2020). Signing at the beginning versus at the end does not decrease dishonesty. Proceedings of the National Academy of Sciences, 117(13), 7103–7107. https://doi.org/10.1073/pnas.1911695117
Lee, S. M. (2018). Sliced And Diced: The Inside Story Of How An Ivy League Food Scientist Turned Shoddy Data Into Viral Studies. BuzzFeed News. https://www.buzzfeednews.com/article/stephaniemlee/brian-wansink-cornell-p-hacking
List, J. A. (2011). Why Economists Should Conduct Field Experiments and 14 Tips for Pulling One Off. Journal of Economic Perspectives, 25(3), 3–16. https://doi.org/10.1257/jep.25.3.3
List, J. A. (2022a). The voltage effect: How to make good ideas great and great ideas scale. Currency.
List, J. A. (2022b). The five vital signs of a scalable idea and how to avoid a voltage drop. Behavioral Scientist. https://behavioralscientist.org/the-five-vital-signs-of-a-scalable-idea-and-how-to-avoid-a-voltage-drop/
List, J. A., Sadoff, S., and Wagner, M. (2011). So you want to run an experiment, now what? Some simple rules of thumb for optimal experimental design. Experimental Economics, 14(4), 439–457. https://doi.org/10.1007/s10683-011-9275-7
List, J. A., and Al-Ubaydli, O. (2014). The generalisability of experimental results in economics. CEPR. https://cepr.org/voxeu/columns/generalisability-experimental-results-economics
Manzi, J. (2012). Uncontrolled. Basic Books. https://www.hachettebookgroup.com/titles/jim-manzi/uncontrolled/9780465029310/?lens=basic-books
Mazar, N., Amir, O., and Ariely, D. (2008). The Dishonesty of Honest People: A Theory of Self-Concept Maintenance. Journal of Marketing Research, 45(6), 633–644. https://doi.org/10.1509/jmkr.45.6.633
Meyer, A., Frederick, S., Burnham, T. C., Guevara Pinto, J. D., Boyer, T. W., Ball, L. J., Pennycook, G., Ackerman, R., Thompson, V. A., and Schuldt, J. P. (2015). Disfluent fonts don’t help people solve math problems. Journal of Experimental Psychology: General, 144(2), e16–e30. https://doi.org/10.1037/xge0000049
Meyer, M. N., Heck, P. R., Holtzman, G. S., Anderson, S. M., Cai, W., Watts, D. J., and Chabris, C. F. (2019). Objecting to experiments that compare two unobjectionable policies or treatments. Proceedings of the National Academy of Sciences, 116(22), 10723–10728. https://doi.org/10.1073/pnas.1820701116
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716
Rogers, P. J., Petrosino, A., Huebner, T. A., and Hacsi, T. A. (2000). Program theory evaluation: Practice, promise, and problems. New Directions for Evaluation, 2000(87), 5–13. https://doi.org/10.1002/ev.1177
Roth, A. E. (1995). Introduction to experimental economics. In J. H. Kagel and A. E. Roth (Eds.), The handbook of experimental economics. Princeton University Press. https://doi.org/10.2307/j.ctvzsmff5.5
Salganik, M. (2018). Bit by Bit. Princeton University Press. https://press.princeton.edu/books/paperback/9780691196107/bit-by-bit
Scheibehenne, B., Greifeneder, R., and Todd, P. M. (2010). Can there ever be too many options? A meta-analytic review of choice overload. Journal of Consumer Research, 37(3), 409–425. https://doi.org/10.1086/651235
Shu, L. L., Mazar, N., Gino, F., Ariely, D., and Bazerman, M. H. (2012). Signing at the beginning makes ethics salient and decreases dishonest self-reports in comparison to signing at the end. Proceedings of the National Academy of Sciences, 109(38), 15197–15200. https://doi.org/10.1073/pnas.1209746109
Silberzahn, R., Uhlmann, E. L., Martin, D. P., Anselmi, P., Aust, F., Awtrey, E., Bahník, Š., Bai, F., Bannard, C., Bonnier, E., Carlsson, R., Cheung, F., Christensen, G., Clay, R., Craig, M. A., Dalla Rosa, A., Dam, L., Evans, M. H., Flores Cervantes, I., … Nosek, B. A. (2018). Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results. Advances in Methods and Practices in Psychological Science, 1(3), 337–356. https://doi.org/10.1177/2515245917747646
Simmons, J. P., Nelson, L. D., and Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632
The National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. (1979). The Belmont Report. U.S. Department of Health, Education, & Welfare, USA. https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/read-the-belmont-report/index.html
Verschuere, B., Meijer, E. H., Jim, A., Hoogesteyn, K., Orthey, R., McCarthy, R. J., Skowronski, J. J., Acar, O. A., Aczel, B., Bakos, B. E., Barbosa, F., Baskin, E., Bègue, L., Ben-Shakhar, G., Birt, A. R., Blatz, L., Charman, S. D., Claesen, A., Clay, S. L., … Yıldız, E. (2018). Registered Replication Report on Mazar, Amir, and Ariely (2008). Advances in Methods and Practices in Psychological Science, 1(3), 299–317. https://doi.org/10.1177/2515245918781032
Vivalt, E. (2020). How much can we generalize from impact evaluations? Journal of the European Economic Association, 18(6), 3045–3089. https://doi.org/10.1093/jeea/jvaa019