References

Alter, A. L., Oppenheimer, D. M., Epley, N., and Eyre, R. N. (2007). Overcoming intuition: Metacognitive difficulty activates analytic reasoning. Journal of Experimental Psychology: General, 136(4), 569–576. https://doi.org/10.1037/0096-3445.136.4.569

Anonymous. (2021). [98] Evidence of fraud in an influential field experiment about dishonesty. Data Colada. http://datacolada.org/98

Bem, D. J. (2011). Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology, 100(3), 407–425. https://doi.org/10.1037/a0021524

Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A., Wagenmakers, E.-J., Berk, R., Bollen, K. A., Brembs, B., Brown, L., Camerer, C., Cesarini, D., Chambers, C. D., Clyde, M., Cook, T. D., De Boeck, P., Dienes, Z., Dreber, A., Easwaran, K., Efferson, C., … Johnson, V. E. (2018). Redefine statistical significance. Nature Human Behaviour, 2(1), 6–10. https://doi.org/10.1038/s41562-017-0189-z

Bertrand, M., and Mullainathan, S. (2004). Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review, 94(4), 991–1013. https://doi.org/10.1257/0002828042002561

Brownback, A., and Novotny, A. (2018). Social desirability bias and polling errors in the 2016 presidential election. Journal of Behavioral and Experimental Economics, 74, 38–56. https://doi.org/10.1016/j.socec.2018.03.001

Camerer, C. F., Dreber, A., Forsell, E., Ho, T.-H., Huber, J., Johannesson, M., Kirchler, M., Almenberg, J., Altmejd, A., Chan, T., Heikensten, E., Holzmeister, F., Imai, T., Isaksson, S., Nave, G., Pfeiffer, T., Razen, M., and Wu, H. (2016). Evaluating replicability of laboratory experiments in economics. Science, 351(6280), 1433–1436. https://doi.org/10.1126/science.aaf0918

Costa, D. L., and Kahn, M. E. (2013). Energy conservation ‘nudges’ and environmentalist ideology: Evidence from a randomized residential electricity field experiment. Journal of the European Economic Association, 11(3), 680–702. https://doi.org/10.1111/jeea.12011

Cressey, D. (2017). Tool for detecting publication bias goes under spotlight. Nature. https://doi.org/10.1038/nature.2017.21728

Deaton, A., and Cartwright, N. (2018). Understanding and misunderstanding randomized controlled trials. Social Science & Medicine, 210, 2–21. https://doi.org/10.1016/j.socscimed.2017.12.005

DellaVigna, S., Kim, W., and Linos, E. (2024). Bottlenecks for evidence adoption. Journal of Political Economy, 132(8), 2748–2789. https://doi.org/10.1086/729447

DellaVigna, S., and Linos, E. (2022). RCTs to Scale: Comprehensive Evidence From Two Nudge Units. Econometrica, 90(1), 81–116. https://doi.org/10.3982/ECTA18709

Dietvorst, B. J., Simmons, J. P., and Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114–126. https://doi.org/10.1037/xge0000033

Funnell, S. C., and Rogers, P. J. (2011). The essence of program theory. In Purposeful program theory: Effective use of theories of change and logic models. Jossey-Bass. https://www.wiley.com/en-au/Purposeful+Program+Theory%3A+Effective+Use+of+Theories+of+Change+and+Logic+Models-p-9780470478578

Gelman, A., and Carlin, J. (2014). Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspectives on Psychological Science, 9(6), 641–651. https://doi.org/10.1177/1745691614551642

Gelman, A., Hill, J., and Vehtari, A. (2020). Regression and Other Stories. Cambridge University Press. https://doi.org/10.1017/9781139161879

Gelman, A., and Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. https://sites.stat.columbia.edu/gelman/research/unpublished/p_hacking.pdf

Glennerster, R., and Takavarasha, K. (2013a). Analysis. In Running randomized evaluations: A practical guide (pp. 324–385). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.12

Glennerster, R., and Takavarasha, K. (2013b). Asking the right questions. In Running randomized evaluations: A practical guide (pp. 66–97). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.7

Glennerster, R., and Takavarasha, K. (2013c). Outcomes and instruments. In Running randomized evaluations: A practical guide (pp. 180–240). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.9

Glennerster, R., and Takavarasha, K. (2013d). Randomizing. In Running randomized evaluations: A practical guide (pp. 98–179). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.8

Glennerster, R., and Takavarasha, K. (2013e). The experimental approach. In Running randomized evaluations: A practical guide (pp. 1–23). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.5

Glennerster, R., and Takavarasha, K. (2013f). Threats. In Running randomized evaluations: A practical guide (pp. 298–323). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.11

Glennerster, R., and Takavarasha, K. (2013g). Why randomize? In Running randomized evaluations: A practical guide (pp. 24–65). Princeton University Press. https://www.jstor.org/stable/j.ctt4cgd52.6

Harrison, G. W., and List, J. A. (2004). Field Experiments. Journal of Economic Literature, 42(4), 1009–1055.

Haynes, L., Service, O., Goldacre, B., and Torgerson, D. (2012). Test, Learn, Adapt: Developing Public Policy with Randomised Controlled Trials. https://www.gov.uk/government/publications/test-learn-adapt-developing-public-policy-with-randomised-controlled-trials

Henrich, J., Boyd, R., Bowles, S., Camerer, C., Fehr, E., Gintis, H., and McElreath, R. (2001). In Search of Homo Economicus: Behavioral Experiments in 15 Small-Scale Societies. American Economic Review, 91(2), 73–78. https://doi.org/10.1257/aer.91.2.73

Iyengar, S. S., and Lepper, M. R. (2000). When choice is demotivating: Can one desire too much of a good thing? Journal of Personality and Social Psychology, 79(6), 995–1006. https://doi.org/10.1037/0022-3514.79.6.995

Johnson, H., and Wang-Ly, N. (2020). Using tax refunds for debt repayment. Financial Health Network. https://finhealthnetwork.org/research/financial-health-solutions-using-tax-refunds-for-debt-repayment/

Jung, M., and Seiter, M. (2021). Towards a better understanding on mitigating algorithm aversion in forecasting: An experimental study. Journal of Management Control, 32(4), 495–516. https://doi.org/10.1007/s00187-021-00326-3

Kahneman, D., and Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47(2), 263–291. https://doi.org/10.2307/1914185

Kenneally, E., and Dittrich, D. (2012). The Menlo Report: Ethical Principles Guiding Information and Communication Technology Research.

Kirgios, E. L., Mandel, G. H., Park, Y., Milkman, K. L., Gromet, D. M., Kay, J. S., and Duckworth, A. L. (2020). Teaching temptation bundling to boost exercise: A field experiment. Organizational Behavior and Human Decision Processes, 161, 20–35. https://doi.org/10.1016/j.obhdp.2020.09.003

Klein, R. A., Vianello, M., Hasselman, F., Adams, B. G., Adams, R. B., Alper, S., Aveyard, M., Axt, J. R., Babalola, M. T., Bahník, Š., Batra, R., Berkics, M., Bernstein, M. J., Berry, D. R., Bialobrzeska, O., Binan, E. D., Bocian, K., Brandt, M. J., Busching, R., … Nosek, B. A. (2018). Many Labs 2: Investigating Variation in Replicability Across Samples and Settings. Advances in Methods and Practices in Psychological Science, 1(4), 443–490. https://doi.org/10.1177/2515245918810225

Kristal, A. S., Whillans, A. V., Bazerman, M. H., Gino, F., Shu, L. L., Mazar, N., and Ariely, D. (2020). Signing at the beginning versus at the end does not decrease dishonesty. Proceedings of the National Academy of Sciences, 117(13), 7103–7107. https://doi.org/10.1073/pnas.1911695117

Lee, S. M. (2018). Sliced And Diced: The Inside Story Of How An Ivy League Food Scientist Turned Shoddy Data Into Viral Studies. BuzzFeed News. https://www.buzzfeednews.com/article/stephaniemlee/brian-wansink-cornell-p-hacking

List, J. A. (2011). Why Economists Should Conduct Field Experiments and 14 Tips for Pulling One Off. Journal of Economic Perspectives, 25(3), 3–16. https://doi.org/10.1257/jep.25.3.3

List, J. A. (2022a). The voltage effect: How to make good ideas great and great ideas scale. Currency.

List, J. A. (2022b). The five vital signs of a scalable idea and how to avoid a voltage drop. Behavioral Scientist. https://behavioralscientist.org/the-five-vital-signs-of-a-scalable-idea-and-how-to-avoid-a-voltage-drop/

List, J. A., Sadoff, S., and Wagner, M. (2011). So you want to run an experiment, now what? Some simple rules of thumb for optimal experimental design. Experimental Economics, 14(4), 439–457. https://doi.org/10.1007/s10683-011-9275-7

List, J., and Al-Ubaydli, O. (2014). The generalisability of experimental results in economics. CEPR. https://cepr.org/voxeu/columns/generalisability-experimental-results-economics

Manzi, J. (2012). Uncontrolled. Basic Books. https://www.hachettebookgroup.com/titles/jim-manzi/uncontrolled/9780465029310/?lens=basic-books

Mazar, N., Amir, O., and Ariely, D. (2008). The Dishonesty of Honest People: A Theory of Self-Concept Maintenance. Journal of Marketing Research, 45(6), 633–644. https://doi.org/10.1509/jmkr.45.6.633

Meyer, A., Frederick, S., Burnham, T. C., Guevara Pinto, J. D., Boyer, T. W., Ball, L. J., Pennycook, G., Ackerman, R., Thompson, V. A., and Schuldt, J. P. (2015). Disfluent fonts don’t help people solve math problems. Journal of Experimental Psychology: General, 144(2), e16–e30. https://doi.org/10.1037/xge0000049

Meyer, M. N., Heck, P. R., Holtzman, G. S., Anderson, S. M., Cai, W., Watts, D. J., and Chabris, C. F. (2019). Objecting to experiments that compare two unobjectionable policies or treatments. Proceedings of the National Academy of Sciences, 116(22), 10723–10728. https://doi.org/10.1073/pnas.1820701116

Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716

Rogers, P. J., Petrosino, A., Huebner, T. A., and Hacsi, T. A. (2000). Program theory evaluation: Practice, promise, and problems. New Directions for Evaluation, 2000(87), 5–13. https://doi.org/10.1002/ev.1177

Roth, A. E. (1995). Introduction to experimental economics. In J. H. Kagel and A. E. Roth (Eds.), The handbook of experimental economics. Princeton University Press. https://doi.org/10.2307/j.ctvzsmff5.5

Salganik, M. (2018). Bit by Bit. Princeton University Press. https://press.princeton.edu/books/paperback/9780691196107/bit-by-bit

Scheibehenne, B., Greifeneder, R., and Todd, P. M. (2010). Can there ever be too many options? A meta-analytic review of choice overload. Journal of Consumer Research, 37(3), 409–425. https://doi.org/10.1086/651235

Shu, L. L., Mazar, N., Gino, F., Ariely, D., and Bazerman, M. H. (2012). Signing at the beginning makes ethics salient and decreases dishonest self-reports in comparison to signing at the end. Proceedings of the National Academy of Sciences, 109(38), 15197–15200. https://doi.org/10.1073/pnas.1209746109

Silberzahn, R., Uhlmann, E. L., Martin, D. P., Anselmi, P., Aust, F., Awtrey, E., Bahník, Š., Bai, F., Bannard, C., Bonnier, E., Carlsson, R., Cheung, F., Christensen, G., Clay, R., Craig, M. A., Dalla Rosa, A., Dam, L., Evans, M. H., Flores Cervantes, I., … Nosek, B. A. (2018). Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results. Advances in Methods and Practices in Psychological Science, 1(3), 337–356. https://doi.org/10.1177/2515245917747646

Simmons, J. P., Nelson, L. D., and Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632

The National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. (1979). The Belmont Report. U.S. Department of Health, Education, & Welfare, USA. https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/read-the-belmont-report/index.html

Verschuere, B., Meijer, E. H., Jim, A., Hoogesteyn, K., Orthey, R., McCarthy, R. J., Skowronski, J. J., Acar, O. A., Aczel, B., Bakos, B. E., Barbosa, F., Baskin, E., Bègue, L., Ben-Shakhar, G., Birt, A. R., Blatz, L., Charman, S. D., Claesen, A., Clay, S. L., … Yıldız, E. (2018). Registered Replication Report on Mazar, Amir, and Ariely (2008). Advances in Methods and Practices in Psychological Science, 1(3), 299–317. https://doi.org/10.1177/2515245918781032

Vivalt, E. (2020). How much can we generalize from impact evaluations? Journal of the European Economic Association, 18(6), 3045–3089. https://doi.org/10.1093/jeea/jvaa019