References

American Psychological Association. 2001. Publication Manual of the American Psychological Association. 5th Edition.

Baguley, Thom. 2004. “Understanding statistical power in the context of applied research.” Applied Ergonomics 35 (2): 73–80. doi:10.1016/j.apergo.2004.01.002.

———. 2009. “Standardized or simple effect size: what should be reported?” British Journal of Psychology 100 (3): 603–17. doi:10.1348/000712608X377117.

Bakeman, Roger. 2005. “Recommended Effect Size Statistics for Repeated Measures Designs.” Behavior Research Methods.

Cockburn, Andy, Karl Gutwin, and Alan Dix. 2018. “HARK No More: On the Preregistration of Chi Experiments.” ACM.

Cohen, Jacob. 1977. “The t Test for Means.” In Statistical Power Analysis for the Behavioral Sciences, Revised Ed, 19–74. Academic Press. doi:10.1016/B978-0-12-179060-8.50007-4.

———. 1988. Statistical Power Analysis for the Behavioral Sciences. Lawrence Earlbaum Associates.

———. 1994. “The Earth Is Round (P<.05).” American Psychologist 49 (12). American Psychological Association: 997. http://ist-socrates.berkeley.edu/~maccoun/PP279_Cohen1.pdf.

Cumming, Geoff. 2013. Understanding the New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis. Routledge.

———. 2014. “The New Statistics: Why and How.” Psychological Science 25 (1): 7–29. doi:10.1177/0956797613504966.

Cummings, Peter. 2011. “Arguments for and Against Standardized Mean Differences (Effect Sizes).” Archives of Pediatrics & Adolescent Medicine 165 (7): 592. doi:10.1001/archpediatrics.2011.97.

Dixon, Peter. 2003. “The P-Value Fallacy and How to Avoid It.” Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Experimentale 57 (3). Canadian Psychological Association: 189. https://www.ncbi.nlm.nih.gov/pubmed/14596477.

Dragicevic, Pierre. 2016. “Fair Statistical Communication in Hci.” In Modern Statistical Methods for Hci, 291–330. Springer. https://hal.inria.fr/hal-01377894/document.

Earp, Brian D, and David Trafimow. 2015. “Replication, Falsification, and the Crisis of Confidence in Social Psychology.” Frontiers in Psychology 6. Frontiers Media SA. https://www.frontiersin.org/articles/10.3389/fpsyg.2015.00621/full.

Ehrenberg, ASC. 1977. “Rudiments of Numeracy.” Journal of the Royal Statistical Society. Series A (General). JSTOR, 277–97. http://www1.maths.leeds.ac.uk/~sta6ajb/math1910/p4.pdf.

Fisher, Ronald. 1955. “Statistical Methods and Scientific Induction.” Journal of the Royal Statistical Society. Series B (Methodological). JSTOR, 69–78. http://www.ssnpstudents.com/wp/wp-content/uploads/2015/02/Fisher-1955.pdf.

Gelman, Andrew. 2017. “Ethics and Statistics: Honesty and Transparency Are Not Enough.” Chance 30 (1). Taylor & Francis: 37–39. http://www.stat.columbia.edu/~gelman/research/published/ChanceEthics14.pdf.

Gelman, Andrew, and Eric Loken. 2013. “The Garden of Forking Paths: Why Multiple Comparisons Can Be a Problem, Even When There Is No ‘Fishing Expedition’ or ‘P-Hacking’ and the Research Hypothesis Was Posited Ahead of Time.” Department of Statistics, Columbia University. http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf.

Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s Practice What We Preach: Turning Tables into Graphs.” The American Statistician 56 (2). Taylor & Francis: 121–30. https://pdfs.semanticscholar.org/202c/fec06a87fc96d3d56b6ad2ba4237b3fde141.pdf.

Gigerenzer, Gerd. 2004. “Mindless Statistics.” The Journal of Socio-Economics 33 (5). Elsevier: 587–606. http://pubman.mpdl.mpg.de/pubman/item/escidoc:2101336/component/escidoc:2101335/GG_Mindless_2004.pdf.

Gigerenzer, Gerd, and Julian N Marewski. 2015. “Surrogate Science: The Idol of a Universal Method for Scientific Inference.” Journal of Management 41 (2). Sage Publications Sage CA: Los Angeles, CA: 421–40. http://www.dcscience.net/Gigerenzer-Journal-of-Management-2015.pdf.

Giner-Sorolla, Roger. 2012. “Science or Art? How Aesthetic Standards Grease the Way Through the Publication Bottleneck but Undermine Science.” Perspectives on Psychological Science 7 (6). Sage Publications Sage CA: Los Angeles, CA: 562–71. http://journals.sagepub.com/doi/full/10.1177/1745691612457576.

Ioannidis, John PA. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine 2 (8). Public Library of Science: e124. http://robotics.cs.tamu.edu/RSS2015NegativeResults/pmed.0020124.pdf.

Kampenes, Vigdis By, Tore Dybå, Jo E. Hannay, and Dag I.K. Sjøberg. 2007. “A Systematic Review of Effect Size in Software Engineering Experiments.” Information and Software Technology 49 (11): 1073–86. doi:https://doi.org/10.1016/j.infsof.2007.02.015.

Kaptein, Maurits, and Judy Robertson. 2012. “Rethinking Statistical Analysis Methods for Chi.” In Proceedings of the Sigchi Conference on Human Factors in Computing Systems, 1105–14. ACM. http://judyrobertson.typepad.com/files/chi2012_submission_final.pdf.

Kastellec, Jonathan P, and Eduardo L Leoni. 2007. “Using Graphs Instead of Tables in Political Science.” Perspectives on Politics 5 (04). Cambridge Univ Press: 755–71.

Kay, Matthew, Gregory L Nelson, and Eric B Hekler. 2016. “Researcher-Centered Design of Statistics: Why Bayesian Statistics Better Fit the Culture and Incentives of Hci.” In Proceedings of the 2016 Chi Conference on Human Factors in Computing Systems, 4521–32. ACM. http://www.mjskay.com/papers/chi_2016_bayes.pdf.

Kerr, Norbert L. 1998. “HARKing: Hypothesizing After the Results Are Known.” Personality and Social Psychology Review 2 (3). Sage Publications Sage CA: Los Angeles, CA: 196–217. http://www.socialrelationslab.com/uploads/1/8/9/6/18966149/harkingkerr1998.pdf.

Kirby, Kris N, and Daniel Gerlanc. 2013. “BootES: An R Package for Bootstrap Confidence Intervals on Effect Sizes.” Behavior Research Methods 45 (4). Springer: 905–27. http://web.williams.edu/Psychology/Faculty/Kirby/bootes-kirby-gerlanc-in-press.pdf.

Kruschke, John K, and Torrin M Liddell. 2017. “The Bayesian New Statistics: Hypothesis Testing, Estimation, Meta-Analysis, and Power Analysis from a Bayesian Perspective.” Psychonomic Bulletin & Review. Springer, 1–29. https://osf.io/ksfyr/download?format=pdf.

Lenth, Russel V. 2001. “Some practical guidelines for effective sample size determination.” The American Statistician 55 (3): 187–93. doi:10.1198/000313001317098149.

Loftus, Geoffrey R. 1993. “A Picture Is Worth a Thousand P Values: On the Irrelevance of Hypothesis Testing in the Microcomputer Age.” Behavior Research Methods, Instruments, & Computers 25 (2). Springer: 250–56. https://faculty.washington.edu/gloftus/Research/Publications/Manuscript.pdf/Loftus%20p-values%201993.pdf.

Norman, Geoff. 2010. “Likert Scales, Levels of Measurement and the ‘Laws’ of Statistics.” Advances in Health Sciences Education 15 (5). Springer: 625–32. https://pdfs.semanticscholar.org/6dc0/0756ab722370b815df1223f4044dd63841a8.pdf.

Nosek, Brian A, Charles R Ebersole, Alexander DeHaven, and David Mellor. 2017. “The Preregistration Revolution.” Open Science Framework. https://osf.io/2dxu5/download?format=pdf.

Olejnik, Stephen, and James Algina. 2003. “Generalized Eta and Omega Squared Statistics: Measures of Effect Size for Some Common Research Designs.” Psychological Methods.

Rosenthal, Robert. 1991. Meta-Analytic Procedures for Social Research. Vol. 6. Sage.

Simmons, Joseph P, Leif D Nelson, and Uri Simonsohn. 2011. “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” Psychological Science 22 (11). Sage Publications Sage CA: Los Angeles, CA: 1359–66. http://opim.wharton.upenn.edu/DPlab/papers/publishedPapers/Simmons_2011_False-Positive%20Psychology.pdf.

“Statistical Dances: Why No Statistical Analysis Is Reliable and What to Do About It.” 2017. https://tinyurl.com/gricad-dance. https://tinyurl.com/gricad-dance.

Stewart-Oaten, Allan. 1995. “Rules and Judgments in Statistics: Three Examples.” Ecology 76 (6). Wiley Online Library: 2001–9. http://onlinelibrary.wiley.com/doi/10.2307/1940736/full.

Taylor, John. 1997. Introduction to Error Analysis, the Study of Uncertainties in Physical Measurements. University Science Books.

“Transparent Statistics Website.” 2017. http://transparentstatistics.org/.

Tukey, John W. 1977. “Exploratory Data Analysis.” Reading, Mass.

Wierdsma, A. 2013. “What Is Wrong with Tests of Normality?” http://tinyurl.com/normality-wrong. http://tinyurl.com/normality-wrong.

Wilkinson, Leland. 1999a. “Statistical Methods in Psychology Journals: Guidelines and Explanations.” American Psychologist 54 (8). American Psychological Association: 594.

———. 1999b. “Dot Plots.” The American Statistician, February. Taylor & Francis Group. http://amstat.tandfonline.com/doi/abs/10.1080/00031305.1999.10474474.

Wilson, Max L, Wendy Mackay, Ed Chi, Michael Bernstein, Dan Russell, and Harold Thimbleby. 2011. “RepliCHI-Chi Should Be Replicating and Validating Results More: Discuss.” In CHI’11 Extended Abstracts on Human Factors in Computing Systems, 463–66. ACM. https://hal.inria.fr/file/index/docid/1000423/filename/RepliCHI-panel-2011.pdf.