10 August 2019
Slides at: https://rebrand.ly/Nagoya2019-Part1
Fraud = scientific misconduct.
Today we don’t talk about fraud explicitly.
We talk about something much harder to identify and eradicate:
Questionable research practices (QRPs).
Term coined by John, Loewenstein, & Prelec (2012).
See also Simmons, Nelson, & Simonsohn (2011).
(John et al., 2012; Schimmack, 2015).
From Bem (2004):
“(…) [L]et us (…) become intimately familiar with (…) the data. Examine them from every angle. Analyze the sexes separately. Make up new composite indices. If a datum suggests a new hypothesis, try to find further evidence for it elsewhere in the data. If you see dim traces of interesting patterns, try to reorganize the data to bring them into bolder relief. If there are participants you don’t like, or trials, observers, or interviewers who gave you anomalous results, drop them (temporarily). Go on a fishing expedition for something– anything– interesting.”
This is not OK unless the exploration is explicitly stated.
Daryl Bem is the author of the famous 2011 precognition paper
(data used in Part 2 of today’s workshop).
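A minimal sketch of why an unreported fishing expedition is a problem. It assumes the analyses are independent two-group comparisons on pure noise, which is a simplification (analyses of one data set are usually correlated). The function names, group size, and number of simulations are illustrative choices, not taken from the workshop materials.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()

def familywise_error(k, alpha=0.05):
    """Chance of at least one p < alpha among k independent tests of true nulls."""
    return 1 - (1 - alpha) ** k

def two_sided_p(group_a, group_b):
    """z-test p-value for a mean difference (assumes equal n and known unit variance)."""
    n = len(group_a)
    z = (mean(group_a) - mean(group_b)) / (2 / n) ** 0.5
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))

def simulate_fishing(k, n=20, n_sims=2000, alpha=0.05, seed=1):
    """Share of null 'studies' in which at least one of k analyses is significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        for _ in range(k):
            # Both groups come from the same distribution: there is no true effect.
            a = [rng.gauss(0, 1) for _ in range(n)]
            b = [rng.gauss(0, 1) for _ in range(n)]
            if two_sided_p(a, b) < alpha:
                hits += 1
                break  # the researcher stops fishing at the first "finding"
    return hits / n_sims

# Number of analyses tried, analytic rate, simulated rate.
for k in (1, 5, 10):
    print(k, round(familywise_error(k), 3), simulate_fishing(k))
```

With a single planned analysis the false positive rate stays near the nominal 5%. It grows quickly as more analyses are tried and only the "interesting" one is reported.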
Prof. Brian Wansink at Cornell University.
His description of the efforts of a visiting Ph.D. student:
I gave her a data set of a self-funded, failed study which had null results (…). I said, “This cost us a lot of time and our own money to collect. There’s got to be something here we can salvage because it’s a cool (rich & unique) data set.” I had three ideas for potential Plan B, C, & D directions (since Plan A had failed). I told her what the analyses should be and what the tables should look like. I then asked her if she wanted to do them.
Every day she came back with puzzling new results, and every day we would scratch our heads, ask “Why,” and come up with another way to reanalyze the data with yet another set of plausible hypotheses. Eventually we started discovering solutions that held up regardless of how we pressure-tested them. I outlined the first paper, and she wrote it up (…). This happened with a second paper, and then a third paper (which was one that was based on her own discovery while digging through the data).
This isn’t being creative or thinking outside the box.
This is QRPing.
Interestingly, scientific misconduct has long been a concern (see Babbage, 1830).
And for the sake of balance:
There are also some voices against this description of the current state of affairs (e.g., Fiedler & Schwarz, 2016).
Well, maybe not (yet).
Here’s an interesting preprint (from July 2019!) from a Japanese research group (Kyushu University):
Ikeda, A., Xu, H., Fuji, N., Zhu, S., & Yamada, Y. (2019). Questionable research practices following pre-registration [Preprint]. doi: 10.31234/osf.io/b8pw9
It is strongly related to incentives (Nosek, Spies, & Motyl, 2012; Schönbrodt, 2015).
But, very importantly, it also happens in spite of the researcher’s best intentions.
Munafò et al. (2017)
Until very recently (Makel, Plucker, & Hegarty, 2012).
How poorly we build theory (see Gelman):
“It is not unusual that (…) this ad hoc challenging of auxiliary hypotheses is repeated in the course of a series of related experiments, in which the auxiliary hypothesis involved in Experiment 1 (…) becomes the focus of interest in Experiment 2, which in turn utilizes further plausible but easily challenged auxiliary hypotheses, and so forth. In this fashion a zealous and clever investigator can slowly wend his way through (…) a long series of related experiments (…) without ever once refuting or corroborating so much as a single strand of the network.”
“(…) It was found that the average power (probability of rejecting false null hypotheses) over the 70 research studies was .18 for small effects, .48 for medium effects, and .83 for large effects. These values are deemed to be far too small.”
“(…) it is recommended that investigators use larger sample sizes than they customarily do.”
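The link between sample size and power in the quotes above can be sketched numerically. This is a rough illustration, assuming a two-sided, two-group comparison with a normal approximation and the conventional small, medium, and large standardized effects (d = 0.2, 0.5, 0.8). The group size of 30 and the function names are hypothetical choices for illustration. This is not a reconstruction of the analysis behind the quoted figures.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group test for standardized effect d."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * (n_per_group / 2) ** 0.5  # expected value of the test statistic
    # Probability of landing in either rejection region.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)

def n_for_power(d, target=0.80, alpha=0.05):
    """Smallest per-group n whose approximate power reaches the target."""
    n = 2
    while power_two_sample(d, n, alpha) < target:
        n += 1
    return n

# Effect size label, power with 30 per group, per-group n needed for 80% power.
for label, d in (("small", 0.2), ("medium", 0.5), ("large", 0.8)):
    print(label, round(power_two_sample(d, 30), 2), n_for_power(d))
```

Under these assumptions, a modest group size gives good power only for large effects, and detecting small effects requires far larger samples than are customary.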