Scientists fighting back on the “reproducibility crisis” hysteria
May 16, 2018
A new Op/Ed in PNAS takes up the reproducibility crisis.
A. D. Redish, E. Kummerfeld, R. L. Morris, A. Love (2018) “Opinion: Reproducibility failures are essential to scientific inquiry” PNAS 115(20):5042-5046. [Journal Site]
Takeaway quote from the Abstract:
Most importantly, these proposed policy changes ignore a core feature of the process of scientific inquiry that occurs after reproducibility failures: the integration of conflicting observations and ideas into a coherent theory.
As you will see, they had me at:
In many of these cases, what have been called “failures to replicate” are actually failures to generalize across what researchers hoped were inconsequential changes in background assumptions or experimental conditions.
(Oh, wait, they cited me! AHAHAA, of course I like this thing!)
Seriously though, this is good stuff. Go read. Bookmark to forward to anyone who starts in on how there is a reproducibility “crisis”.
May 16, 2018 at 4:07 pm
Holy Citation of Anonymous Blogger, Batman.
May 17, 2018 at 12:01 pm
Within the spectrum of what people call “irreproducibility” there are some definite examples of shit-shows that need to be cleaned up. The reliability of antibodies used in western blotting (esp. those from Santa Cruz) would fall inside this category. So would unique cell lines turning out to be HeLa cells, or the use of small Ns in basic science studies.
But taking this approach of naming the low-hanging rotten fruit creates a problem: everyone who doesn’t use westerns or cell culture or small Ns is magically deemed off the hook for reproducibility (until such time as what they’re doing instead gets called into question). I’ve even seen rigor and reproducibility (R&R) sections in NIH proposals written as “we don’t do the bad things so, we cool, right?”
The stuff that keeps me up at night is the super-obscure big-data stuff, where the methodology is so complex as to be understood only by a select few, and the rest of us plebs are supposed to go “oooh” at the pretty pictures while not really having a clue how those images were arrived at. Hiding behind complex algorithms is a good way to assure your reproducibility is never questioned. If some cool new insight is so unique that it can only ever be reproduced by someone who can reverse engineer your million lines of code, then the R&R problem instantly disappears up its own asshole!
May 17, 2018 at 4:30 pm
The trouble is that studies reporting reproducibility failures struggle to get the same limelight as the original studies, which then continue to be considered the definitive article on the subject.
I’d also have a problem with being unable to replicate what ought to be a straightforward animal study (e.g., on the ability of a particular antibiotic to cure infection with bug X). If the original authors reported 100% survival and upon trying to replicate it several times my survival was 0%, then I’d say this did represent a true failure to replicate.
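As a back-of-the-envelope check (my numbers, not from any study under discussion): suppose ten animals per arm. A 100%-versus-0% survival split is essentially impossible to get from sampling noise alone, so calling it a true failure to replicate is statistically defensible.

```python
# Hypothetical counts (10 animals per group), chosen only to illustrate
# the 100%-vs-0% scenario above; not data from any cited study.
from scipy.stats import fisher_exact

table = [[10, 0],   # original report: 10 survived, 0 died
         [0, 10]]   # replication attempt: 0 survived, 10 died
odds_ratio, p_value = fisher_exact(table)
print(f"p = {p_value:.1e}")  # ~1e-05: chance alone can't explain the gap
```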
If scientific findings can only be replicated in the same lab using exactly the same conditions and the same people, then is this even science? Consider, for instance, the studies showing that results are dependent on the mouse source or the animal facility environment. To me, robust findings should be negligibly affected by such changes.
May 17, 2018 at 4:44 pm
Interesting article, definitely worth reading. But I’m not at all convinced that their examples apply to the fields in which the “reproducibility crisis” idea is taken seriously – i.e., biomed and psychology.
The vast majority of examples in the article are in mathematics – where even if something is wrong, it is reproducible, because (pretty much) all the steps are written out. “Wrong but useful” is way more likely there. At the other end of the spectrum: if you measure something which a priori is likely to be small (e.g., measuring how much showing someone patriotic words affects their voting six months later), then find a big result via bad analysis/p-hacking, and then people fail to reproduce the result, we haven’t learned anything. (And it’s not just a generalization failure either – see, e.g., https://thehardestscience.com/2015/09/02/moderator-interpretations-of-the-reproducibility-project/ on “unknown moderators”/generalization.) I think it is now accepted that a certain amount of fairly prominent social psych was “noise mining.”
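To put a number on “noise mining”, here is a minimal simulation (my own sketch; the five-outcomes-per-study setup and all parameters are made up): with a true effect of exactly zero, measuring several outcomes per study and highlighting whichever clears p < 0.05 inflates the false-positive rate from the nominal 5% to over 20%.

```python
# Toy simulation of "noise mining": the true effect is zero everywhere,
# but each study measures 5 outcomes and reports the best p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n_outcomes, n = 10_000, 5, 20

false_pos = 0
for _ in range(n_studies):
    # five independent outcome measures, all pure noise
    pvals = [stats.ttest_ind(rng.normal(size=n), rng.normal(size=n)).pvalue
             for _ in range(n_outcomes)]
    false_pos += min(pvals) < 0.05

print(f"false-positive rate: {false_pos / n_studies:.2f}")  # ~0.23, not 0.05
```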
The live question, in my mind, is whether biomedical science is more like mathematics or like social psych. Certainly in, e.g., cell biology, there are prominent results with small N and giant variability (and hidden nightmares like HeLa). Also, when I read blog posts like this: http://rajlaboratory.blogspot.com/2016/06/reproducibility-reputation-and-playing.html – “anyone can easily leave out a key control that doesn’t go their way under the guise of ‘the cells looked funny that day’, but then again, there are days where the cells really do look funny” – it does worry me a little. But my assumption has been that biomed eventually figures out reproducibility and generalizability, because you have enough steps in a procedure that you can’t p-hack every one. I’ll be curious how the extensive reproducibility projects go.
May 17, 2018 at 4:51 pm
Yep. And there is a big problem: the onus is now on you to figure out why you got a negative (basically by slugging away until you get their positive) before you can publish. We need to change this approach.
May 17, 2018 at 5:05 pm
@Ola – “Hiding behind complex algorithms is a good way to assure your reproducibility is never questioned.” Yep. Though also a way to ensure that many people just eye-roll the paper and don’t care. Ridiculous that we don’t mandate code deposition for papers like this, honestly. (I don’t always do open code myself, but compared to the time-and-effort cost of really reproducible methods sections in experimental science, it seems like an obvious step to take.)
May 17, 2018 at 5:21 pm
DM — is it really that hard for biologists to publish negative results, or is it a cultural issue around what people think they should do? Journals like PLoS ONE or Scientific Reports supposedly support publication of negative results…
May 17, 2018 at 6:13 pm
Yes it is, and it is a cultural issue* of course. IME. And PLoS ONE does not work as intended. Again, IME.
*if by cultural issue you mean nobody even tries to submit such work, the answer to that is no. It’s the reviewers and editors at what are decidedly run-of-the-mill workhorse journals that are putting up static.
May 18, 2018 at 8:35 am
The idea that single studies or papers substantively move Science forward is largely (though obvs not entirely) wrong. For basic science results (where people’s lives in a clinical trial don’t depend on the value of a single t-test), it really doesn’t bother me if individual studies can’t be replicated; they will just fail to contribute to our larger gestalt of what is reality.
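A toy sketch of that gestalt idea (mine, with made-up effect sizes and study counts): pool many small, noisy studies of the same real effect and the aggregate lands near the truth, even though a fair fraction of the individual studies get the sign wrong and would be labeled replication failures.

```python
# Hypothetical setup: 50 small studies (n = 15 per arm) of a modest true
# effect (0.3 SD). Individually many are "wrong"; pooled with
# inverse-variance (fixed-effect) weights, the estimate sits near truth.
import numpy as np

rng = np.random.default_rng(1)
true_effect, n_studies, n_per_arm = 0.3, 50, 15

estimates, variances = [], []
for _ in range(n_studies):
    treat = rng.normal(true_effect, 1.0, n_per_arm)
    ctrl = rng.normal(0.0, 1.0, n_per_arm)
    estimates.append(treat.mean() - ctrl.mean())
    variances.append(treat.var(ddof=1) / n_per_arm + ctrl.var(ddof=1) / n_per_arm)

w = 1.0 / np.array(variances)
pooled = np.sum(w * np.array(estimates)) / w.sum()
print(f"single studies with the wrong sign: {sum(e < 0 for e in estimates)}/{n_studies}")
print(f"pooled estimate: {pooled:.2f} (truth = {true_effect})")
```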