Reading the Coverage of a Retraction: Failure to replicate is not evidence of fraud
August 10, 2010
The Twitts are all atwitter today about a case of academic misconduct. As reported in the Boston Globe:
Harvard University psychologist Marc Hauser — a well-known scientist and author of the book “Moral Minds’’ — is taking a year-long leave after a lengthy internal investigation found evidence of scientific misconduct in his laboratory.
The findings have resulted in the retraction of an influential study that he led. “MH accepts responsibility for the error,’’ says the retraction of the study on whether monkeys learn rules, which was published in 2002 in the journal Cognition.
There is an ongoing investigation, and there are other allegations or admissions of scientific misconduct or fraud; see further observations at the Nature blog The Great Beyond and at NeuroSkeptic. We’ll simply have to see how that plays out. I have a few observations on the coverage so far, however. Let’s start with the minor ones.
The PubMed page and the ScienceDirect publisher’s page have no indication that this paper has been retracted. I did a quick search for retraction, for Hauser and for tamarin on the ScienceDirect site and did not find any evidence of a published retraction notice by that method either. The Boston Globe article is datelined today, but still. You would think that the publishers would have been informed of this situation loooong before it went public and would have had the retraction linkage ready to roll.
The accusation in the paper correction by Hauser is, as is traditional, that the trainee faked it. As NeuroSkeptic points out, the overall investigation spans papers published well beyond the trainee in question’s time in the lab. Situations like this start posing questions in my mind about the tone and tenor of the lab and how that might influence the actions of a trainee. Not saying misconduct can’t be the lone wolf actions of a single bad apple. I’m sure that happens a lot. But I am equally sure that it is possible for a PI to set a tone of, let us say, pressure to produce data that point in a certain direction.
What really bothered me about the Globe coverage was this, however. They associate a statement like this one:
In 2001, in a study in the American Journal of Primatology, Hauser and colleagues reported that they had failed to replicate the results of the previous study. The original paper has never been retracted or corrected.
with
Gordon G. Gallup Jr., a professor of psychology at State University of New York at Albany, questioned the results and requested videotapes that Hauser had made of the experiment.
“When I played the videotapes, there was not a thread of compelling evidence — scientific or otherwise — that any of the tamarins had learned to correctly decipher mirrored information about themselves,’’ Gallup said in an interview.
In 1997, he co-authored a critique of the original paper, and Hauser and a co-author responded with a defense of the work.
What I am worried about in this type of coverage is the conflation of three distinct things: a failure to replicate a study, an absence of supporting evidence (per the retraction blaming a trainee), and scientific debate over the interpretation of data.
The mere failure of one investigation to replicate a prior one is not in and of itself evidence of scientific misconduct. Legitimate scientific findings can be difficult or impossible to replicate for many reasons, and even if we criticize the credulity, scientific rigor or methods of the original finding, that is not misconduct. (Just so long as the authors report what they did and what they found in a manner consistent with the practices of their fields and the journals in which their data are published.) Even the much-vaunted p<0.05 standard means we accept that in about 5 out of every 100 experiments run when there is no real effect, we are going to mistake chance events for a causal chain resulting from our experimental manipulation.
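To make that 5-in-100 figure concrete, here is a minimal simulation (my own sketch, not from the original post; the z-test helper and all numbers are illustrative). Two groups are drawn from the very same distribution, so any "significant" difference is pure chance, yet roughly 5% of runs clear the p<0.05 bar:

```python
import math
import random

def two_sided_p(group_a, group_b):
    """Approximate two-sided p-value for a difference in means,
    using a z-test (a reasonable approximation at these sample sizes)."""
    na, nb = len(group_a), len(group_b)
    ma = sum(group_a) / na
    mb = sum(group_b) / nb
    va = sum((x - ma) ** 2 for x in group_a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in group_b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    # P(|Z| > |z|) for a standard normal Z
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(42)
n_experiments = 1000
false_positives = 0
for _ in range(n_experiments):
    # Both groups come from the SAME distribution: the null is true.
    a = [random.gauss(0, 1) for _ in range(50)]
    b = [random.gauss(0, 1) for _ in range(50)]
    if two_sided_p(a, b) < 0.05:
        false_positives += 1

print(false_positives / n_experiments)  # hovers around 0.05
```

The point: a literature built on p<0.05 will contain some chance findings even when everyone behaves honestly, so a later failure to replicate one of them implies nothing about misconduct.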
Similarly, debates over what behavioral researchers think they see in their observations of animal behavior are not in and of themselves evidence of misconduct. I mean, sure, if nobody other than the TruBelievers can ever see any smidge of evidence of the Emperor’s fine clothes in the videotapes proffered as evidence by a given lab, we might write them off as cranks. But this is, at this point, most obviously a debate about research design, controls, alternative hypotheses and potential confounds in the approach. Used in this context, the quote from Gordon Gallup taken in greater isolation (as in The Great Beyond blog entry) makes it sound as though he is a disinterested party brought in as part of the investigation of scientific fraud. In fact he appears to be a regular scientific critic of Hauser’s work. Gallup might be right, but I don’t like the way scientific debate is being conflated with scientific misconduct here.
Additional reading:
Harvard Magazine
Retraction Watch: including the text of the retraction to be published and a comment on Hauser serving as associate editor at the journal when his paper was handled.
Neuron Culture
John Hawks Weblog
melodye at Child’s Play
New York Times
New Scientist
__
Disclaimer: I come from a behaviorist tradition and am more than a little skeptical of the comparative cognition research tradition that Hauser inhabits.
August 10, 2010 at 2:02 pm
Anon for today because I kinda know the guy…
DM, you say “The accusation in the paper correction by Hauser is, as is traditional, that the trainee faked it. As NeuroSkeptic points out, the overall investigation spans papers published well beyond the trainee in question’s time in the lab. Situations like this start posing questions in my mind about the tone and tenor of the lab and how that might influence the actions of a trainee. Not saying misconduct can’t be the lone wolf actions of a single bad apple. I’m sure that happens a lot. But I am equally sure that it is possible for a PI to set a tone of, let us say, pressure to produce data that point in a certain direction. ”
Knowing this discipline and some of the trainees to come out of this lab, I would say that this is probably a good hypothesis to start from. I’m not claiming any insider knowledge and I have never worked directly with him, so take what I say with a big ole grain of salt. But even though I think Hauser is a pretty good guy, and I think this whole thing is really unfortunate, I am not that surprised. I also think that even the best of us really want to believe what we believe. I can’t remember where I read this, but I do remember a story coming out recently about how people often end up with stronger beliefs in the face of negative evidence. I think scientists (especially, frankly, those in ev psych) can fall victim to this kind of thinking sometimes too.
Oh, and I loved the mixed metaphor of the lone wolf and bad apple ;).
August 10, 2010 at 5:13 pm
Like that Alex bird dealio?
August 10, 2010 at 7:12 pm
Check this out — from the newly-published “replication” of the 2007 Proc Royal Soc B where the tapes went missing:
Received July 7, 2010.
Accepted July 8, 2010.
August 10, 2010 at 8:55 pm
The talk of the town in my part of the world. There had been some rumors that he cut corners a little, but nothing about outright fraud.
August 10, 2010 at 11:58 pm
All those involved are being cagey on this topic, but the investigation has been well-known in cognitive psych circles for a few years. The issue is not a failure to replicate. Rather, the work in question doesn’t replicate, but the accusation is that it doesn’t replicate because it was made up to begin with.
At least, those have been the rumors. Since no one involved is talking, all we have to go on are the rumors.
August 11, 2010 at 12:57 am
Good post. You’re right that failure to replicate is not in itself evidence of misconduct.
I do wonder what happened with David Glynn – is he being used as a scapegoat re: the PSRB paper?
August 11, 2010 at 1:00 am
P.S. The Cognition retraction has been posted on retractionwatch along with a comment from the Editor.
“An internal examination at Harvard University of the research reported in “Rule learning by cotton-top tamarins,” Cognition 86 (2002), pp. B15-B22, found that the data do not support the reported findings. We therefore are retracting this article. MH accepts responsibility for the error.”
August 11, 2010 at 3:27 am
I think part of DrugMonkey’s point is that there is a gray area of “made up” in an area so fraught with methodological issues and experimenter bias that it is not clearly on one side or the other of the line dividing “intentional fraud” from “honest mistake”.
August 11, 2010 at 4:46 am
You correctly point out that the p<0.05 standard leads to the occasional type I error (a false positive finding). Additionally, even when the original study is correct, the replicating study might suffer from a type II error (a false negative). Sample size in rat experiments is rarely determined by a proper power calculation – people sometimes choose to use 6 rats, or 8 rats, out of habit.
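To illustrate this commenter's point, here is a rough simulation (my own sketch; the helper function and all numbers are illustrative, not from the comment). A perfectly real effect, one standard deviation in size, is studied with only 6 animals per group, and a large fraction of honest replications still fail to reach p<0.05:

```python
import math
import random

def two_sided_p(a, b):
    """Two-sided p-value for a difference in means via a z-test.
    With samples this small a t-test would be more appropriate,
    but the approximation is fine for the qualitative point."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(0)
trials = 2000
detected = 0
for _ in range(trials):
    # A genuinely real effect: group means differ by one standard deviation.
    control = [random.gauss(0, 1) for _ in range(6)]  # n = 6 "out of habit"
    treated = [random.gauss(1, 1) for _ in range(6)]
    if two_sided_p(control, treated) < 0.05:
        detected += 1

power = detected / trials
print(power)  # well below 1.0: many replications of a real effect "fail"
```

In other words, an underpowered replication can miss a true effect, which is yet another reason a failure to replicate is not, on its own, evidence that the original result was fabricated.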
August 11, 2010 at 9:39 am
They left out “Case of scotch received July 8, 2010”.
August 11, 2010 at 10:02 am
I had some prior thoughts on these types of “Yeah, we didn’t prove it before but we’re still right” followups:
http://scientopia.org/blogs/drugmonkey/2010/05/12/it-matters-not-one-whit-to-the-ethics-of-data-faking-if-you-were-correct/
August 11, 2010 at 3:49 pm
Ping!
August 11, 2010 at 7:13 pm
Is it a matter of the data, or of the interpretation of the data?
August 11, 2010 at 7:36 pm
From the available reporting and the available retractions, we don’t yet know the character or the full scope of the misconduct.
August 12, 2010 at 1:18 am
[…] in a post I failed to note yesterday, described several ways in which the affair smelled funny. DrugMonkey pointed out an error (about misreading the significance of failures to replicate) that was emerging in coverage. I […]
August 12, 2010 at 7:29 am
Hauser’s situation is sad and disturbing and carries a lot of ambiguity. I don’t know enough specifics of the case to comment on it, but I can comment in general. First, we do have to think about interpretations. It is well known that different observers may interpret the same situation differently.
Second, we know there is observer bias, where we have a tendency to see what we expect to see. That is why we need multiple observers–to challenge us if we have observer bias.
Third, this situation is yet another reminder of the importance of looking for negative cases; that is, observations that challenge or refute emerging findings. By seeking negative cases, researchers can specify the conditions under which events do and do not occur, and they can also reveal patterns.
Fourth, this could be an argument for linking articles to videotaped data. In that way, other people can do their own interpretations. Of course, this leads the way to each of us possibly expounding on the basis of our biases, but such a course of action also opens up the scientific enterprise and invites vigorous debate.
Marc Hauser is a smart, accomplished man. I hope these are honest mistakes, questions of interpretation and not fraud. The world needs great thinkers like Marc Hauser, given the poverty of thought that dominates the US today in its politics.
Jane Gilgun
August 12, 2010 at 2:21 pm
Be real, people. This is not going to amount to honest mistakes and sloppy record keeping. If it did, Harvard would have handled this quietly and we wouldn’t even know about it. This is a big embarrassment for Harvard and they would have used any excuse to find him guilty of innocent mistakes. Instead, they called it “scientific misconduct”.
And there is no way this is just because of a failure to replicate. That happens all the time. If that were it, we’d constantly be hearing about professors taking a year’s leave and it wouldn’t be news. And it is not about different interpretations of results. That happens even more often. According to the NY Times, Gordon G. Gallup of the State University of New York at Albany examined videotapes from the 1995 Hauser paper and saw no evidence to support the paper’s conclusions.
According to the New Scientist, the investigation started because his own students accused him of data fabrication. Fabrication, not sloppy mistakes or missing records. I am in the Boston area and I’ve heard the rumors about data fabrication myself.
This is terrible because it makes scientists look bad. There are already enough nutcases out there who don’t believe in science. The NIH will probably investigate this next. When they do, I’m betting the result will be “scientific fraud”.
August 15, 2010 at 8:38 pm
Having spent some time collaborating with Harvard biologists and working on campus, this sort of ethical shit is par for the course at that institution. Harvard is packed full of egotistical careerists producing media-friendly research that’s more spin than substance, and that creates a certain tone and a set of pressures.
I am not surprised.
August 16, 2010 at 10:46 am
“harvard biologists”? Broad brush much?
August 17, 2010 at 8:18 am
Greg Laden knows Hauser and Hauser is no datafaker. Or something.
http://scienceblogs.com/gregladen/2010/08/what_i_know_about_marc_hauser.php
June 3, 2011 at 7:20 am
[…] I think we need to be exceptionally clear in the business of science that a failure to replicate is not, in fact evidence of fraud. […]
June 3, 2011 at 8:10 am
Egotism does not necessarily lead to fraud (and there is egotism everywhere, but more than a fair share at Harvard), but it does lead to a culture of assholes who think they can’t be wrong (and therefore don’t try very hard to disprove their own hypotheses) and trainees who are DESPERATE to please. It doesn’t excuse trainees who commit fraud, but PIs who create these bullshit factories a) don’t deserve their jobs, and b) are often very, very successful at publishing, grantsmanship, and academic fame.
June 8, 2013 at 5:16 pm
[…] Watch noted some inside-baseball wrinkles and potential publishing conflict-of-interest issues. DrugMonkey pointed out an error (about misreading the significance of failures to replicate) that was emerging in coverage. I […]
June 8, 2013 at 5:34 pm
[…] (very) quick stop is a post from DrugMonkey, an NIH researcher. DrugMonkey is admittedly just reading the coverage. But the speculations hint […]