Tales of LOL-tastic anti-diversity professors
February 28, 2018
There was a comment from girlparts on my prior post which triggered an anecdote from my past. It seemed worth having its own post. I guess in a way it is relevant to the broader question of how one should react if someone speaks disparagingly of “diversity hire” professors. This little experience certainly went into helping me to see yet another way that the Defenders Of Quality are total hypocrites when it is something dear to them. Unsurprisingly because such individuals tend to lean conservative and therefore act like conservatives- i.e., selfishly hypocritical.
girlparts observed:
And, of course, members of underrepresented minorities are much less likely to be able to benefit from knowing someone famous etc.
During one of my science training stops I was in a Department that had a couple of these anti-affirmative action type established Professors. They were loud and confident so we were under no illusions whatsoever about what they thought about a whole host of things. They were walking reddit threads* long before reddit was a thing.
Relevant to this tale is that there were two individuals hired during my association with that Department that were widely and almost openly derided as “dean’s hire” affirmative action appointments. Particularly by the aforementioned rightwinger Defenders of Quality but you tended to hear it from everyone. EveryoneKnows(tm) They Are AffirmativeAction Hires That We Wouldn’t Have Hired Save For The Dean.
Of course they were generally shit on by the department. I was not privy to specific details but I watched as they got crappy space (literally in the basement), nobody seemed to want to collaborate and they always seemed to struggle to get access to resources. Both of them eventually left. This, bad as it is, is not the main point of the tale.
The main point is that a few years later there was a non-minority hire in the department. She had trained in the department and that alone was a tiny bit eyebrow raising because the Department definitely had the ethos of geographic nomadism being the best. It goes without saying that some of the Defenders of Quality had been the loudest about how surely we could not hire our own trainees or anybody too well-associated with the department! That would compromise our quality.
But even better was the fact that soon after the hire it turned out that she was engaged to one of the established faculty. Naturally that guy was one of the jerkiest Defenders of Quality and most fervent Anti-Affirmative-Action Warriors. The most reddit of walking reddit threads. And here he was, engineering the tenure track Assistant Professor appointment of his soon-to-be spouse.
Of course the tale gets even better. There were at least four examples of women married to established professors in the department who had tried to get faculty appointments over the previous decade and a half. None of them got Asst Prof offers and had to settle for bad non-tenure track barely faculty appointments. They struggled along on the margins of slightly above adjunct teaching gigs and shoe string research activities. So on the one hand, of course this couple that pulled it off had to be totally secret about their relationship until after she’d gotten hired.
OTOH… oooooh, baby there were some angry folks.
__
*thanks to someone who may or may not choose to self-identify in the comments for this little gem
Thought of the day
February 27, 2018
Someone on the Twitters was asking for ideas about what to say in response to faculty that say, dismissively, that other faculty members are “diversity hires”. The implication, stated or not by such folk, is that persons of color, or of nonXY chromosomal identity, are clearly inferior merely because of such identities.
In context of prospective new faculty during a hiring cycle, the VeryConcerned person often asserts that they are only concerned with keeping up the standards of the department.
“Can’t have all these inferior diversity hires dragging us down, chaps! Hrm, hrm.”
My thought is this.
In science, the young, new hires are always better than the department’s current average. They have more cutting edge techniques, fresher ideas, less historical baggage and/or likely better collaborative relationships. They are not yet burned out, quite the contrary.
So the VeryConcernedColleague can rest at ease. The new hire is going to improve the Department, no matter who is hired out of the Long List of reasonably attractive candidates.
Generalization, not “reproducibility”
February 26, 2018
The wikipedia entry on Generalization reads:
A generalization (or generalisation) is the formulation of general concepts from specific instances by abstracting common properties.
This is a very clean description of what many scientists think that they are about. I certainly do. I think that we are trying to use our experiments as specific instances from which to identify concepts and phenomena that have common properties with other situations not currently being tested. Thus our results should, we hope, generalize as predictions of what will happen in other situations.
Usually situations related to human health and behavior.
A recent paper by Voelkl and colleagues talks about this but totally borks the framing and terminology. They continually misuse “reproducibility” when they really mean to refer to generalization. And this harms science.
First, a quick overview. What Voelkl et al. present is a study which conducts meta-analysis of published studies. This technique includes a host of different studies which use approximately the same methods to address approximately the same question. The outcome of such a meta-analysis can tell us if a given qualitative interpretation is more likely to be true than not (think of it as a box score of the outcomes weighted by some qualities of the specific studies) and estimate the effect size (distance of mean effect relative to the variation expected, i.e. Cohen’s d is most comprehensible to me).
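For concreteness, the standard textbook form of Cohen's d (the post invokes it but does not spell it out) is the difference in group means scaled by the pooled standard deviation:

$$ d = \frac{\bar{x}_1 - \bar{x}_2}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$

That is, the distance between the means in units of the pooled within-group variation, exactly the "distance of mean effect relative to the variation expected" described above.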
As you can imagine, in a typical meta-analysis the studies vary quite a bit in detail. Perhaps it is the strain of rat being used. Or the sex. Or the light cycle the studies were run in. Perhaps it is the time of year or the humidity of the building. And most frequently there is variation in the scientists who are doing the investigating.
Meta-analysis is a test of generalization!
This is critical.
The big point in the Voelkl paper is that individual papers which include data sets on the same topic from multiple labs are more closely correlated with the meta-analytic result. As the authors put it in the Discussion:
Using simulated sampling, we compared the outcomes of single- and multi-laboratory studies, using the same overall number of animals, in terms of their accuracy of effect size estimates (pc) and FNR. For these simulations, we chose to use a large sample of published data from preclinical studies to guarantee that the results reflect real-life conditions. We found that pc increased substantially with the number of participating laboratories, without causing a need for larger sample sizes. This demonstrates that using more representative study samples through multi-laboratory designs improves the external validity and reproducibility of preclinical animal research.
Well, no shit Sherlock. A multi-laboratory study is already a test of generalization. It says that the same qualitative interpretation can be drawn from the study regardless of variation in laboratory, personnel and probably some other key variables. Since this is also what the meta-analysis is testing, it is no surprise whatever that this would be the result.
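For the skeptical, here is a minimal simulation sketch of that logic. This is my own toy illustration, not the Voelkl et al. code or data, and every number in it (true effect, between-lab spread, group sizes) is invented: each lab has its own idiosyncratic effect size, and a fixed total number of animals is either spent in one lab or spread across several.

```python
import numpy as np

rng = np.random.default_rng(0)

TRUE_EFFECT = 0.5   # population-average treatment effect (standardized); invented
LAB_SD = 0.5        # between-lab spread in that effect; invented
N_TOTAL = 30        # animals per treatment arm, summed across labs
N_SIM = 2000        # simulated "studies" per design

# a hypothetical population of labs, each with its own idiosyncratic effect size
lab_effects = rng.normal(TRUE_EFFECT, LAB_SD, 1000)

def one_experiment(effect, n):
    """One two-group experiment in one lab; returns the estimated effect size."""
    control = rng.normal(0.0, 1.0, n)
    treated = rng.normal(effect, 1.0, n)
    pooled_sd = np.sqrt((control.var(ddof=1) + treated.var(ddof=1)) / 2)
    return (treated.mean() - control.mean()) / pooled_sd

def one_study(n_labs):
    """Split the same total sample across n_labs labs and average the estimates."""
    labs = rng.choice(lab_effects, size=n_labs, replace=False)
    n_per_lab = N_TOTAL // n_labs
    return np.mean([one_experiment(e, n_per_lab) for e in labs])

for n_labs in (1, 3, 5):
    estimates = np.array([one_study(n_labs) for _ in range(N_SIM)])
    print(f"{n_labs} lab(s): mean |error| of effect estimate = "
          f"{np.mean(np.abs(estimates - TRUE_EFFECT)):.2f}")
```

With the same total N, the multi-laboratory designs land closer to the population-average effect because the lab-specific idiosyncrasies average out. That is the generalization point, whatever label we put on it.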
But. These authors use “reproducibility”. The Wikipedia entry on this topic is a disaster which conflates several key issues together, most pertinently generalization, reproducibility and replicability. It starts out okay:
Reproducibility is the ability to get the same research results or inferences, based on the raw data and computer programs provided by researchers.
Absolutely. Reproducibility is indeed the ability to reach the same conclusion (inferences) based on doing everything just like the other researchers did it. Great. It then immediately goes off the rails:
A related concept is replicability, meaning the ability to independently achieve non identical conclusions that are at least similar, when differences in sampling, research procedures and data analysis methods may exist.
What? That sounds more like a flexible version of reproducibility. If I had to try to parse out a difference for replicability I might observe that the term “replicates” gives us a clue. As it does further down in the Wikipedia entry, which now conflates the term repeatable with replicable.
The other component is repeatability which is the degree of agreement of tests or measurements on replicate specimens by the same observer in the same laboratory. … Although they are often confused, there is an important distinction between replicates and an independent repetition of an experiment. Replicates are performed within an experiment.
Seriously, who is editing this thing? Replicable now equals repeatable, which refers to whether all the subjects in your sample are doing the same thing, more or less. I can get behind this needing a separate term but can we just pick one please? And not confuse that with the issue of whether the scientific result (“inference”) can be reproduced or will generalize?
Back to reproducibility.
A particular experimentally obtained value is said to be reproducible if there is a high degree of agreement between measurements or observations conducted on replicate specimens in different locations by different people
See how they’ve immediately diverged? This may or may not be generalization depending on what you call “replicate specimens”. To most eyes this means a whole different experiment which is for sure a test of generalization. Well, the entry immediately makes clear that the intent is to conflate:
in science, a very well reproduced result is one that can be confirmed using as many different experimental setups as possible and as many lines of evidence as possible
The line about “as many different setups as possible” is the essence of generalization. And if that isn’t enough confusion this sentence brings in converging evidence which is another concept entirely!
Back to Voelkl et al.:
our results suggest that eliminating these and other risks of bias (e.g., low statistical power, analytical flexibility) is not sufficient to guarantee reproducibility; the results will remain idiosyncratic to the specific laboratory conditions unless these conditions are varied.
“Idiosyncratic” here means reproducible. It means that if you keep the conditions identical, you should be able to repeat the experiment over and over and come up with the same approximate finding (“inference”). This finding can be endlessly reproducible, be built on experiments that are highly replicable within the samples and still fail to generalize beyond the idiosyncratic way that a given lab chooses to run the experiment.
So why do I say this failure to be clear about what we mean harms science?
Well, we are deep in the midst of much furor about a “reproducibility crisis” in science. There isn’t one. Or at least if there is one, it has not been demonstrated clearly. The low grade annoyance of writing and reviewing the NIH grant section on Rigor is not a huge deal (at least it hasn’t been for me so far). But it is yet another thing for people to beat up grants, possibly for no good reason. On the other end of the scale this will eventually be grist for conservative Congress Critters trying to cut investment in research. Somewhere in between lies the goal of the BigPharma voices promoting the lie so as to further offload their research and development costs onto the public purse.
The more immediate problem is that if we are not clear about what we mean in this discussion, our solutions will never solve anything, and may even hurt. I believe that to some extent people are indeed accusing science of having a reproducibility problem. Meaning, one assumes, that significant amounts of published work come to inferences that cannot be sustained if the experiments are done in exactly the same way. The solution for this, one deduces, can only be that each lab must perform many replicate experiments to provide improved confidence on reproducibility prior to publishing. “Make those sloppy bastards repeat it six times and I won’t have to work so hard to figure out how to get my experiment working“, goes the thinking. I guess.
One interpretation of what Voelkl and colleagues are saying is that this won’t help at all.
Besides known differences between the studies included in our analysis, such as the species or strain of animals (i.e., genotype) or reported differences in animal husbandry and experimental procedures, sources of variation included also many unknown and unknowable differences, such as the influence of the experimenter [38,39] or the microbiome [40], as well as subtle differences in visual, olfactory, and auditory stimulation. All those factors might affect treatment effects. Multi-laboratory designs are ideal to account for all of these sources of between-laboratory variation and should therefore replace standardized single-laboratory studies as the gold standard for late-phase preclinical trials
If we don’t do work in a way that can test how well a conclusion generalizes across these issues, we will never solve the real problem. We will not know the limits of said generalization (it is not one thing, btw), the key experimental factors and the irrelevant detail. Instead we will continue to promote a collection of arbitrary and highly constrained experimental parameters and talk as if surely our results will generalize to a treatment medication for humans in rapid order.
In point of fact working to improve reproducibility (as we all do!) may be directly opposed to improving generalization and thereby compromise translation to helping improve human health.
And despite where people in science are pointing the finger of blame (i.e., the reproducibility of inferences that we can make using precisely the same approaches), they are really motivated and angered by the lack of generalization.
Seriously, listen to what the scientists who are eager to be puppeted by Big Pharma have to say. Listen to their supposed examples that show “the problem is real”. Look at what makes them really mad. Ask about their attempts to perform experiments related to the ones in the published literature that anger them so much. You will be more likely to conclude that they are not in fact miffed about directly reproducing a result. More often it is a failure to generalize beyond the original experimental conditions.
__
Voelkl B, Vogt L, Sena ES, Würbel H (2018) Reproducibility of preclinical animal research improves with heterogeneity of study samples. PLoS Biol 16(2): e2003693. https://doi.org/10.1371/journal.pbio.2003693
Manuscript review and the alleged reproducibility crisis
February 22, 2018
If you believe there is a “replication and reproducibility crisis” in science, you have limited proposal options.
Multiple independent replications of each phenomenon in a paper. Not n-up, but triplicate or more full power repeats.
Are you demanding this? Are you doing this in your own research?
Or, we create some way to encourage and enhance direct replications and simple tests of generalization from other labs or in subsequent papers.
I favor this one.
As it happens, I have had multiple incidents in the past few years which address this. As both an author and as a reviewer.
Editors and reviewers for decidedly modest JIF journals are overtly and explicitly saying replications and simple tests of generalization of a finding should not be published.
I can’t stress this enough. We’re not talking “this isn’t cool enough for this 2.5 JIF journal”. These are opinions that such things do not merit publication at all.
Findings that fail to replicate a prior finding (that is actually poorly supported) *simultaneously* take heat for not getting the same result.
Direct replication is too incremental and refutation / modification is too…..doubtful?
As my longer term Readers know, I tend to think this is just the way science works. If you keep at it your manuscript will find a home eventually. It is a PIA but it is not damning of the entire enterprise.
But if there is any validity to the reproducibility claims and you keep braying on about it…I want to know a lot more about how your reviewing behavior matches your fine talk.
I’m also looking forward to NIH grant review in about 3-4 years. We should be expecting the experienced PI to have some specific and concrete examples about their rigor.
Even if it is “this preprint shows our replication attempts, even if idiotic reviewers prevented them from being published”.
It is indubitably better for the postdoctoral training stint if the prospective candidate visits the laboratory before either side commits. The prospective gets a chance to see the physical resources, gets a chance for very specific and focused time with the PI and above all else, gets a chance to chat with the lab’s members.
The PI gets a better opportunity to suss out strengths and weaknesses of the candidate, as do the existing lab members. Sometimes the latter can sniff things out that the prospective candidate does not express in the presence of the PI.
These are all good things and if you prospective trainees are able to visit a prospective training lab it is wise to take advantage.
If memory serves the triggering twittscussion for this post started with the issue of delayed reimbursement of travel and the difficulty some trainees have in floating expenses of such travel until the University manages to cut a reimbursement check. This is absolutely an important issue, but it is not my topic for today.
The discussion quickly went in another direction, i.e. if it is meaningful to the trainee if the PI “won’t pay for the prospective to visit”. The implication being that if a PI “won’t” fly you out for a visit to the laboratory, this is a bad sign for the future training experience and of course all prospectives should strike that PI off their list.
This perspective was expressed by both established faculty and apparent trainees so it has currency in many stages of the training process from trainee to trainer.
It is underinformed.
I put “won’t” in quotes above for a reason.
In many situations the PI simply cannot pay for travel visits for recruiting postdocs.
They may appear to be heavily larded with NIH research grants and still not have the ability to pay for visits. This is, in my experience and that of others chiming in on the Twitts, because our institutional grants management folks tell us it is against the NIH rules. There emerged some debate about whether this is true or whether said bean counters are making an excuse for their own internal rulemaking. But for the main issue today, this is beside the point.
Some PIs cannot pay for recruitment travel from their NIH R01(s).
Not “won’t”. Cannot. Now as to whether this is meaningful for the training environment, the prospective candidate will have to decide for herself. But this is some fourth level stuff, IMO. PIs who have grants management which works at every turn to free them from rules are probably happier than those that have local institutional policies that frustrate them. And as I said at the top, it is better, all else equal, when postdocs can be consistently recruited with laboratory visits. But is the nature of the institutional interpretation of NIH spending rules a large factor against the offerings of the scientific training in that lab? I would think it is a very minor part of the puzzle.
There is another category of “cannot” which applies semi-independently of the NIH rule interpretation- the PI may simply not have the cash. Due to lack of a grant or lack of a non-Federal pot of funds, the PI may be unable to spend in the recruiting category even if other PIs at the institution can do so. Are these meaningful to the prospective? Well the lack of a grant should be. I think most prospectives that seek advice about finding a lab will be told to check into the research funding. It is kind of critical that there be enough for whatever the trainee wants to accomplish. The issue of slush funds is a bit more subtle but sure, it matters. A PI with grants and copious slush funds may offer a better resourced training environment. Trouble is that this comes with other correlated factors of importance. Bigger lab, more important jet-setting PI…these are going to be more likely to have extra resources. So it comes back to the usual trade-offs and considerations. In the face of that it is unclear that the ability to pay for recruiting is a deciding factor. It is already correlated with other considerations the prospective is wrestling with.
Finally we get to actual “will not”. There are going to be situations where the PI has the ability to pay for the visit but chooses not to. Perhaps she has a policy never to do so. Perhaps he only pays for the top candidates because they are so desired. Perhaps she does this for candidates when there are no postdocs in the lab but not when there are three already on board. Or perhaps he doesn’t do it anymore because the last three visitors failed to join the lab*.
Are those bad reasons? Are they reasons that tell the prospective postdoc anything about the quality of the future training interaction?
__
*Extra credit: Is it meaningful if the prospective postdoc realizes that she is fourth in line, only having been invited to join the lab after three other people passed on the opportunity?
Should NIH provide a transcript of the discussion of grants?
February 16, 2018
Respected neuroscientist Bita Moghaddam seems to think this would be a good idea.
She then goes on to mention the fact that POs listen in on grant discussion, can take notes and can give the PI a better summary of the discussion than emerges in the Resume of Discussion written by the SRO.
This variability in PO behavior then leads to some variability in the information communicated to the PI. I’ve had one experience where a PO gave me such chapter and verse on the discussion that it might have been slightly over the line (pre- and post-discussion scores). Maybe two other ones where the PO gave me a very substantial run down. But for the most part POs have not been all that helpful- either they didn’t attend or they didn’t pay attention that closely or they just didn’t care to tell me anything past the “we suggest you revise and resubmit” mantra. She has a good point that it is not ideal that there is so much variability. When I’ve touched on this issue in the past, I’ve suggested this is a reason to cultivate as many POs as possible in your grant writing so that you have a chance of getting the “good” ones now and again. Would providing the transcript of discussion help? Maybe?
Or maybe we could just start lobbying the ICs of our fondest acquaintance to take the effort to make the POs behave more consistently.
But I have two problems with Professor Moghaddam’s proposals. First of course, is the quashing effect that de-anonymizing (and while a transcript could still be anonymized it is in the same vein of making reviewers hesitate to speak up) may have on honest and open comment. The second problem is that it goes into reinforcing the idea that properly revising a grant application is merely “doing what they said to do”. Which then should, the thinking goes, make the grant fundable next time.
This is, as you know, not the way the system is set to work and is a gut-feeling behavior of reviewers that the CSR works hard to counter. I don’t know if having the transcript would help or hurt in this regard. I guess it would depend on the mindset of the PI when reading the transcript. If they were looking to merely suss out* the relative ratio of seriousness of various critiques perhaps this would be fine?
__
*My fear is that this would just feed the people who are looking to litigate their review to “prove” that they got screwed and deserve funding.
Creative artists and the writing of scientific manuscripts
February 15, 2018
I am a consumer of the creative arts and, really, have always been in awe of creative artists. Looking back chronologically over my lifetime, my greatest consumption and appreciation has been fiction writing, music and cartooning (particularly the political variety). I’m not a big fan of flat art (sculpture speaks to me much more) but I am definitely amazed by what some people can paint, draw and the like. I do like moving picture arts but I don’t think I have any particular sense of awe for them as a craft and certainly not for the participants as creative artists*. I get that others can see this, however.
Anyway, the creative artists are amazing to me.
A couple of days ago it occurred to me that understanding the process of creative arts might help cross what I find to be a somewhat frustrating bridge in training other people to write scientific manuscripts.
Sidebar: I am pretty sure we’ve discussed related topics before on the blog, but I can’t remember when so I’m probably going to repeat myself.
When I first started to write scientific manuscripts I quite reasonably suffered the misunderstanding that you sort of did the experiments you planned and then wrote them (all of them) up in chronological order and badda boom, published it somewhere. That is because, I assume, many scientific manuscripts read as if that is how they were created. And there are probably some aspects of “Research Design 101” instruction that convinces young scientists that this is the way things work.
Then, when it is your own work, there are two additional factors that press down and shape your writing process. First, a sense of both pride and entitlement for your effort which tells your brain that surely every damn thing you worked on needs to fuel a publication. Second, a sense that writing is hard and you want to know in advance exactly what to write so that no effort is wasted.
“Wasted”.
And this is where the creative arts come in.
Now, I’ve never lived cheek by jowl with a creative artist and I am only superficially familiar with what they do. But I am pretty convinced it is an iterative, inefficient process. Flat art folks seem to sketch. A lot. They work on an eye. An expression. A composition. A leg. Apple. Pair of shoes. Novelists and short story authors work on themes. Characters. Plot elements. They write and tear their hair out. Some of this is developing skill, sure, but much of this for a reasonably mature creative person is just working the job. They create copious amounts of material that is only leading up to the final product.
And the final product, I surmise, is built from the practice elements. A plot or character for a story. A curve of a mouth for a portrait. Melody. Chord progressions. A painted sunbeam. The artist starts stitching together a complete work out of elements.
I think you need to get into this mode as a scientist who is writing up manuscripts.
We stitch together a work out of elements as well. Now in our case, the elements are not made up. They are data. That we’ve collected. And we spend a heck of a lot of time on the quality of those elements. But eventually, we need to tell a story from those parts.
N.b. This is not storyboarding. Storyboarding is setting out the story you want to tell and then later going out and creating the elements (aka, figures) that you need to tell this particular story. That way lies fraud.
The creative process is looking at the elements of truth that you have available to you, from your labors to create good data, and then trying to see how they fit together into a story.
The transition that one has to make as a scientist is the ability to work with the elements, put in serious labor trying to fit them together, and then being willing to scrap the effort and start over. I think that if you don’t get in there and do the work writing, writing, writing and analyzing and considering what the data are telling you, you make less progress.
Because the alternative is paralyzing. The alternative is that you keep putting off the creative process until something tells you how to write “efficiently”. Maybe it is that you are waiting for just the right experimental result to clarify a murky situation. Maybe you are waiting for your PI or collaborator or fellow trainee to tell you what to do, what to write, how to structure the paper.
I suppose it may look like this to a relatively inexperienced writer of manuscripts? That it’s a bit daunting and that if only the PI would say the right words that somehow it would be magically easy to “efficiently” write up the paper in the right way that she expects?
When I hear generic muttering from trainees about frustration with insufficient feedback from a mentor I sometimes wonder if this is the problem. An over expectation of specific direction on what to write, how to write and what the story is.
The PI, of course, wants the trainee to take their own shot at telling the story. Whereupon they will promptly red pen the hell out of all that “work” and tell the trainee to rewrite most of it and take a totally different tack. Oh, and run these two more experiments. And then the trainee wonders “why didn’t my PI tell me what she wanted in the first place instead of wasting my time??? GAh, I have the worst possible mentor!”
I realized within the past year or so that I have the same problem that I have criticized on the blog for years now. I tell new professors that they need to get away from the bench as quickly as possible and that this is not their job anymore. I tell them they have to find a way to get productivity out of their staff and that doing experiments is not their job anymore. I never had this problem as a transitioning scientist…I was fine getting away from the bench**.
But my equivalent is data analysis. And I’m not talking high falutin’ stuff that only I can do, either. I want to see the data! Study by study. As it rolls in, even. I want to examine it, roll it around in it. Create graphs and run some stats. Think about what it means and how it fits into my developing understanding of a research direction in our laboratory. I can’t wait to think about how this new figure might fit into one of our ongoing creative works…i.e., a manuscript.
I cannot give it up.
I create a lot of sketches, half plotted stories and cartoon panels. Elements. Themes. Drafts.
Many of these will never go into any published manuscript. If lucky some of these building blocks will make their way into a slide presentation or a into a grant as preliminary data. I never feel as though the effort is wasted, however. Making these bits and pieces is, to me, what allows me to get from here to there. From blank page to published manuscript.
Ideally, as I am supposedly training people to become independent scientists, I would like to train them to do this in the way that I do. And to get there, I have to get them across the hurdle of the creative artist. I have to get them to see that just rolling up your sleeves and doing the work is a necessary part of the process. You cannot be told a route, or receive a Revelation, that makes the process of creating a scientific manuscript efficient. You have to work on the elements. Make the sketches. Flesh out the plotlines.
And then be willing to scrap a bunch of “work” because it is not helping you create the final piece.
__
*I have a friend that is behind the camera on teevee shows. Big name teevee shows that you’ve heard of and watch. I see his work and I’m not really Seeing. His. Work. But this guy casually takes a few vacation pictures and I’m amazed at his eye, composition, etc. He doesn’t seem to even consider himself a still camera artist, acts like he considers himself barely a hobbyist at that! So clearly I’m missing something about moving picture photography.
**I’m not actually a bench scientist, the ~equivalent.
Theological waccaloons win because they are powered by religious fervor and exhaust normal people
February 14, 2018
Some self-congratulatory meeting of the OpenAccess Illuminati* took place recently and a summary of takeaway points has been posted by Stephen Curry (the other one).
These people are exhausting. They just keep bleating away with their talking points and refuse entirely to ever address the clear problems with their plans.
Anonymous peer review exists for a reason.
To hear them tell it, the only reason is so hateful incompetent reviewers can prevent their sterling works of genius from being published right away.
This is not the reason for having anonymous peer review in science.
Their critics regularly bring up the reason we have anonymous peer review and the virtues of such an approach. The OA Illuminati refuse to address this. At best they will vaguely acknowledge their understanding of the issue and then hand wave about how it isn’t a problem just …um…because they say so.
It’s also weird that 80%+ of their supposed problems with peer review as we know it are attributable to their own participation in the Glamour Science game. Some of them also see problems with GlamHumping but they never connect the dots to see that Glamming is the driver of most of their supposed problems with peer review as currently practiced.
Which tells you a lot about how their real goals align with the ones that they talk about in public.
Edited to add:
Professor Curry weighed in on twitter to insist that the goal is not to force everyone to sign reviews. See, his plan allows people to opt out if they choose. This is probably even worse for the goal of getting an even-handed and honest review of scientific papers. And even more tellingly, it designs the experiment so that it cannot do anything other than provide evidence in support of their hypothesis. Neat trick.
Here’s how it will go down. People will sign their reviews when they have “nice, constructive” things to say about the paper. BSDs, who are already unassailable and are the ones self-righteously saying they sign all their reviews now, will continue to feel free to be dicks. And the people** who feel that attaching their name to their true opinion is risky will still feel pressure. To not review, to soft-pedal and sign or to supply an unsigned but critical review. All of this is distorting.
Most importantly for the open-review fans, it will generate a record of signed reviews that seem wonderfully constructive or deserved (the Emperor’s, sorry BSDs, critical pants are very fine indeed) and a record of seemingly unconstructive critical unsigned reviews (which we can surely dismiss because they are anonymous cowards). So you see? It proves the theory! Open reviews are “better” and anonymous reviews are mean and unjustified. It’s a can’t-miss bet for these people.
The choice to not-review is significant. I know we all like to think that “obvious flaws” would occur to anyone reading a paper. That’s nonsense. Having been involved in manuscript and grant review for quite some time now I am here to tell you that the assigned reviewers (typically 3) all provide unique insight. Sometimes during grant review other panel members see things the three assigned people missed and in manuscript review the AE or EIC see something. I’m sure you could do parallel sets of three reviewers and it would take quite a large sample before every single concern has been identified. Compare this experience to the number of comments that are made in all of the various open-commenting systems (PubMed Commons commenting system was just shuttered for lack of general interest, by the way) and we simply cannot believe claims that any reviewer can be omitted*** with no loss of function. Not to mention the fact that open commenting systems are just as subject to the above discussed opt-in problems as are signed official review systems of peer review.
__
*hosted at HHMI headquarters which I’m sure tells us nothing about the purpose
**this is never an all-or-none associated with reviewer traits. It will be a manuscript-by-manuscript choice process which makes it nearly impossible to assess the quelling and distorting effect this will have on high quality review of papers.
***yes, we never have an overwhelmingly large sample of reviewers. The point here is the systematic distortion.
NIH encourages pre-prints
February 13, 2018
In March of 2017 the NIH issued a notice on Reporting Preprints and Other Interim Research Products (NOT-OD-17-050): “The NIH encourages investigators to use interim research products, such as preprints, to speed the dissemination and enhance the rigor of their work.“.
The key bits:
Interim Research Products are complete, public research products that are not final.
A common form is the preprint, which is a complete and public draft of a scientific document. Preprints are typically unreviewed manuscripts written in the style of a peer-reviewed journal article. Scientists issue preprints to speed dissemination, establish priority, obtain feedback, and offset publication bias.
Another common type of interim product is a preregistered protocol, where a scientist publicly declares key elements of their research protocol in advance. Preregistration can help scientists enhance the rigor of their work.
I am still not happy about the reason this happened (i.e., Glam hounds trying to assert scientific priority in the face of the Glam Chase disaster they themselves created) but this is now totally beside the point.
The NIH policy (see OpenMike blog entry for more) has several implications for grant seekers and grant holders which are what form the critical information for your consideration, Dear Reader.
I will limit myself here to materials that are related to standard paper publishing. There are also implications for materials that would never be published (computer code?) but that is beyond the scope for today’s discussion.
At this point I will direct you to bioRxiv and PsyRxiv if you are unfamiliar with some of the more popular approaches for pre-print publication of research manuscripts.
The advantages to depositing your manuscripts in a pre-print form are all about priority and productivity, in my totally not humble opinion. The former is why the Glamour folks are all a-lather but priority and scooping affect all of us a little differently. As most of you know, scooping and priority are not a huge part of my professional life but all things equal, it’s better to get your priority on record. In some areas of science it is career making/breaking and grant getting/rejecting to establish scientific priority. So if this is a thing for your life, this new policy allows and encourages you to take advantage.
I’m more focused on productivity. First, this is an advantage for trainees. We’ve discussed the tendency of new scientists to list manuscripts “in preparation” on their CV or Biosketch (for fellowship applications, say, despite it being technically illegal). This designation is hard to evaluate. A nearing-defense grad student who has three “in prep” manuscripts listed on the CV can appear to be bullshitting you. I always caution people that if they list such things they had better be prepared to send a prospective post-doc supervisor a mostly-complete draft. Well, now the pre-print allows anyone to post “in preparation” drafts so that anyone can verify the status. Very helpful for graduate students who have a short timeline versus the all too typical cycle of submission/rejection/resubmission/revision, etc. More importantly, the NIH previously frowned on listing “in preparation” or “in review” items on the Biosketch. This was never going to result in an application being returned unreviewed but it could sour the reviewers. And of course any rule followers out there would simply not list any such items, even if there was a minor revision being considered. With pre-print deposition and the ability to list on a NIH biosketch and cite in the Research Plan there is no longer any vaporware type of situation. The reviewer can look at the pre-print and judge the science for herself.
This applies to junior PIs as well. Most likely, junior PIs will have fewer publications, particularly from their brand new startup labs. The ability of the PI to generate data from her new independent lab can be a key issue in grant review. As with the trainee, the cycle of manuscript review and acceptance is lengthy compared with the typical tenure clock. And of course many junior PIs are trying to balance JIF/Glam against this evidence of independent productivity. So pre-print deposition helps here.
A very similar situation can apply to us not-so-junior PIs who are proposing research in a new direction. Sure, there is room for preliminary data in a grant application but the ability to submit data in manuscript format to the bioRxiv or some such is unlimited! Awesome, right?
Endnote
February 10, 2018
Nobody who is younger than me in the scientific generation sense should ever be manually entering references in manuscripts or grant applications.
Ever.
Rigor, reproducibility and the good kid
February 9, 2018
I was the good kid.
In my nuclear family, in school and in pre-adult employment.
At one point my spouse was in a very large lab and observed how annoying it is when the PI reads everyone the riot act about the sins of a few lab-jerks.
Good citizens find it weird and off-putting when they feel criticized for the sins of others.
They find it super annoying that their own existing good behavior is not recognized.
And they are enraged when the jerko is celebrated for finally, at last managing to act right for once.
Many of us research scientists feel this way when the NIH explains what they mean by their new initiative to enhance “rigor and reproducibility”.
____
“What? I already do that, so does my entire subfield. Wait…..who doesn’t do that?” – average good-kid scientist response to hearing the specifics of the R&R initiative.
SABV in NIH Grant Review
February 8, 2018
We’re several rounds of grant submission/review past the NIH’s demand that applications consider Sex As a Biological Variable (SABV). I have reviewed grants from the first round of this obligation until just recently and have observed a few things coming into focus. There’s still a lot of wiggle and uncertainty but I am seeing a few things emerge in my domains of grants that include vertebrate animals (mostly rodent models).
1) It is unwise to ignore SABV.
2) Inclusion of both sexes has to be done judiciously. If you put a sex comparison in the Aim or too prominently as a point of hypothesis testing you are going to get the full blast of sex-comparisons review. Which you want to avoid because you will get killed on the usual: power (see the sketch after this list), estrus effects that “must” be there, various caveats about why male and female rats aren’t the same – behaviorally, pharmacokinetically, etc etc – regardless of what your preliminary data show.
3) The key is to include both sexes and say you will look at the data to see if there appears to be any difference. Then say the full examination will be a future direction or slightly modify the subsequent experiments.
4) Nobody seems to be fully embracing the SABV concept as laid out in the formal pronouncements, i.e., that you keep using sample sizes that are half males and half females in perpetuity if you don’t see a difference. I am not surprised. This is the hardest thing for me to accept personally and I know for certain sure manuscript reviewers won’t go for it either.
Then there comes the biggest categorical split in approach that I have noticed so far.
5a) Some people appear to use a few targeted female-including (yes, the vast majority still propose males as default and females as the SABV-satisfying extra) experiments to check main findings.
5b) The other take is just to basically double everything up and say “we’ll run full groups of males and females”. This is where it gets entertaining.
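To put rough numbers behind the power complaint flagged in point 2, and behind what “running full groups of males and females” really costs in animals, here is a quick sketch. The effect sizes are invented purely for illustration: suppose the pooled treatment effect is d = 0.8 and a plausible sex difference in that effect is half as large.

```python
# Rough power arithmetic; both effect sizes below are invented for illustration.
from statsmodels.stats.power import TTestIndPower

solver = TTestIndPower()
scenarios = [
    ("main treatment effect, sexes pooled (d = 0.8)", 0.8),
    ("sex difference in that effect (d = 0.4)",       0.4),
]
for label, d in scenarios:
    # animals per group for 80% power, two-sided alpha = 0.05
    n = solver.solve_power(effect_size=d, alpha=0.05, power=0.8,
                           ratio=1.0, alternative="two-sided")
    print(f"{label}: ~{n:.0f} per group")
```

If the sex comparison is half the size of the main effect, powering it takes roughly four times the animals per group; splitting the same old group sizes into half males and half females does not get you there. That arithmetic is behind both the reviewer pile-on and the budget problem that comes next.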
I have been talking about the fact that the R01 doesn’t pay for itself for some time now.
A full modular, $250K per year NIH grant doesn’t actually pay for itself.
the $250K full modular grant does not pay for itself. In the sense that there is a certain expectation of productivity, progress, etc on the part of study sections and Program that requires more contribution than can be afforded (especially when you put it in terms of 40 hr work weeks) within the budget.
The R01 still doesn’t pay for itself and reviewers are getting worse
I have reviewed multiple proposals recently that cannot be done. Literally. They cannot be accomplished for the price of the budget proposed. Nobody blinks an eye about this. They might talk about “feasibility” in the sense of scientific outcomes or preliminary data or, occasionally, some perceived deficit of the investigators/environment. But I have not heard a reviewer say “nice but there is no way this can be accomplished for $250K direct”.
Well, “we’re going to duplicate everything in females” as a response to the SABV initiative just administered the equivalent of HGH to this trend. There is approximately zero real world dealing with this in the majority of grants that slap in the females and from what I have seen no comment whatever from reviewers on feasibility. We are just entirely ignoring this.
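A back-of-the-envelope version of the “cannot be done for the price” problem, with entirely made-up round numbers (actual salaries, fringe rates and per diems vary by institution, so plug in your own):

```python
# All figures below are invented round numbers for illustration, not real rates.
personnel = {
    "PI effort (partial salary + fringe)":    40_000,
    "postdoc (salary + fringe)":              65_000,
    "graduate student (stipend + tuition)":   45_000,
    "technician (salary + fringe)":           55_000,
}
operating = {
    "animal purchases and per diems":         30_000,
    "supplies and reagents":                  35_000,
    "publication, travel, misc":              10_000,
}
total = sum(personnel.values()) + sum(operating.values())
print(f"Hypothetical annual direct costs: ${total:,} vs the $250,000 modular cap")
# -> $280,000 before a single experiment is duplicated in the other sex.
```

Even with these modest hypothetical numbers the year is overspent before any group is doubled up for SABV.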
What I am really looking forward to is the review of grants in about 3 years time. At that point we are going to start seeing competing continuation applications where the original promised to address SABV. In a more general sense, any app from a PI who has been funded in the post-SABV-requirement interval will also face a simple question.
Has the PI addressed SABV in his or her work? Have they taken it seriously, conducted the studies (prelim data?) and hopefully published some things (yes, even negative sex-comparisons)?
If not, we should, as reviewers, drop the hammer. No more vague hand wavy stuff like I am seeing in proposals now. The PI had better show some evidence of having tried.
What I predict, however, is more excuse making and more bad faith claims to look at females in the next funding interval.
Please prove me wrong, scientists in my fields of study.
__
Additional Reading:
NIH’s OER blog Open Mike on the SABV policies.
NIH Reviewer Guidance [PDF]
Undue influence of frequent NIH grant reviewers
February 7, 2018
A quotation
Currently 20% of researchers perform 75-90% of reviews, which is an unreasonable and unsustainable burden.
referencing this paper on peer review appeared in a blog post by Gary McDowell. It caught my eye when referenced on the twitts.
The stat is referencing manuscript / journal peer review and not the NIH grant review system but I started thinking about NIH grant review anyway. Part of this is because I recently had to re-explain one of my key beliefs about a major limitation of the NIH grant review system to someone who should know better.
NIH Grant review is an inherently conservative process.
The reason is that the vast majority of reviews of the merit of grant applications are provided by individuals who already have been chosen to serve as Principal Investigators of one or more NIH grant awards. They have had grant proposals selected as meritorious by the prior bunch of reviewers and are now contributing strongly to the decision about the next set of proposals that will be funded.
The system is biased to select for grant applications written in a way that looks promising to people who have either been selected for writing grants in the same old way or who have been beaten into writing grants that look the same old way.
Like tends to beget like in this system. What is seen as meritorious today is likely to be very similar to what has been viewed as meritorious in the past.
This is further amplified by the social dynamics of a person who is newly asked to review grants. Most of us are very sensitive to being inexperienced, very sensitive to wanting to do a good job and feel almost entirely at sea about the process when first asked to review NIH grants. Even if we have managed to stack up 5 or 10 reviews of our proposals from that exact same study section prior to being asked to serve. This means that new reviewers are shaped even more by the culture, expectations and processes of the existing panel, which is staffed with many experienced reviewers.
So what about those experienced reviewers? And what about the number of grant applications that they review during their assigned term of 4 (3 cycles per year, please) or 6 (2 of 3 cycles per year) years of service? With about 6-10 applications to review per round, and roughly 12 rounds in either term, this could easily add up to highly influential (read: one of the three primary assigned reviewers) review of 100 applications. The person has additional general influence in the panel as well, both through direct input on grants under discussion and on the general tenor and tone of the panel.
When I was placed on a study section panel for a term of service I thought the SRO told us that empaneled reviewers were not supposed to be asked for extra review duties on SEPs or as ad hoc on other panels by the rest of the SRO pool. My colleagues over the years have disabused me of the idea that this was anything more than aspirational talk from this SRO. So many empaneled reviewers are also contributing to review beyond their home review panel.
My question of the day is whether this is a good idea and whether there are ethical implications for those of us who are asked* to review NIH grants.
We all think we are great evaluators of science proposals, of course. We know best. So of course it is all right, fair and good when we choose to accept a request to review. We are virtuously helping out the system!
At what point are we contributing unduly to the inherent conservativeness of the system? We all have biases. Some are about irrelevant characteristics like the ethnicity** of the PI. Some are considered more acceptable and are about our preferences for certain areas of research, models, approaches, styles, etc. Regardless these biases are influencing our review. Our review. And one of the best ways to counter bias is the competition of competing biases. I.e., let someone else’s bias into the mix for a change, eh buddy?
I don’t have a real position on this yet. After my term of empaneled service, I accepted or rejected requests to review based on my willingness to do the work and my interest in a topic or mechanism (read: SEPs FTW). I’ve mostly kept it pretty minimal. However, I recently messed up because I had a cascade of requests last fall that sucked me in- a “normal” panel (ok, ok, I haven’t done my duty in a while), followed by a topic SEP (ok, ok I am one of a limited pool of experts I’ll do it) and then a RequestThatYouDon’tRefuse. So I’ve been doing more grant review lately than I have usually done in recent years. And I’m thinking about scope of influence on the grants that get funded.
At some point is it even ethical to keep reviewing so damn much***? Should anyone agree to serve successive 4 or 6 year terms as an empaneled reviewer? Should one say yes to every SRO request that comes along? They are going to keep asking so it is up to us to say no. And maybe to recommend the SRO ask some other person who is not on their radar?
___
*There are factors which enhance the SRO pool picking on the same old reviewers, btw. There’s a sort of expectation that if you have review experience you might be okay at it. I don’t know how much SROs talk to each other about prospective reviewers and their experience with the same but there must be some chit chat. “Hey, try Dr. Schmoo, she’s a great reviewer” versus “Oh, no, do not ever ask Dr. Schnortwax, he’s toxic”. There are the diversity rules that they have to follow as well- There must be diversity with respect to the geographic distribution, gender, race and ethnicity of the membership. So people that help the SROs’ diversity stats might be picked more often than some other people who are straight white males from the most densely packed research areas in the country working on the most common research topics using the most usual models and approaches.
**[cough]Ginther[cough, cough]
***No idea what this threshold should be, btw. But I think there is one.