Why do we participate in manuscript review?
December 21, 2020
Why indeed.
I have several motivations, deployed variably, and therefore my answers to Professor Eisen’s question about a journal-less world vary as well.
First and foremost, I review manuscripts as a reciprocal professional obligation, motivated by my desire to get my own papers published. It is distasteful free-rider behavior not to review at least as often as you require the field to review for you. That is, approximately three times your number of unique-journal submissions, since each submission typically consumes about three reviewers. Should we ever reach a point where I do not expect any such review of my work to be necessary, this prime motivator goes to zero. So, “none”.
The least palatable (to me) motivation is the gatekeeper motivation. I do hope these are the rarest of the reviews I write. Gatekeeper motivation leads to reviews that try really hard to get the editor to reject the manuscript, or to persuade the authors that it really should not be presented to the public in anything conceivably related to its current form. In my recollection, this happens because the work is too slim for even my rather expansive views on the “least publishable unit”, or because there is some really bad interpretation or experimental design going on. In a world where these works appeared as pre-prints, I think I would be largely unmotivated to supply my thoughts in public, mostly because the problems would be obvious to anyone in the field, and therefore what is the point of me posturing around in some bioRxiv comment field about how smart I was to notice them.
In the middle of this space I have the motivation to try to improve the presentation of work that I have an interest in. The most fun papers to review are, of course, the ones directly related to my professional interests. For the most part, I am motivated to see at least some part of the work in print. I hope my critical comments are mostly in the nature of “you need to rein in your expansive claims” and only much less often in the vein of “you need to do more work on what I would wish to see next”. I hate those when we get them, and I hope I only rarely make them.
This latter motivation is, I expect, the one that would most drive me to make comments in a journal-less world. I am not sure that I would do much of it, and the entirely obvious sources of bias in the go/no-go decision make it even more likely that I wouldn’t comment. Look, there isn’t much value in a bunch of congratulatory comments on a scientific paper. The value is in critique, and in drawing together a series of implications for our knowledge on the topic at hand. The latter is what review articles are for, and I am not personally big on those. So that wouldn’t motivate me. Critique? What’s the value? In pre-publication review there is some chance that critique will result in changes where they count: data re-analysis, maybe some more studies added, a more focused interpretive narrative, better contextualization of the work…etc. In post-publication review, it is much less likely to result in any changes. Maybe a few readers will notice something that they hadn’t already come up with for themselves. Maybe. I don’t have the sort of arrogance that thinks I’m some uniquely brilliant reader of the paper. I think people who envision some new world order in which the unique brilliance of their critical reviews is made public have serious narcissism issues, frankly. I’m open to discussion on that, but it is my gut response.
On the flip side of this is cost. If you don’t think the process of peer review in subfields is already fraught with tit-for-tat vengeance-seeking even when it is single-blind, well, I have a Covid cure to sell you. This will motivate people not to post public, unblinded critical comments on their peers’ papers, because they don’t want to trigger revenge behaviors. It won’t just be a tit-for-tat waged in these “overlay” journals of the future or in the comment fields of pre-print servers. Oh no. It will bleed over into all areas of academic science, including grant review, assistant professor hiring, promotion letters, etc., etc. I appreciate that Professor Eisen has an optimistic view of human nature and believes these issues to be minor. I do not have an optimistic view of human nature, and I believe these issues to be hugely motivating.
We’ve watched various attempts at online, post-publication commentary of the journal-club variety crash and burn over the years. Decades, by now. The efforts die from lack of use. Always. People in science just don’t make public review-type comments, despite the means being readily available and simple. I assure you it is not because they lack interesting and productive views on published work. It is because they see very little positive value and a whole lot of potential harm for their careers.
“How do we change this?”, I feel sure Professor Eisen would challenge me.
I submit to you that we start by looking at those who are already keen to take up such commentary. Those who offer their opinions on the work of colleagues at the drop of a hat, with nary a care about how it will be perceived. Why do they do it?
I mean yes, narcissistic assholes, sure, but that’s not the general point.
It is those who feel themselves unassailable. Those who do not fear* any real risk of their opinions triggering revenge behavior.
In short, the empowered. Tenured. HHMI-funded.
So, in order to move into a glorious new world of public post-publication review of scientific works, you have to make everyone feel unassailable. As if their opinions do not have to be filtered, modulated or squelched because of potential career blowback.
__
*Sure, there are those dumbasses who know they are at risk of revenge behavior but can’t stfu with their opinions. I don’t recommend this as an approach, based on long personal experience.
Updating the Erosion of the Purchasing Power of the Modular NIH R01
December 16, 2020
I last updated this topic in mid-2018, using finalized BRDPI inflation adjustment numbers from 2016 and projections out to 2018. The latest numbers give us finalized values to 2019 and projections beyond that. There have been some minor changes from the last set of projections, so it’s worth doing another update.
[Figure: projected purchasing power of the $250,000 modular budget limit, in constant dollars, by fiscal year]
As you can see, the unrelenting march of inflation means that the spending power of the $250K NIH modular budget limit is now projected to be $138,678 for Fiscal Year 2021. That is 55.5% of its 2001 value. Looking at this another way, it takes $442,457 in 2021 dollars to equal the spending power of $250,000 in 2001.
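For anyone who wants to check the shape of this arithmetic, here is a minimal sketch. The cumulative deflator below is an assumed round number for illustration, not the official BRDPI series (which is what produces the exact figures above).

```python
# Sketch of the purchasing-power arithmetic. The deflator is an assumed,
# illustrative value; the post's exact figures come from the finalized and
# projected BRDPI tables.
nominal_cap = 250_000        # modular budget limit, unchanged since FY2001

cumulative_brdpi = 1.80      # assumed FY2001 -> FY2021 inflation factor

real_value = nominal_cap / cumulative_brdpi   # the cap in FY2001 dollars
equivalent = nominal_cap * cumulative_brdpi   # FY2021 dollars matching the FY2001 cap

print(f"FY2021 cap in FY2001 dollars: ${real_value:,.0f}")         # ~$139K
print(f"FY2021 equivalent of the FY2001 cap: ${equivalent:,.0f}")  # ~$450K
```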
So when you start demanding changes in the Modular limit at NIH, the proper value to lobby for is $450,000 per year in direct costs.
It is also critical that scientists getting their start now understand this when receiving career advice on grant strategy from colleagues and mentors who were in mid-career in 2001. Their concepts of what you should be able to accomplish with “one R01 NIH grant” were established under far different conditions. It is unlikely that they have fully adjusted their thinking. They may need to be educated on these specific numbers.
Of course, the NIH is fully aware of this situation and has rejected multiple internal proposals to adjust the modular limit in the past. I’ve seen the slide decks. As you can anticipate, the reason is to keep funding as many grants as possible, so as to juke the success rate stats and pump up award numbers. This is also why across-the-board 10% cuts come down in times of budget stress: cut a $25K module off of nine awards and the $225K in savings funds a tenth award, similarly trimmed, essentially for free.
Note that this reality means that it now takes two R01 grants to have a lab running at the production level that one R01 would cover in 2001. And as we know, the odds of getting funded for any given grant submission are worse. I really don’t want to re-calculate the cumulative probability of landing at least two grants, given X number of submissions. It would be too depressing. [Ok, one quick one. The probability of at least one award in 10 tries when the hit rate is 17.7% is 85.7%, as mentioned in that prior post. This drops to 55.1% for the probability of at least two awards in 10 tries.]
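For the curious, those bracketed numbers fall straight out of the binomial formula, treating each submission as an independent draw at the Hoppe et al. 17.7% rate. A quick check in plain Python:

```python
from math import comb

def p_at_least(k, n, p):
    """Probability of at least k successes in n independent tries at rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

rate = 0.177  # per-application hit rate for white PI apps (Hoppe et al.)
print(f"At least one award in 10 tries:  {p_at_least(1, 10, rate):.1%}")  # 85.7%
print(f"At least two awards in 10 tries: {p_at_least(2, 10, rate):.1%}")  # 55.1%
```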
One of the potential takeaway messages from the Hoppe et al. 2019 finding, and the Open Mike blog post, is that if Black PIs want a better success rate for their applications, perhaps they should work on the right topics. The “right” topics, meaning the ones that enjoy the highest success rates. After all, the messaging around the release of Hoppe was more or less this: applications with Black PIs are only discriminated against because they are disproportionately proposed on topics that are discriminated against.
(Never mind that right in the Abstract of Hoppe they admit this only explains some 20% of the funding gap.)
We find, however, a curious counter to this message buried in the Supplement. I mentioned this in a prior post, but it bears re-posting the data for more memorable impact.
[Figure S6, Hoppe et al. 2019: (left) percentage of applications from Black and white PIs landing in topic clusters across the success-rate quintiles; (right) the accompanying success-rate table]
The left side of Figure S6 in Hoppe et al. shows, for both the African-American/Black and white PI distributions, the percentage of submitted applications that landed in topic clusters within each success-rate quintile. The 1st quintile is the best, i.e., the most readily funded topic clusters. We can see from this that while applications with white PIs are more or less evenly distributed, applications with Black PIs land less frequently in the best-funded topic clusters and more frequently in the lowest-funded ones. Ok, fine, this is the distributional description that underlies much of the takeaway messaging. On the right side of Figure S6 there is a table. No idea why they chose that instead of a graph, but it has the tendency to obscure a critical point.
[Figure: success rates by topic-success quintile and PI race, graphed from the Figure S6 table]
Here I have graphed the data: the success rates for applications falling into each topic-success quintile, by the race of the PI. This, first of all, emphasizes the subvocalized admission that even in the least-fundable topic clusters, applications with white PIs enjoyed a success advantage. This main effect was present in each quintile, unsurprisingly. What also immediately pops out, in a way it does not with the table representation, is that the advantage of applications with white PIs is greatest in the best-funded topic area. Another way to represent this is to calculate the degree to which applications with Black PIs are disadvantaged within each quintile.
[Figure: success rate of applications with Black PIs as a percentage of that for applications with white PIs, by quintile]
This represents the success rate for applications with Black PIs as a percentage of the success rate for applications with white PIs. As you can see, the biggest relative hit is at the first and fifth quintiles, with applications faring best at the middle topic-success quintile. Why? Well, one could imagine all kinds of factors having to do with the review of applications in those topic domains. The Open Mike blog post on ICs with lower funding rates (because they have tiny budgets, in part) may explain the fifth quintile, but it wouldn’t apply to the top quintile. In fact, quite the contrary. Ok, this depiction speaks to the relative hit to success rates within each quintile. But the applicant might be a little more interested in the raw percentage-point hit, especially given the cumulative probability distributions we were discussing yesterday. Recall, the average difference was 7 percentage points (17.7% for white PI apps vs 10.7% for Black PI apps).
[Figure: raw percentage-point success-rate gap between white and Black PI applications, by quintile]
The disparity is highest in the first quintile: a hit of 10.8 percentage points, as opposed to the all-apps average hit of 7.0 percentage points.
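To make the two metrics concrete, the within-quintile ratio versus the raw percentage-point gap, here is a sketch with made-up placeholder rates, not the actual Figure S6 values:

```python
# Placeholder success rates by topic quintile (best -> worst). These are NOT
# the actual Hoppe et al. Figure S6 values; they only illustrate the two
# disparity metrics contrasted above.
white_rate = [0.24, 0.20, 0.18, 0.16, 0.12]   # assumed
black_rate = [0.13, 0.13, 0.13, 0.10, 0.06]   # assumed

for q, (w, b) in enumerate(zip(white_rate, black_rate), start=1):
    ratio = b / w               # Black success as a fraction of white success
    gap_pts = (w - b) * 100     # raw percentage-point hit
    print(f"Q{q}: ratio {ratio:.0%}, gap {gap_pts:.1f} points")
```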
Obviously we cannot draw much more from the available data. But it certainly cautions us that pushing Black applicants to work on the “right” topics is not a clear solution and may even be counterproductive. This applies at the acute level of a current PI deciding what to propose in an application, and what to pursue with a multi-application strategy over time. But it also, importantly, serves up a caution for pipeline solutions that try to get more Black trainees into the right labs so that they will work on the right topics using, in Dr. Collins’ parlance, “upgraded” “methodologies”. If this topic/race disparity is not resolved by the time these new trainees hit the Assistant Professor stage, we are going to push more Black Professors into research domains that are even harder to succeed in.
We may eventually get some more insight. CSR promised this summer to start looking into study section behavior more closely. It may be that the scarcity of applications from Black PIs in the most successful topic domains is due to disproportionately fewer Black PIs in those fields, which in turn leads to fewer of them on the relevant study sections. Even absent that factor, a lower presence in the fields of interest may drive more implicit or explicit bias against the ones who do choose those fields. We just can’t tell without more information about the constitution of study sections and the success rates that emerge from them. Oh, and the exception-pay behavior of the Branches and Divisions within each IC. That is also important to examine, as it may relate to topic domain.
Historical success rates predict current attitudes, redux
December 3, 2020
It is hard to overstate the problem that plummeting success rates at the NIH have caused for biomedical science careers. We have expectations for junior faculty that were developed in the 1980s, and maybe into the 90s. Attitudes that are firmly entrenched in our senior faculty, who got their first awards in the 1980s or even the 1970s…and who were then poised to really rake it in during the doubling interval (since undoubled). Time for a trip down memory lane.
[Figure: NIH R01-equivalent success rates, 1962-2008 (red), and total applications reviewed (blue)]
The red trace depicts success rates from 1962 to 2008 for R01-equivalents (R01, R23, R29, R37). These are not broken down by experienced/new investigator status, nor are new applications distinguished from competing continuation applications. The blue line shows the total number of applications reviewed, and the data in the 60s are listed as “estimated” success rates. (source)
The extension of these data into more recent Fiscal Years can be found over at RePORTER. I like to keep my old graph because NIH has this nasty tendency to disappear the good old days, so we’ll forget how bad things really are now. From 2011 to 2017 success rates hovered between 17 and 19%, and in the past two years we’ve seen 21-22% success.
In the historical trends from about 1980 to the end of the doubling in 2002, we see that 30% success rates ruled the day as the expected average. Deviations were viewed as disaster. In fact, the doubling of the NIH budget over a decade was triggered by success rates falling into the 25% range and everyone screaming at Congress for help. For what it is worth, the greybeards back when I was early in my career were still complaining about the funding rates of the early 1980s. Was that because they were used to the 40% success years right before, dropping down to 30%? Likely. When they were telling us “it’s all cyclical, we’ve seen this before on a decade cycle” during the post-doubling declines…well, it was good to have these sorts of data on hand to head off the gaslighting, I can tell you.
Anyway, the point of the day is that folks who had a nice long run of 30% success rates (overall; it was higher once you were established, aka had landed one grant) are the ones who set, and are still setting, current expectations. Today’s little exercise in the cumulative probability of grant award had me thinking: what does this analysis look like in historical perspective?
[Figure: cumulative probability of at least one award versus number of applications, at 30% and 17.7% per-application success rates]
I’m using the same 17.7% success rate for applications with white PIs reported in Hoppe et al., and 30% as a sort of historical-perspective number. Relevant to tenure expectations, we can see that the kids these days have to work harder. Back in the day, applicants had an 83.2% cumulative probability of an award with just 5 applications submitted. Seems quaint, doesn’t it? Nowadays a white PI would have to submit 9 applications to reach that same chance of funding.
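Checking those two numbers, under the same assumption used throughout, that each application is an independent draw at a constant rate:

```python
# Cumulative probability of at least one award after n submissions.
def p_any(n, p):
    return 1 - (1 - p) ** n

print(f"5 apps at 30%:   {p_any(5, 0.30):.1%}")   # 83.2%, back in the day
print(f"9 apps at 17.7%: {p_any(9, 0.177):.1%}")  # 82.7%, kids these days
```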
How does that square with the usual career advice? Well, of course newbs should not submit an R01 in the first year. Get the lab up and running on startup funds, maybe get a paper, certainly get some solid preliminary data. Put the grant in for October of Year 2 (triaged), wait past a round to do a serious revision, put it in for July. Triaged again in October of Year 3. Two submissions in, starting Year 3. Well, now maybe things are clicking a bit, so the PI manages to get two new proposals together for October and/or February and, if the early one gets in, another revision for July. So in the fall of Year 4 we’re looking at four or five submissions, with a fairly good amount of effort and urgency. This could easily stretch into late Year 4.
Where do the kids these days fit in four more applications?
The cumulative probability of getting NIH grants
December 3, 2020
One of the career strategies we have discussed numerous times in various contexts is how many grant applications one should be submitting to the NIH. I have been a consistent advocate for …more. This recognizes that success rates on a per-application basis have been below 20% for most of my career. Obviously the particular target number varies a lot. Sometimes we are talking about paylines, since those seem to be a hard target for success. Recent paylines at the NCI have been in the high single digits: 7-9%. Or we may talk about NIH-wide success rates overall, accounting not just for payline funding but for pickups above the payline. These numbers change from year to year, but mid-to-upper teens is a pretty fair general estimate.
My usual argument is that investigators who want to get funded should start with the assumption that they are no better than anyone else and need to base their strategy on the average success rate…at the very least.
Dumb old me, math-challenged in the extreme, may have even expressed this as something like “If the payline is 10%, you need to put in 10 applications to even be in the game”. The more math-savvy of you immediately chime in to correct me about calculating cumulative probabilities. This is not a difficult concept. I get it. But my mind likes to forget about it, and I’ve never taken a blog whack at this issue directly, that I can recall.
[Figure: cumulative probability of at least one award versus number of applications, at a 17.7% per-application success rate]
Thanks to this handy binomial probability tool I googled up, we can now contemplate the abyss. Let us suppose a per-application success rate of 17.7%. Mid-to-upper teens, a decent place to start our discussion. And let us gate on the cumulative probability of at least one award. Well, if you put in 5 applications, the odds of at least one of them funding are 62.2%, and if you put in 10 applications, 85.7%. Not too shabby. But it puts a very fine point on the fact that probabilities of award do not add. Fifteen applications are required to get to a ~95% cumulative probability of at least one award.
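The handy tool is just doing the following arithmetic; a minimal sketch, assuming a constant, independent 17.7% rate per application:

```python
p = 0.177  # assumed constant, independent per-application hit rate

for n in (5, 10, 15):
    at_least_one = 1 - (1 - p) ** n
    print(f"{n:2d} applications -> {at_least_one:.1%} chance of at least one award")
# 5 -> 62.2%, 10 -> 85.7%, 15 -> 94.6% (the ~95% level cited above)
```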
Reminder: We are not talking payline here. We are talking the 17.7% hit rate for the NIH-wide average of everything that goes into awards, including all of those various pickup behaviors. If you want the assurance of making the payline at a place like NCI, well…Lord help you, amirite? That sounds totally rough and brutal and even unfair.
Now. Suppose that for some reason, say that your skin reflectance categorizes you as Black instead of white, your success rate was 10.7% instead of 17.7%.
[Figure: cumulative probability of at least one award versus number of applications, at a 10.7% per-application success rate]
Your chances are, of course, somewhat different. The cumulative advantage of putting in more grants, aka working harder, also accrues less surely here. I’ve color-coded a few ~equivalent cumulative probabilities for convenience. Using the Hoppe success rates, a Black applicant has to put in 9 proposals to get approximately the same 62% chance a white applicant achieves with only 5 applications. The same PI would have to put in 18 proposals to approximate the 86% hit rate the white PI gets with only 10 applications. And about 25 proposals to get the ~95% hit rate enjoyed by white applicants who put in 15 proposals.
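Those color-coded equivalences can be computed directly by inverting the cumulative formula; a sketch under the same independence assumption:

```python
from math import ceil, log

def apps_needed(target, p):
    """Smallest n such that 1 - (1 - p)**n >= target."""
    return ceil(log(1 - target) / log(1 - p))

white, black = 0.177, 0.107   # per-application hit rates from Hoppe et al.

for n_white in (5, 10, 15):
    target = 1 - (1 - white) ** n_white   # the white PI's cumulative probability
    print(f"{n_white} apps at 17.7% ({target:.0%}) ~ {apps_needed(target, black)} apps at 10.7%")
# 5 -> 9, 10 -> 18, 15 -> 26 (the post rounds this last one to about 25)
```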
Insert my favorite staring-eyes emoji, available on other platforms.
I would estimate that many Black folks, in the academy and elsewhere, are somewhat used to the idea that they need to grind a bit harder to achieve. At some level it probably doesn’t even really bother some of them on the day-to-day.
But this is a LOT harder. Putting in just one grant proposal is not easy, particularly when you are a brand new Professor. And it is not trivial even when you are a Full Professor with some time and experience under your belt. [Oh, btw, sidebar: Ginther et al. 2018 has a little comment that I probably missed originally: “In results not reported, being a full professor increased the probability of NIH funding by 11.9 ppt (p < .001), but the funding gap remained -12.8 ppt (p < .001).” Yeah, age doesn’t help.] When we are talking 5, 10, 25 applications, it is maybe easy to overlook the sweat that goes into making new and credible proposals. Sure, some can be revised proposals and some are retreads of various prior proposals. But it takes work and sweat to make them competitive. You are not going to enjoy the NIH-wide average hit rate with consistently below-average proposals!
This brings me back to a related issue that appeared in the Ginther et al. 2011 report: “Applications from black and Asian investigators were significantly less likely to receive R01 funding compared with whites for grants submitted once or twice…black and Asian investigators are less likely to be awarded an R01 on the first or second attempt, blacks and Hispanics are less likely to resubmit a revised application, and black investigators that do resubmit have to do so more often to receive an award.” I will admit I still don’t quite understand what they are presenting at the end there. It reads as though they are gating on Black investigators who do eventually win an award and do so on revision, not on the A0 (this sample was back in the A2-permitted days, iirc). This whole passage, however, can be received as “well, just work a little harder to compensate”, as if we’re only talking about an extra revision or two. I probably received it that way myself on first seeing the 2011 paper. And I have to say the “1.7 fold advantage” discussed in Hoppe for the per-application success rates comes across the same way. It can be received as: you just have to write two to get one. Because it focuses us on the “given you did get an award” framing instead of on what it takes to get that award, statistically speaking.
But looking at these cumulative probability graphs really hits differently.
Black PIs don’t have to work just a little harder.
Black PIs have to work a LOT harder.