Impact Score Versus Percentile Scatter Plot

August 9, 2010

Jeremy Berg, Director of the National Institute of General Medical Sciences, just posted this scatterplot of impact score versus percentile for hundreds of R01s assigned to NIGMS on his blog:

The Overall Impact Score is the average of the final scores (from 1 to 9) given to the grant application by all the members of the study section, multiplied by 10 and rounded (up?) to the nearest whole number. The percentile is the percentage of applications reviewed by that same study section over the current round and the previous two that received better impact scores than the application in question, rounded up to the nearest whole number.
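As a rough sketch of the two definitions above, the arithmetic might look like this in Python. It assumes round-half-up for the impact score (the rounding rule is uncertain, as noted) and takes the percentile exactly as described in this post, not from any official NIH formula. It also assumes that lower scores are better. The function names and sample numbers are made up for illustration.

```python
import math

def overall_impact_score(reviewer_scores):
    """Average the 1-9 reviewer scores, multiply by 10, round to a whole number.

    Assumption: halves round up; the actual rounding rule is not confirmed.
    """
    mean_score = sum(reviewer_scores) / len(reviewer_scores)
    return math.floor(mean_score * 10 + 0.5)

def percentile(impact_score, base_scores):
    """Percent of applications in the base with a strictly better (lower)
    impact score, rounded up to the nearest whole number.

    `base_scores` stands in for everything the same study section reviewed
    in the current round and the previous two.
    """
    better = sum(1 for s in base_scores if s < impact_score)
    return math.ceil(100 * better / len(base_scores))

# Hypothetical example: three reviewers, and a ten-application base.
score = overall_impact_score([2, 2, 3])
base = [10, 20, 20, 30, 40, 50, 60, 70, 80, 90]
print(score, percentile(30, base))
```

Under this reading, the same impact score maps to very different percentiles depending on how the rest of the study section's base is distributed, which is what the scatterplot shows.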

For those of us fascinated by inside NIH grants baseball, this is some serious fucken catnip, as there is a fuckton of interesting stuff in there. One of the most fascinating things to me is the difference it reveals in scoring behavior between study sections.

For example, looking at “milestone” impact scores of 20 and 30 reveals dramatic differences in “score inflation” in different study sections. One study section only scored 3% of its grants better than 20, while another study section scored 20% of its grants better than 20. And one study section only scored 10% of its grants better than 30, while another scored 38% of its grants better than 30.

What would be truly fucken fascinating would be to redraw this scatterplot, with the dots representing funded grants drawn in green (for money!) and the dots representing unfunded grants in brown (for poop!).

Posted by Comradde PhysioProffe
Filed in NIH, NIH Careerism, NIH funding, Peer Review

Responses to “Impact Score Versus Percentile Scatter Plot”

DrugMonkey Says:

August 9, 2010 at 10:52 am
Agreed, this is one of the most fantastic datasets out of NIGMS yet. Along with other implications, it should be used with all new investigators just getting their first sets of scores to show them how things vary from section to section. Absolutist concepts of a “good” or “bad” score are simply the wrong way to think about it.

It is also worth a good BS session about whether all sections *should* have exactly the same calibrations or not.

physioprof Says:

August 9, 2010 at 11:37 am
The nearly thirty percentile range for impact score of thirty is mindblowing. That kind of huge variation in group scoring activity should be very frightening to those whose applications are reviewed in Special Emphasis Panels and percentiled against all-CSR.

DrugMonkey Says:

August 9, 2010 at 11:42 am
“whose applications are reviewed in Special Emphasis Panels and percentiled against all-CSR”

word. I’ve had SEP scores percentiled against the CSR base as well as against the parent study section. I never understood why the difference. Any thoughts? Could have been a change in policy enacted at some point but if so, I missed the notice on this….

DrugMonkey Says:

August 9, 2010 at 11:46 am
I am also amused by the number with overall scores of 10-15 or so. When the new instructions came down my Chair claimed that Scarpa or somebody emphasized in the chairs meeting that 1.0 scores were to be reserved for the best evah, once in a reviewing lifetime apps. Looks like not every study section got this message since these are scores from a single round of review and a mere 654 apps and all…

physioprof Says:

August 9, 2010 at 12:17 pm
Many SEPs don’t have a single “parent” study section.

DrugMonkey Says:

August 9, 2010 at 12:32 pm
Sure, and some do. Are you thinking this is the only difference? When all the applications come from a single parent study section, the percentile is against the study section base, else against the CSR base?

physioprof Says:

August 9, 2010 at 12:33 pm
Maybe.

Anonymoustache Says:

August 10, 2010 at 4:23 am
So what’s the long term solution? Some kind of normalization of the scores relative to the median score meted out by the study section?

pinus Says:

August 10, 2010 at 5:54 am
Is there a way to get linear regressions from different study sections, in order to confirm that greatly differing scoring standards are being used? I am not sure that this would be useful or even make sense.
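One way this could be sketched, assuming the (impact score, percentile) pairs could be grouped by study section, which the posted plot does not obviously allow: fit an ordinary least-squares line per section and compare slopes. The two "sections" below are hypothetical. They reuse the 20 and 30 milestone figures from the post, and the other points are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical (impact score, percentile) pairs for two study sections.
# Section A hands out good scores freely; section B is stingy with them.
section_a = [(10, 1), (20, 3), (30, 10), (40, 20)]
section_b = [(10, 2), (20, 20), (30, 38), (40, 55)]

slope_a, intercept_a = fit_line(*zip(*section_a))
slope_b, intercept_b = fit_line(*zip(*section_b))
print(f"section A: {slope_a:.2f} percentile points per impact-score point")
print(f"section B: {slope_b:.2f} percentile points per impact-score point")
```

A steeper slope means each impact-score point costs more percentile points in that section. Whether a straight line is a sensible model here is a separate question, since the relationship need not be linear.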

DrugMonkey Says:

August 10, 2010 at 9:51 am
ICs could try z-scores compared to percentiles but one suspects a big part of the issue is differing shapes of the score distributions…
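A minimal sketch of the two normalizations floated in this thread, a z-score within the study section and an offset from the section's median, might look like the following. The score lists are hypothetical, and the helper names are not from any NIH tool.

```python
from statistics import mean, median, pstdev

def z_score(impact_score, section_scores):
    """Standard deviations above (worse) or below (better) the section mean.

    Assumes the section's scores are not all identical (nonzero spread).
    """
    return (impact_score - mean(section_scores)) / pstdev(section_scores)

def median_offset(impact_score, section_scores):
    """Impact score relative to the section's median score."""
    return impact_score - median(section_scores)

# Hypothetical impact scores from a lenient and a harsh study section.
lenient = [12, 15, 18, 20, 22, 25, 28, 30, 35, 40]
harsh = [20, 28, 30, 35, 40, 45, 50, 55, 60, 70]

for name, scores in (("lenient", lenient), ("harsh", harsh)):
    print(name, round(z_score(30, scores), 2), median_offset(30, scores))
```

In this made-up data, an impact score of 30 is worse than typical in the lenient section and better than typical in the harsh one. Both measures only correct for location and spread, so they would not fix the problem if the distributions differ in shape.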

