I find the h-index to be very strange today

June 30, 2011

I like to use ISI’s web of knowledge thingy to keep track of who is citing our* papers. Oftentimes I’ll pull up a few that I haven’t seen yet that are related to our work.

Fortunately, I don’t have consistent cause to review other performance metrics having to do with my pubs because the whole thing kind of gives me a headache.

But I do, now and again, look at the h-index and ponder it. I’m not quite grasping what it tells us, other than one’s longevity in science, but whatever. Seems to me that if you take a slice of your own approximate science-age cohort, then it might be at least somewhat meaningful.

I have a bunch of peers in my approximate subfields, of my approximate career status and, most importantly, who started publishing at approximately the same time as I did. This is the “hold all else equal” analysis, or at least as close as it comes.

I recently took a look at the citation reports of some folks that I think, in a very general sense, have been kicking my ass on the metrics. Particularly salient to me is the rate of publications flying out with their names on them, since I see them pass by TOC and PubMed topic alerts. And in many cases the graph of pubs per year on ISI’s web of knowledge confirms that impression. But the number of *citations* per year seems to feature a lot less variance than I would think.

Hmm I says to myself.

Then I look at the h-indices and find even less variance than I would have thought.

So now I’m back to trying to grasp what this measure really means. In an intuitive sense, I mean; I grasp the definition**.

If someone has a significantly larger number of papers, this should result in a higher h-index, right? I mean just going by the odds of what is going to lead to greater or fewer numbers of citations. If there is a longer length of time of publication, ditto, as they accumulate. And I grasp the notion that different subfields of science are going to be more or less active, citation wise. But when you start making comparisons between individual scientists who have approximately the same length of publishing history in approximately the same subfields, you should be able to use this h-index more accurately. It should say something meaningfully different. And I’m not seeing that right now.

Unless you argue that, regardless of numbers of published articles that might be anywhere from 1.5- to 3-fold higher, the scientists in this grouping only manage to pump out about the same number of papers that draw a “top-X” amount of citation interest within the field?
__
*and I’m increasingly using this as a tool to track through literature that cites other specific papers in subfields. I’m getting more and more impressed with this as a time saver for my rapidly atrophying brain.

**The h-index as defined by the creator Hirsch: A scientist has index h if h of [his/her] Np papers have at least h citations each, and the other (Np − h) papers have at most h citations each.
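For concreteness, Hirsch’s definition can be sketched in a few lines of Python (a minimal illustration, not any tool ISI/WoS actually exposes): sort the citation counts in descending order, and h is the deepest rank at which the count still meets or exceeds the rank.

```python
def h_index(citations):
    """Return the largest h such that h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # this paper still "supports" an h of this size
        else:
            break  # sorted descending, so no later paper can either
    return h

# Hypothetical citation records: very different totals, similar h.
print(h_index([10, 8, 5, 4, 3]))  # → 4 (30 total citations)
print(h_index([25, 8, 5, 3, 3]))  # → 3 (44 total citations)
```

Note how the second record has more total citations but a lower h, which is exactly why scientists with visibly different citation totals can land on nearly identical h-indices.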

26 Responses to “I find the h-index to be very strange today”

  1. Charles Says:

It’s a little bit hard to understand what you’re actually saying. You’re saying that these guys publish 3x as many papers as you do, but still have the same number of total citations as you? If so, why is that confusing? It just means that your papers get cited more, right?

    Now maybe that in itself is confusing if you think that on average their papers are about as good as yours. But that’s impossible for us to assess.

  2. Pinko Punko Says:

It may really only be best at identifying people publishing a lot of low-impact papers. If people have similar impact by the measure, they are going to have a similar h: a few more papers but a few fewer cites, or vice versa. It is a reasonable statistic, but I think only lesser impact really shows up.

Although with these useful tools (WoS, Scopus), I bet the “they cited us, we’ll cite them” effect will go up, because those papers are easier to find in the sea of lit we’re all trying to keep up with.

  3. Drug Monkey Says:

    Not quite Charles. More that the “best” X papers of different scientists draw about the same. But not even that b/c it doesn’t account for the skew in the first X/4 or X/10 papers.

    More like “these scientists produce the same number of papers of at least Y amount of interest”…


  4. If someone has a significantly larger number of papers, this should result in a higher h-index, right?

    Nope. The number of papers obviously puts a cap on the h-index, but if you are mostly just publishing shitte in shitte journals that no one ever reads–let alone cites–then it isn’t gonna lift your h-index.

  5. drugmonkey Says:

    We are assuming here like-to-like comparisons of similar ~publishing venue, PP. Also the null, that there are no systematic differences in how interesting the scientists’ work is, on average.

    Perhaps I am seeing a rejection of the null..

  6. Pinko Punko Says:

    Similar pub venue and number of papers would result in similar index because the venue filter would be a mitigating factor.

  7. drugmonkey Says:

    Maybe Pinko. There is a big skew of citations within journal though.

  8. Pinko Punko Says:

That is true, but within the same area for a journal I bet citations are similar. Like people that get Cell papers on p53 will have a similar cite rate, as will people that publish planar cell polarity stuff in Cell, but the two rates likely won’t be the same.

    I think it is a reasonable metric to identify consistent productivity.

  9. Arlenna Says:

    I’m feeling very smug now, because I went and checked my citation metrics and the flagship paper from my grad work has been cited a bunch, including as a key example of a subtype of something in a Nature Reviews article. Woot me.

  10. Neuro-conservative Says:

    This seems straightforward to me: your colleagues are probably pumping out quantity over quality.

    An additional, partial explanation is that h-index is a lagging indicator, as it takes time for citing papers to work their way into print. Thus, their productivity over the last year or two is not going to be adequately reflected in their h-index.

At a broader level, it is also possible that you are defining your subfield so narrowly that you are restricting the range. Contrary to your implicit assumptions, I think the h-index is useful for comparing individuals across subdisciplines to assess the overall impact of a body of work.

  11. drugmonkey Says:

    If fields differ in size I suppose that is a difference in “impact” but it sure isn’t meaningful to me, N-c.

  12. Neuro-conservative Says:

    But why do you think they differ in size?

  13. (another) former academic Says:

    alternate hypothesis: the h-index doesn’t measure what it purports to measure.

Look, citations per paper are analogous to species abundances, which ecologists have been collecting for over a century. We’ve been plotting that data as rank-abundance graphs for just about as long. And yet no one summarizes a rank-abundance distribution with anything remotely similar to an h-index.

    Wonder why?

  14. Pinko Punko Says:

Looked at my pater’s h-index: it is higher than my grad advisor’s, even though my grad advisor has many more papers with over 100 cites. But the tail of his papers declines in such a way that the overall h is 30-something, whereas Pater Punko’s is 40-something, because his tail is fatter. I would say that both numbers are reasonable, but both work in high-citing fields, so it relates more to the size of the field than the true impact.

  15. FunkDoctorX Says:

I’ve always looked at the h-index relative to when someone started publishing. I’ve always made the sort of base assumption that a reasonably productive scientist should have an h-index equal to the number of years since they received their PhD/started publishing. While I realize this is going to vary from field to field, I consider it a good jumping-off point when evaluating a PI as a potential adviser.

But then, as you say, there are those folks who have a ton of middle-author publications to which they presumably contribute very little. So one has to question how much of an impact that individual actually had. I believe I’ve read somewhere that ISI or some other organization is considering using a weighting factor based on author order to give a more accurate assessment of an author’s contribution to the literature. An interesting idea methinks.
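FunkDoctorX’s rule of thumb (h roughly equal to years of publishing) is essentially Hirsch’s own m parameter from the paper that introduced the index: h divided by years since first publication, with m near 1 offered as the benchmark for a “successful” scientist. A minimal sketch, assuming you already have a list of per-paper citation counts in hand:

```python
def m_quotient(citations, years_since_first_pub):
    """Hirsch's m parameter: h-index divided by years of publishing.

    m near 1 was Hirsch's benchmark for a 'successful scientist';
    the denominator is the same career normalization FunkDoctorX
    describes (years since PhD / first publication).
    """
    ranked = sorted(citations, reverse=True)
    # With counts sorted descending, cites >= rank holds for a prefix,
    # so h is just the number of ranks that satisfy it.
    h = sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)
    return h / years_since_first_pub

# Hypothetical 10-year record with h = 10 hits the "h per year" benchmark:
print(m_quotient([30, 25, 20, 18, 15, 14, 12, 11, 10, 10, 3, 1], 10))  # → 1.0
```

The per-year normalization corrects for the “longevity” component DM complains about above, though it still inherits all of the field-size effects discussed in this thread.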

  16. Neuro-conservative Says:

    relates more to the size of the field than the true impact

    But again, this begs the question: why is a given field larger than another?

  17. drugmonkey Says:

    Because it is easy to be a sheeple bench jockey messing about with internally, circularly motivated crappola that doesn’t actually have meaning N-c?

  18. Neuro-conservative Says:

An easy straw man to thump on, DM, but you don’t actually believe that mindless fad-ism is the primary driver of field size. If you did, you would be crusading against the NIH for mis-spending the large majority of its budget and utterly failing its mission. Whimple might believe that, but you certainly do not.

  19. BEM Says:

    This link may be of interest to the discussion here:

    http://michaelnielsen.org/blog/why-the-h-index-is-virtually-no-use/

  20. DrugMonkey Says:

No, N-c, perhaps not fad-ism per se but perhaps more of a street light problem combined with the inevitable academic pressures. Pressures to parse and split and obsess, unfettered by external checks for relevance. You may think of -omics as similar to post-modern literary criticism…

  21. Nobody Says:

    You may think of -omics as similar to post-modern literary criticism…

    Drugmonkey FTW.


  22. You may think of -omics as similar to post-modern literary criticism.

    This is totally fucken stupid. Genomics, proteomics, transcriptomics, etc are absolutely essential aspects of modern biological inquiry.

  23. drugmonkey Says:

    Post modern criticism is absolutely essential to understanding literature and therefore the true human condition.


  24. Post-modern literary criticism is a load of fucken horseshitte.

  25. drugmonkey Says:

You say that because you disdain that which you are unable to grasp.


  26. HAHAHAHAH! Yeah, sure thing. Tell us all about your robust appreciation for post-modern literary criticism.
