It’s Uninterpretable!

August 6, 2020

No, it isn’t.

One of my favorite species of manuscript reviewer comment is the claim that the data we are presenting are “uninterpretable”. Favorite as in the sort that provokes the reaction where I can’t believe my colleagues in science are this unbelievably stupid and are not completely embarrassed to ever say such a thing.

“Uninterpretable” is supposed to be some sort of easy-out Stock Critique, I do understand that. But it reveals either flagrant hypocrisy (i.e., the reviewer themselves would fall afoul of such a criticism with frequency) or serious, serious misunderstanding of how to do science.

Dr. Zen is the latest to run afoul of my opinion on this. He posted a Tweet and then made the mistake of bringing up the U word.

(his followup blog post is here)

Now, generally when I am laughing at a reviewer comment, it is not that they are using “uninterpretable” to complain about graphical design (although this occasionally comes into the mix). They usually mean they don’t like the design of the experiment(s) in some way and want the experiment conducted in some other way. Or the data analyzed in some other way (including graphical design issues here) OR, most frequently, a whole bunch of additional experiments.

“If the authors don’t do this then the data they are presenting are uninterpretable” – Reviewer # 3. It’s always reviewer #3.

Let me address Zen’s comment first. It’s ridiculous. Of COURSE the graph he presented is interpretable. It’s just that we have a few unknowns and some trust. A whole lot of trust. And if we’ve lost that, science doesn’t work. It just doesn’t. So it’s ridiculous to talk about the case where we can’t trust that the authors aren’t flagrantly disregarding norms and lying to us with fake data. There’s just no point. Oh, and don’t forget that Zen construed this in the context of a slide presentation. There just isn’t time for minutiae and for proving beyond any doubt that the presenter/authors aren’t trying to mislead with fakery.

Scientific communication assumes some reasonable common ground, particularly within a subfield. This is okay. When there is cross talk between fields with really, really different practices, ok, maybe a little extra effort is needed.

But this is a graph using the box-and-whiskers plot. This is familiar to the audience and indeed Zen does not seem to take issue with it. He is complaining about the exact nature of the descriptive statistic conventions in this particular box-and-whiskers plot. He is claiming that if this is not specified that the data are “uninterpretable”. NONSENSE!

These plots feature an indicator of central tendency of a distribution of observations, and an indicator of variability in that distribution. Actually, most descriptive illustrations in science tackle this task. It’s familiar. This particular type of chart gives two indications of the variability: a big one and a small one. This is baseline knowledge about the chart type and, again, is not the subject of Zen’s apparent ire. The line is the central tendency. The box outlines the small indicator and the whiskers outline the big indicator. From this we move into interpretation that is based on expectations. Which are totally valid to deploy within a subfield.

So if I saw this chart, I’d assume it was most likely depicting the central tendency as either a median or a mean. Most likely the median, particularly if the little dot indicates the mean. The box therefore outlines the interquartile range, i.e., the 25%ile and 75%ile values. If the central tendency is the mean, then it is most likely that the box outlines plus or minus one standard error of the mean or one standard deviation. Then we come to the whiskers. I’d assume it was either the 95% Confidence Interval or the range of values.
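To make those default assumptions concrete, here is a minimal sketch (not from the post; the function name and dataset are mine) of the numbers a conventional median-based box-and-whiskers plot encodes: median line, interquartile-range box, and whiskers at the most extreme points within 1.5 × IQR of the box, which is the common Tukey convention.

```python
import numpy as np

def box_stats(values, whis=1.5):
    """Descriptive statistics behind a conventional box-and-whiskers plot.

    Box edges: 25th/75th percentiles. Line: median. Whiskers: the most
    extreme observations within whis * IQR of the box (Tukey convention).
    Illustrative helper, not a claim about any particular plotting tool.
    """
    v = np.sort(np.asarray(values, dtype=float))
    q1, med, q3 = np.percentile(v, [25, 50, 75])
    iqr = q3 - q1
    lo = v[v >= q1 - whis * iqr].min()  # lower whisker endpoint
    hi = v[v <= q3 + whis * iqr].max()  # upper whisker endpoint
    return {"median": med, "q1": q1, "q3": q3,
            "whisker_lo": lo, "whisker_hi": hi, "mean": v.mean()}

stats = box_stats(range(1, 10))  # the values 1 through 9
```

The point of the post stands either way: swap the whisker rule for the full range or a 95% CI and the picture barely changes for well-behaved data.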

I do NOT need to know which of these minor variants is involved to “interpret” the data. Because scientific interpretation functions along a spectrum of confidence in the interpretation. And if differences between distributions (aha, another ready assumption about this chart) cannot be approximated from the presentation then, well, it’s okay to delve deeper. To turn to the inferential statistics. As for whether the small indicator is SD or SEM? Meh, we can get a pretty fair idea. If it isn’t the SD or SEM around a mean, or the 25%ile/75%ile around a median, but something else like 3 SEM or 35%ile/65%ile? Well, someone is doing some weird stuff trying to mislead the audience, or is from an entirely disparate field. The latter should be clear.
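The SD-versus-SEM ambiguity is also bounded, since the two are tied together by sample size: SEM = SD / √n. A quick sketch (my own illustrative helper, not anything from the post) of both quantities for one sample:

```python
import math

def sd_and_sem(values):
    """Sample standard deviation and standard error of the mean.

    SEM = SD / sqrt(n), so SEM bars shrink as n grows while SD bars
    describe the spread of the data themselves. Illustrative only.
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return sd, sd / math.sqrt(n)

sd, sem = sd_and_sem([1.0, 2.0, 3.0, 4.0, 5.0])
```

With a typical n, the two bars differ by a modest, predictable factor, which is why a reader can still get a fair idea of the variability even when the figure legend is silent on which one was plotted.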

Now, of COURSE, different fields might have different practices and expectations. Maybe it is common to use 5 standard deviations as one of the indicators of variability. Maybe it is common to depict the mode as the indicator of central tendency. But again, the audience and the presenter are presumably operating in approximately the same space and any minor variations in what is being depicted do not render the chart completely uninterpretable!

This is not really any different when a manuscript is being reviewed and the reviewers cry “Uninterpretable!”. Any scientific paper can only say, in essence, “Under these conditions, this is what happened”. And as long as it is clear what was done and what the nature of the data is, the report can be interpreted. We may have more or fewer caveats. We may have a greater or smaller space of uncertainty. But we can most certainly interpret.

It sometimes gets even worse and more hilarious. I work in an area where we commonly present data in which the error bars are smaller than the (reasonably sized) symbols for some (but not all) of the groups. And we may have cases where the not-different (by inferential stats *and* by any rational eyeball and consideration of the data at hand) samples cannot be readily distinguished from each other (think: overlapping longitudinal or dose curves).

“You need to use color or something else so that we can see the overlapping details or else it is all uninterpretable!” – Reviewer 3.

My position is that if the eye cannot distinguish any differences this is the best depiction of the data. What is an error is presenting data in a way that gives some sort of artificial credence to a difference that is not actually there based on the stats, the effect size and a rational understanding of the data being collected.
