A recent HuffPo piece on that rather flagrant bit of cover trolling from TIME magazine irritates me.

One still occasionally gets whinging from some corner or other about not being able to run Analysis of Variance (ANOVA) statistical procedures because the data didn't pass a test of normality, i.e., a test of whether they appear to fit a normal distribution.

Paper reviewers, trainees, colleagues… this can come from any corner. It betrays a grad-school-class level of understanding of what statistical analysis of data is supposed to do, but not a grasp of what it is doing for us at a fundamental level within the conduct of science.

Your stock response should be "the ANOVA is robust against violations of normality, move along".

I note that the company GraphPad, which makes the Prism statistical/curve fitting package beloved of behavioral pharmacologists, has a tidy FAQ answer.

The extract version:

A population has a distribution that may be Gaussian or not. A sample of data cannot be Gaussian or not Gaussian. That term can only apply to the entire population of values from which the data were sampled…In almost all cases, we can be sure that the data were not sampled from an ideal Gaussian distribution… an ideal Gaussian distribution includes some very low negative numbers and some superhigh positive values…When collecting data, there are constraints on the possible values…Other variables can…have physical or physiological limits that don’t allow super large values… plenty of simulations have shown that these tests work well even when the population is only approximately Gaussian…It is hard to define what “close enough” means, and the normality tests were not designed with this in mind.
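As a quick sanity check on that claim, one can simulate: draw every group from the same frankly non-Gaussian population and count how often ANOVA cries "significant". If the test were badly broken by non-normality, the false-positive rate would drift far from the nominal 5%. A minimal sketch (mine, not from the GraphPad FAQ; assumes numpy and scipy are available, and all parameters are arbitrary):

```python
# Simulate many null experiments with skewed (exponential) data and tally
# how often a one-way ANOVA produces p < 0.05 by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_sims, n_per_group, false_pos = 2000, 12, 0

for _ in range(n_sims):
    # Three groups drawn from the SAME skewed population, so any
    # "significant" result is a false positive.
    groups = [rng.exponential(scale=10.0, size=n_per_group) for _ in range(3)]
    _, p = stats.f_oneway(*groups)
    if p < 0.05:
        false_pos += 1

rate = false_pos / n_sims
print(f"False-positive rate with non-normal data: {rate:.3f}")
# Despite the skew, this should land in the neighborhood of 0.05.
```

This is the sense in which "robust" is meant: the Type I error rate stays close to nominal even when the population is nothing like Gaussian.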

I have a trainee running a study in which she is examining the effects of methamphetamine on Bunny Hopping using the established open field to hedgerow assay. The primary dependent variable is escape latency from stimulus onset to crossing the plane of the hedge.

She is examining the effects of a locomotor stimulant dose of methamphetamine, derived from her pilot dose-response study, versus vehicle in groups of Bunnies which have been trained for six weeks in our BunnyConditioning Model and in age-matched sedentary Bunnies. (The conditioning training consists of various sprint, long run, horizontal hop and vertical leap modules.)

So we have four groups of Bunnies as follows:
1. Conditioned, Vehicle
2. Conditioned, Meth
3. Sedentary, Vehicle
4. Sedentary, Meth

The trainee is actually a collaborating trainee, so these data involve the analytic input of multiple PIs in addition to the trainee's opinion. We are having a slight disagreement over the proper analysis technique so I thought I would turn to the brilliant DM readers.
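For what it's worth, that 2×2 layout (conditioning × drug) is exactly what a two-way factorial ANOVA is built for: a main effect of conditioning, a main effect of drug, and their interaction. A sketch of the balanced-design sums-of-squares decomposition, with entirely made-up escape latencies (assumes numpy and scipy; the numbers and effect sizes are illustrative, not our data):

```python
# Two-way ANOVA by hand for a balanced 2x2 design: conditioning x drug.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 10  # Bunnies per cell (made up)

# Hypothetical escape latencies in seconds.
cells = {
    ("conditioned", "vehicle"): rng.normal(8.0, 2.0, n),
    ("conditioned", "meth"):    rng.normal(5.0, 2.0, n),
    ("sedentary",   "vehicle"): rng.normal(12.0, 2.0, n),
    ("sedentary",   "meth"):    rng.normal(11.0, 2.0, n),
}

y = np.concatenate(list(cells.values()))
grand = y.mean()
cell_means = {k: v.mean() for k, v in cells.items()}
# Marginal means (valid because the design is balanced).
a_means = {a: np.mean([cell_means[(a, b)] for b in ("vehicle", "meth")])
           for a in ("conditioned", "sedentary")}
b_means = {b: np.mean([cell_means[(a, b)] for a in ("conditioned", "sedentary")])
           for b in ("vehicle", "meth")}

ss_a = 2 * n * sum((m - grand) ** 2 for m in a_means.values())
ss_b = 2 * n * sum((m - grand) ** 2 for m in b_means.values())
ss_ab = n * sum((cell_means[(a, b)] - a_means[a] - b_means[b] + grand) ** 2
                for a in ("conditioned", "sedentary")
                for b in ("vehicle", "meth"))
ss_err = sum(((v - v.mean()) ** 2).sum() for v in cells.values())

df_err = 4 * (n - 1)
for name, ss in [("conditioning", ss_a), ("drug", ss_b), ("interaction", ss_ab)]:
    f = (ss / 1) / (ss_err / df_err)
    p = stats.f.sf(f, 1, df_err)
    print(f"{name}: F(1,{df_err}) = {f:.2f}, p = {p:.4f}")
```

The alternative on the table is presumably a one-way ANOVA across the four groups; the factorial version buys you the interaction term, which is usually the scientifically interesting question here.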

I stumbled back onto something I’ve been meaning to get to. It touches on both the ethical use of animals in research, the oversight process for animal research and the way we think about scientific inference.


Now, as has been discussed here and there in the animal use discussions, one of the central tenets of the review process is that scientists attempt to reduce the number of animals wherever possible. Meaning without compromising the scientific outcome, the minimum number of subjects required should be used. No more.
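In practice, "the minimum number required without compromising the scientific outcome" gets operationalized as a power calculation. A rough sketch of the standard normal-approximation formula for a two-group comparison (my illustration, not anything mandated by the oversight process; real protocols should use a proper power tool, which will give slightly larger numbers):

```python
# Approximate per-group sample size for a two-sample comparison,
# via the normal-approximation power formula.
import math
from scipy import stats

def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """n per group for a two-sided two-sample test (normal approximation)."""
    z_a = stats.norm.ppf(1 - alpha / 2)  # critical value for alpha
    z_b = stats.norm.ppf(power)          # quantile for desired power
    return math.ceil(2 * (z_a + z_b) ** 2 / effect_size_d ** 2)

for d in (0.5, 0.8, 1.2):
    print(f"Cohen's d = {d}: about {n_per_group(d)} subjects per group")
```

The point for the reduction principle: the number falls out of the expected effect size, not out of habit or budget, and halving the effect size roughly quadruples the animals required.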


…run more subjects…

We accept as more or less bedrock that a result is real if it meets the appropriate statistical test to the standard of p < 0.05. Meaning that if you sampled the set of numbers you have sampled 100 times from the same underlying population, fewer than five times would you get the result you did by chance. From which you conclude it is likely that the populations are in fact different.


There is an unfortunate tendency in science, however, to believe that if your statistical test returns p < 0.01 then this result is better. Somehow more significant, more reliable or more… real. On the part of the experimenter, on the part of his supervising lab head, on the part of paper reviewers and on the part of readers. Particularly the journal club variety.
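One way to puncture that belief is to simulate the same experiment many times with a fixed, genuinely real effect and watch the observed p-value bounce around. A single p < 0.01 is one noisy draw, not a graded certificate of how real the effect is. A sketch (assumes numpy and scipy; the effect size and group sizes are made up):

```python
# Repeat an identical two-group experiment 1000 times with a TRUE effect
# (Cohen's d = 0.8) and look at the distribution of p-values obtained.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
ps = np.array([
    stats.ttest_ind(rng.normal(0.0, 1.0, 15),
                    rng.normal(0.8, 1.0, 15)).pvalue
    for _ in range(1000)
])

frac05 = (ps < 0.05).mean()
frac01 = (ps < 0.01).mean()
print(f"p < .05 in {frac05:.0%} of replicates, p < .01 in {frac01:.0%}")
print(f"observed p ranged from {ps.min():.2g} to {ps.max():.2g}")
```

Same true effect every time, yet the replicates scatter from tiny p-values to flatly nonsignificant ones. Which draw you happened to get says very little about how "real" the underlying effect is.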



So you’ve just completed your last assays on physioprofitin signaling in the Namnezian complex. Lo and behold it is qaz-mediated, just like you suspected and the beccans are off the freaking chart. woot! PiT/PlS ratios are within relevant physiological ranges and still this work of art, your labor of love, came through with the experimental goods.
With a hope and a prayer you run your stats….and YES! p < 0.01!!!!!!!!!!!!!!!
What is the correct way to report your big Result?

The statistical analysis ____________ qaz-mediated upregulation of physioprofitin in the Namnezian complex.

Andrew Gelman, professor of statistics at Columbia University, has joined the Scienceblogs as author of Applied Statistics. One look at his professional page and you will see why I am so happy to have him here:

Andrew has done research on a wide range of topics, including: …[snip]… methods in surveys, experimental design, statistical inference,

I think I am going to enjoy having this guy around. If you want to know what you are in for, visit his old blog for a taste.

In case you have been living under a rock (yet inexplicably reading this blog), Usain Bolt has now run both the 100 meter and 200 meter track events faster than anyone ever. The margin of improvement in the 100 meter event (which came first) was sufficient to set the sports world abuzz. Naturally, sports fans are willing to talk endlessly about the most absurd minutiae and implications of such an event, in terms both pedestrian and embarrassingly overwrought.
YHN is no different.
