Saturday, August 18, 2007

Good statistics, bad statistics part I

The 2004 Lancet study by Roberts et al. on mortality before and after the US Coalition invasion of Iraq has been endlessly attacked by conservative bloggers, and last month Michelle Malkin posted a critique of the study written by David Kane of Harvard University. I've long wanted to explain why this study is so important, while doing both the study and its critique justice. Let me attempt that now.

The Lancet study

Roberts' study compared mortality in the 14.6 months before the March 2003 invasion to mortality in the 17.8 months after it. Official tracking of deaths in Iraq is considered unreliable because only a third of all deaths happen in hospitals. So the authors visited households in 33 clusters around Iraq in an effort to estimate mortality in the two periods January 1, 2002 - March 18, 2003 and March 19, 2003 - September 20, 2004.

Interviews took place September 8-20, 2004. The idea is that each cluster is representative of 1/33 of the country and can be used to extrapolate an estimate of the number of deaths before and after the invasion. (Given the lack of security in Iraq, a full-blown census like the one performed in the US is clearly impossible.) The study was designed to minimize risk to the interviewers while attempting to keep the clusters random.

Each household that agreed to be interviewed was asked for the age and sex of every current household member, and also to recall any births, deaths, or visitors who stayed for more than 2 months, going back to January 1, 2002. (As you sit here reading this, ask yourself: what can you recall of your own household five and a half years ago?) If a death was reported, the interviewers attempted to confirm it with a death certificate.

Many statistics were gathered and generated. The study noted, "More than a third of reported post-attack deaths (n=53), and two-thirds of violent deaths (n=52) happened in the Falluja cluster. This extreme statistical outlier has created a very broad confidence estimate around the mortality measure and is cause for concern about the precision of the overall finding." So the researchers themselves acknowledged that Falluja had an impact on the numbers that was hard to address.

In the end, the paper estimated that the risk of death had increased by a factor of 2.5 and, noting the statistical variability issues, gave a 95% confidence interval for this number of 1.6 to 4.2.

About confidence intervals

Before we go into the Kane study, let me explain what the heck a confidence interval is. For any given statistic, like this estimated 2.5-fold increase in the risk of death, we cannot be 100% sure of the number unless we talked to everyone in Iraq. Since we can't do that, we have to try to quantify how "good" an estimate it is.

We do this by figuring out how big an interval we need so we can feel pretty sure the real value lies inside it. What do we mean by "pretty sure"? Well, with the standard 95% confidence interval, what we're saying is that if we could repeat the data gathering and everything else we did many times, building a new interval each time, then about 95% of those intervals would contain the true value. In our case, a single run of the survey and analysis produced the estimate 2.5 and the interval 1.6 to 4.2, and a procedure like this one captures the true increase in risk about 95% of the time.
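If it helps to see that idea in code, here is a minimal sketch of what "repeat everything we did many times" means. The death rate and sample size are invented, and the simple proportion-based interval below is not the study's actual cluster-sampling calculation; the point is only that roughly 95% of intervals built this way end up containing the true value.

    # Toy simulation: how often does a 95% confidence interval contain the true value?
    # The rate and sample size are invented for illustration only.
    import math
    import random

    true_rate = 0.01           # hypothetical true probability of dying in the period
    sample_size = 5000         # hypothetical number of people surveyed each time
    trials = 1000
    contained = 0

    for _ in range(trials):
        deaths = sum(random.random() < true_rate for _ in range(sample_size))
        estimate = deaths / sample_size
        # Normal-approximation 95% interval for a proportion
        half_width = 1.96 * math.sqrt(estimate * (1 - estimate) / sample_size)
        if estimate - half_width <= true_rate <= estimate + half_width:
            contained += 1

    print(f"{contained} of {trials} intervals contained the true rate (expect about 950)")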

Now this number, 2.5, indicates how much the probability of dying had increased in Iraq: according to the study, the risk of dying had gone up by about 2.5 times. The larger this number, the more likely someone was to die after the invasion; the smaller, the less likely. If the number were 1, there would be no increase at all. Note that the lower end of the confidence interval found in the Lancet study is 1.6, so the authors were saying that, even allowing for the uncertainty in their estimate, the increase in the risk of death was unlikely to be much below 1.6.
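To make the ratio concrete, here is a trivial worked example with made-up death rates (not the paper's figures):

    # Made-up death rates (per 1,000 people per year), purely for illustration.
    rate_before = 5.0
    rate_after = 12.5

    relative_risk = rate_after / rate_before  # 2.5 -> the risk of dying went up 2.5 times
    print(relative_risk)

    # If nothing had changed, the two rates would be equal and the ratio would be 1.
    print(rate_before / rate_before)          # 1.0 -> no increase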

The Kane Analysis

Now we're ready to look at Kane's paper. Kane doesn't question the way the numbers were gathered or how well they reflect the changes in mortality in Iraq. He looks at the analysis: he compares this overall increase of 2.5 times against the estimates of the mortality rate before and after the invasion. Each of these rates has its own confidence interval, and he notes that the confidence interval for the post-invasion mortality rate is eight times wider than the pre-invasion one. That seems odd, considering the sample sizes for the two periods are almost exactly the same. The issue is Falluja: data from Falluja were included in the post-invasion mortality rate.
The Roberts paper noted:
There was one place, the city of Falluja that had just been devastated by shelling and bombing, and it was so far out of whack with all the others that it made our confidence intervals very, very wide.
Okay, it's good that they noted this. So what did the authors do? They did the calculations both with and without the Falluja data. With the Falluja data, you get the 2.5 number and the confidence interval of 1.6 - 4.2. But if Falluja is excluded, the number is instead 1.5, with a confidence interval of 1.1 to 2.3!
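Here is a toy illustration of why one extreme cluster matters so much. The per-cluster counts below are invented, not the study's data; the point is simply to average a handful of clusters with and without one huge outlier and watch what happens to the interval.

    # Toy illustration: one extreme cluster makes the confidence interval balloon.
    # The per-cluster counts are invented; they are not the study's data.
    import math
    import statistics

    def mean_with_ci(values):
        """Mean and a rough normal-approximation 95% confidence interval."""
        mean = statistics.mean(values)
        std_err = statistics.stdev(values) / math.sqrt(len(values))
        return mean, mean - 1.96 * std_err, mean + 1.96 * std_err

    # Hypothetical violent-death counts per cluster; the last one plays the role of Falluja.
    clusters = [1, 0, 2, 1, 0, 1, 2, 0, 1, 52]

    print("with the outlier:   ", mean_with_ci(clusters))
    print("without the outlier:", mean_with_ci(clusters[:-1]))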

However, statisticians don't like to exclude data; my statistics professor in college was very adamant about the value of keeping outliers and what they can teach you.

Statisticians use the phrase "statistically insignificant," which means the data don't conclusively show any change. Kane says that with Falluja included, the supposed increase in Iraqi mortality becomes statistically insignificant: including the Falluja data makes the confidence interval so wide that there is no way to conclude the mortality rate went up at all.
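In confidence-interval terms, "not statistically significant" just means the interval for the relative risk fails to sit entirely above 1, the no-change value. A small check along those lines (the second interval below is hypothetical, just to show the failing case):

    # A relative risk of 1 means "no change", so an increase is only demonstrated
    # when the whole interval, from ci_low up to ci_high, lies above 1.
    def shows_an_increase(ci_low, ci_high):
        return ci_low > 1.0

    print(shows_an_increase(1.1, 2.3))  # Falluja excluded (from the paper): True
    print(shows_an_increase(0.8, 4.5))  # a hypothetical interval straddling 1: False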

Kane also compares the mortality rates before and after the invasion and finds that the results aren't consistent. He recalculates the confidence interval for the post-invasion mortality rate with Falluja included; because he doesn't have the actual data, he makes an educated guess using other numbers presented in Roberts's paper. Using those numbers, Kane says you might even be able to conclude that the mortality rate went down after the invasion.

The main figure touted by the Lancet paper is that 100,000 more deaths have occurred in Iraq since the coalition arrived. Kane notes that this number was calculated using the estimated death rates with the Falluja data excluded. When he attempts to recalculate the estimate using statistics based on data that include Falluja, the figure changes to 264,000, but the confidence interval for this number is -130,000 to 659,000! Please note: that confidence interval includes 0, meaning NO additional deaths is a possibility.
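For a sense of how a death-rate difference turns into an excess-death figure, here is a back-of-the-envelope sketch. The population, period length, and rates below are rough, invented numbers of about the right size, not the values from either paper.

    # Back-of-the-envelope excess-death arithmetic with invented numbers.
    population = 24_000_000       # rough order of magnitude for Iraq's population
    years_after_invasion = 1.5    # rough length of the post-invasion period

    rate_before = 5.0 / 1000      # hypothetical deaths per person per year, pre-invasion
    rate_after = 7.8 / 1000       # hypothetical deaths per person per year, post-invasion

    excess_deaths = (rate_after - rate_before) * population * years_after_invasion
    print(f"excess deaths: about {excess_deaths:,.0f}")   # roughly 100,000 with these inputs

    # Kane's recalculated interval of -130,000 to 659,000 includes 0,
    # which is why "no additional deaths" cannot be ruled out.
    ci_low, ci_high = -130_000, 659_000
    print("zero is inside the interval:", ci_low <= 0 <= ci_high)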

Kane then goes on to argue, using some statistical reasoning, that there is a definite possibility the mortality rate after the invasion actually went down.

Other issues

It is rarely a good sign for a researcher to refuse to share his data; the key to good science is reproducibility. Yet the authors of the Lancet study (as of Kane's paper) had not shared their data. That isn't one of the seven signs of bogus science, but I think it should be.

My take on all this

I think the Roberts et al. paper tried to do something extremely difficult: figure out mortality rates in a country that, at the time, did not have a firm grasp on the value of keeping track of these kinds of data. Any numbers they reported from Iraq were going to be difficult to support. I have some serious questions about the methodology they used to get their numbers, but I have no expertise in cluster sampling and therefore don't feel qualified to press my criticisms. I think Kane's paper does a credible job of pointing out the inconsistencies in the analysis in the Lancet paper, and he is to be commended for his careful work.

It isn't clear to me why Roberts et al. won't share the data they collected. That acts as a big red warning flag for me, because it doesn't allow for independent verification of the analysis. That's a shame.

But I think the bigger issue is how science gets reported in the mainstream media. The only figure I heard reported was "100,000 more died! 100,000 more deaths caused by the coalition forces!" There was no mention of the caveats or the problems noted in the study itself, and since Kane's report was released, I'm not aware of any media outlet discussing the problems in the Lancet study.

This problem of the media misreporting science and failing to report follow-up studies has been documented over and over again on the Junk Food Science blog; Sandy Szwarc has done a fantastic job of showing just how terribly studies are reported in the media.

Let this be a lesson to all of us: if the media go ga-ga about a new scientific study, be very wary. If you can't read the study yourself, find a friend who can, and ask them to read the study and tell you what it means.

A silence too long

Alas, I have not written, not because I have nothing to write about, but because I have too much. I feel a great pressure to spend significant time on each subject to give it the attention it deserves, but I have not had the strength or the solitude that allows for it.

I hope to correct that this week.