Beating the Stick With a Stick

The sordid history of the infamous Mann hockey-stick graph and Steve McIntyre's nearly decade-long efforts to expose its errors, on the Bishop Hill blog:
[O]ne of the chief criticisms of the hockey stick was the fact that its author, Michael Mann, had withheld the validation statistics so that it was impossible for anyone to gauge the reliability of the reconstruction. These validation statistics were to be key to the subsequent story. At the time of their press release, Wahl and Ammann had made public the computer code that they'd used in their papers. By the time their paper was submitted to Climatic Change, McIntyre had reconciled their work with his own so that he understood every difference. And he therefore now knew that Wahl and Ammann's work suffered from exactly the same problem as the hockey stick itself: the R² number was so low as to suggest that the hockey stick had no meaning at all, although another statistic, the reduction of error statistic (or RE), was relatively high. It was only this latter figure that had been mentioned in the paper. In other words, far from confirming the scientific integrity of the hockey stick, Wahl and Ammann's work confirmed McIntyre's criticisms of it! McIntyre's first action as a peer reviewer was therefore to request from Wahl and Ammann the verification statistics for their replication of the stick. Confirmation that the R² was close to zero would strike a serious blow at Wahl and Ammann's work.
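
How a reconstruction can post a high RE while its R² sits near zero is easier to see with numbers. The sketch below is only an illustration under stated assumptions: it uses the textbook definitions (RE compares the reconstruction's squared error with that of simply predicting the calibration-period mean, and R² is the squared Pearson correlation over the verification period), the function names are hypothetical, and the data are made up for the example. It is not the code used in any of the papers discussed.

```python
def reduction_of_error(observed, predicted, calibration_mean):
    """RE = 1 - SSE / SS_ref, with the calibration-period mean as the reference forecast."""
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_ref = sum((o - calibration_mean) ** 2 for o in observed)
    return 1.0 - sse / ss_ref


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical verification-period values (illustrative only, not real data).
# The verification period is colder on average than the calibration period.
calibration_mean = 0.0
observed = [-0.5, -0.3, -0.5, -0.3]
predicted = [-0.45, -0.45, -0.35, -0.35]

re_score = reduction_of_error(observed, predicted, calibration_mean)
r2_score = r_squared(observed, predicted)
print(f"RE = {re_score:.3f}, R^2 = {r2_score:.3f}")
```

In this toy case the prediction gets the shift in the mean right, so RE comes out above 0.9, but it does not track the year-to-year movements at all, so R² is zero up to rounding. That is the pattern the passage describes: one statistic looks healthy while the other says the series has no skill.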

Wahl and Ammann's response was to refuse any access to the verification numbers, a clear flouting of the journal's rules. As a justification for this extraordinary action, they claimed that they had shown that McIntyre's criticisms had been rebutted in their forthcoming GRL paper, despite the fact that the paper had been rejected by the journal some days earlier. . .

Then, just a few weeks ago, and entirely unannounced, Wahl and Ammann's Supplementary Information suddenly appeared on Caspar Ammann's website. . . Now, with the code in front of him, McIntyre could see exactly what Wahl and Ammann had done. . .

Wahl and Ammann came up with a value which they called a calibration/verification RE ratio. As the name suggests, this was the ratio of the two RE numbers for calibration and verification. This ratio is, however, entirely unknown to statistics, or to any other branch of science. But it was not plucked out of the air. The ratio and the threshold value which was set for it by Wahl and Ammann were carefully calculated. They argued that any run with a ratio less than 0.75 should be assigned a score of -9999. Since the hockey stick had a score of 0.813, 0.75 was pretty much the highest level you could go to without rejecting the hockey stick itself. However, if you set your ratio threshold too low, not enough runs would be rejected and the hockey stick would no longer be "99% significant". Some of the results of this ratio were entirely perverse - it was possible for a run that had scored a reasonably good RE in the calibration (there was a good correlation between it and the actual temperatures) to be thrown out of the final assessment on the grounds that it had done very well in the verification - the correlation with actual temperatures was considered too good!
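
The screening rule described above can be written down in a few lines, which makes the perverse case easy to check. This is a minimal sketch, assuming the ratio is calibration RE divided by verification RE (as the name suggests) and that a run which passes keeps its verification RE as its score; both are assumptions made for illustration. The function name and the sample RE values are hypothetical, and only the 0.75 threshold and the -9999 marker come from the passage.

```python
REJECTED = -9999  # marker score assigned to runs that fail the ratio test


def screen_run(re_calibration, re_verification, threshold=0.75):
    """Return the run's verification RE, or REJECTED if the ratio is below threshold.

    Assumes re_verification is non-zero and that the ratio is
    calibration RE / verification RE (an assumption, see lead-in).
    """
    ratio = re_calibration / re_verification
    if ratio < threshold:
        return REJECTED
    return re_verification


# Hypothetical runs (illustrative numbers only).
# Run A: decent calibration RE, even better verification RE -> ratio 0.6.
score_a = screen_run(re_calibration=0.30, re_verification=0.50)
# Run B: same calibration RE, weaker verification RE -> ratio about 0.86.
score_b = screen_run(re_calibration=0.30, re_verification=0.35)

print("Run A:", score_a)
print("Run B:", score_b)
```

Under these assumptions run A is thrown out and run B survives, even though A did better in verification with the same calibration RE. That is the oddity the passage points to: the rule can penalise a run for verifying too well.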

With this new, and pretty much entirely arbitrary, hurdle in place, Wahl and Ammann were able to reject several of the runs which stood between the hockey stick and what they saw as its rightful place as the gold standard for climate reconstructions. That the statistical foundations on which they had built this paleoclimate castle were a swamp of misrepresentation, deceit and malfeasance was, to Wahl and Ammann, an irrelevance. For political and public consumption, the hockey stick still lived, ready to guide political decision-making for years to come.

Another article about the "tree-ring bias" central to the hockey stick result is here. Simply put, the hockey stick data should not have been relied on by the IPCC.

