UPDATE SEPTEMBER 11, 2014:
(This post was my initial take. For a better and more up-to-date report, go here. For follow-up, go here.)
The paper includes a bunch of psychology studies and public surveys as evidence of scientific endorsement of anthropogenic climate change, even a survey of cooking stove use. This vacates the paper. Lots of things vacate the paper, such as its inherently invalid and untrustworthy method, and the false claim that they used independent raters. But I think we will have broad agreement that the inclusion of psychology papers and public surveys vacates the paper. The world thought this was about climate science. This is ridiculous, but trivially predictable when you have human raters who have an obvious bias with respect to the subject of their ratings, who desire a specific outcome for the study, and who are empowered to deliver that outcome via their ratings. What happened here is exactly why we could never accept a "study" based on such a method.
The following papers were rated as endorsement and included in their 97% figure. Dana Nuccitelli even wanted to include a psychology paper about white males and denial as evidence of scientific endorsement. It's jaw-dropping that someone who is supposed to inform the public on science would want to do that to the world, to generate a consensus figure based on studies that have no bearing on the consensus. There will be more such papers for those willing to invest time in this scam paper, and I haven't listed all that I found yet. I'll write this story up in a separate post when I have time, and for a news magazine. The broader ethics disaster here is going into a Nature submission:
Chowdhury, M. S. H., Koike, M., Akther, S., & Miah, D. (2011). Biomass fuel use, burning technique and reasons for the denial of improved cooking stoves by Forest User Groups of Rema-Kalenga Wildlife Sanctuary, Bangladesh. International Journal of Sustainable Development & World Ecology, 18(1), 88–97. (This is a survey of the public's stove choices in Bangladesh, and discusses their value as status symbols, defects in the improved stoves, the relative popularity of cow dung, wood, and leaves as fuel, etc. They mention climate somewhere in the abstract, or perhaps the word denial in the title sealed their fate.)
Boykoff, M. T. (2008). Lost in translation? United States television news coverage of anthropogenic climate change, 1995–2004. Climatic Change, 86(1-2), 1–11.
De Best-Waldhober, M., Daamen, D., & Faaij, A. (2009). Informed and uninformed public opinions on CO2 capture and storage technologies in the Netherlands. International Journal of Greenhouse Gas Control, 3(3), 322–332.
Tokushige, K., Akimoto, K., & Tomoda, T. (2007). Public perceptions on the acceptance of geological storage of carbon dioxide and information influencing the acceptance. International Journal of Greenhouse Gas Control, 1(1), 101–112.
Egmond, C., Jonkers, R., & Kok, G. (2006). A strategy and protocol to increase diffusion of energy related innovations into the mainstream of housing associations. Energy Policy, 34(18), 4042–4049.
Gruber, E., & Brand, M. (1991). Promoting energy conservation in small and medium-sized companies. Energy Policy, 19(3), 279–287.
Şentürk, İ., Erdem, C., Şimşek, T., & Kılınç, N. (2011). Determinants of vehicle fuel-type preference in developing countries: a case of Turkey. (This was a web survey of the general public in Turkey.)
Grasso, V., Baronti, S., Guarnieri, F., Magno, R., Vaccari, F. P., & Zabini, F. (2011). Climate is changing, can we? A scientific exhibition in schools to understand climate change and raise awareness on sustainability good practices. International Journal of Global Warming, 3(1), 129–141. (This paper is literally about going to schools in Italy and showing an exhibition.)
Palmgren, C. R., Morgan, M. G., Bruine de Bruin, W., & Keith, D. W. (2004). Initial public perceptions of deep geological and oceanic disposal of carbon dioxide. Environmental Science & Technology, 38(24), 6441–6450. (Two surveys of the general public.)
Semenza, J. C., Ploubidis, G. B., & George, L. A. (2011). Climate change and climate variability: personal motivation for adaptation and mitigation. Environmental Health, 10(1), 46. (This was a phone survey of the general public.)
Héguy, L., Garneau, M., Goldberg, M. S., Raphoz, M., Guay, F., & Valois, M.-F. (2008). Associations between grass and weed pollen and emergency department visits for asthma among children in Montreal. Environmental Research, 106(2), 203–211. (They mention in passing that there are some future climate scenarios predicting an increase in pollen, but their paper has nothing to do with that. It's just medical researchers talking about asthma and ER visits in Montreal, in the present. They don't link their findings to past or present climate change at all (in their abstract), and they never mention human-caused climate change – not that it would matter if they did.)
Lewis, S. (1994). An opinion on the global impact of meat consumption. The American Journal of Clinical Nutrition, 59(5), 1099S–1102S. (Just what it sounds like.)
De Boer, I. J. (2003). Environmental impact assessment of conventional and organic milk production. Livestock Production Science, 80(1), 69–77.
Acker, R. H., & Kammen, D. M. (1996). The quiet (energy) revolution: analysing the dissemination of photovoltaic power systems in Kenya. Energy Policy, 24(1), 81–111. (This is about the "dissemination" of physical objects, presumably PV power systems in Kenya. To illustrate the issue here, if I went out and analyzed the adoption of PV power systems in Arizona, or of LED lighting in Lillehammer, my report would not be scientific evidence of anthropogenic climate change, or admissible into a meaningful climate consensus. Concretize it: Imagine a Mexican walking around counting solar panels, obtaining sales data, typing in MS Word, and e-mailing the result to Energy Policy. What just happened? Nothing relevant to a climate consensus.)
Vandenplas, P. E. (1998). Reflections on the past and future of fusion and plasma physics research. Plasma Physics and Controlled Fusion, 40(8A), A77. (This is a pitch for public funding of the ITER tokamak reactor, and compares it to the old INTOR. For example, we learn that the major radius of INTOR was 5.2 m, while ITER is 8.12 m. I've never liked the funding conflict-of-interest argument against the AGW consensus, but this is an obvious case. The abstract closes with "It is our deep moral obligation to convince the public at large of the enormous promise and urgency of controlled thermonuclear fusion as a safe, environmentally friendly and inexhaustible energy source." I love the ITER, but this paper has nothing to do with climate science.)
Gökçek, M., Erdem, H. H., & Bayülken, A. (2007). A techno-economical evaluation for installation of suitable wind energy plants in Western Marmara, Turkey. Energy, Exploration & Exploitation, 25(6), 407–427. (This is a set of cost estimates for windmill installations in Turkey.)
Gampe, F. (2004). Space technologies for the building sector. Esa Bulletin, 118, 40–46. (This is a magazine article – a magazine published by the European Space Agency. Given that the ESA calls it a magazine, it's unlikely to be peer-reviewed, and it's not a climate paper of any kind – after making the obligatory comments about climate change, it proceeds to its actual topic, which is applying space vehicle technology to building design.)
Ha-Duong, M. (2008). Hierarchical fusion of expert opinions in the Transferable Belief Model, application to climate sensitivity. International Journal of Approximate Reasoning, 49(3), 555–574. (The TBM is a theory of evidence and in some sense a social science theory – JDM applied to situations where the stipulated outcomes are not exhaustive, and thus where the probability of the empty set is not zero. This paper uses a dataset (Morgan & Keith, 1995) that consists of interviews with 16 scientists in 1995, and applies TBM to that data. On the one hand, it's a consensus paper (though dated and small-sampled), and would therefore not count. A consensus paper can't include other consensus papers – that would be circular. On the other hand, it purports to estimate the plausible range of climate sensitivity, using the TBM, which could make it a substantive climate science paper. This is ultimately moot given everything else that happened here, but I'd exclude it from a valid study, given that it's not primary evidence, and the age of the source data. (I'm not sure if Ha-Duong is talking about TCS or ECS, but I think it's ECS.))
Douglas, J. (1995). Global climate research: Informing the decision process. EPRI Journal. (This is an industry newsletter essay – the Electric Power Research Institute. It has no abstract, which would make it impossible for the Cook crew to rate it. It also pervasively highlights downward revisions of warming and sea level rise estimates, touts Nordhaus' work, and stresses the uncertainties – everything you'd expect from an industry group. For example: "A nagging problem for policy-makers as they consider the potential costs and impacts of climate change is that the predictions of change made by various models often do not agree." In any case, this isn't a climate paper, or peer-reviewed, and it has no abstract. They counted it as Implicit Endorsement – Mitigation. (They didn't have the author listed in their post-publication database, so you won't find it with an author search.))
Original post below:
Ignore them completely – that's your safest bet right now. Most of these studies use political activists as the raters, activists who desired a specific outcome for the studies (to report the highest consensus figure possible), and who sometimes collaborated with each other in their rating decisions. All of this makes these studies completely invalid and untrustworthy (and, by customary scientific standards, completely unpublishable). I had no idea this was happening. This is garbage, and a crisis. It needs to stop, and those papers need to be retracted immediately, especially Cook, et al (2013), as we now have evidence of explicit bias and corruption on the part of the raters. (If that evidence emerged during the actual coding period, it would be fraud.)
PAUSE BUTTON: This issue has nothing to do with the reality of the consensus, a reality that was evident before this political operation/study unfolded. I am not a "denier", or even a skeptic. I don't know enough, or have an argument, that would lead me to be even a "lukewarmer". There are 7 billion people on this earth, and we're not all sorted into good people and deniers. I'm quite confident that there's a consensus – a large majority – of climate scientists who endorse both that the earth has warmed over the last 60+ years, and that human activity caused most of it. The warming itself is a descriptive fact, not a theory or inference. I'd be quite surprised, amazed, if the basic theory of anthropogenic forcing as a principal cause turned out to be false, and somewhat surprised if AGW turns out to be mild, like 1° C. (Unfortunately, there is little research on scientists' views on the likely severity of future warming. A consensus only that humans have caused warming, a consensus so vague and broad, is not very useful. The Cook study would be unhelpful even if it were valid, which it is not.)
Back to the program...
In social science, it's not uncommon to use trained human raters to subjectively rate or score some variable — it can be children's behavior on a playground, interviews of all kinds, and often written material, like participants' accounts of a past emotional experience. And we have a number of analytical and statistical tools that go with such rating studies. But we would never use human raters who have an obvious bias with respect to the subject of their ratings, who desire a specific outcome for the study, and who would be able to deliver that outcome via their ratings. That's completely nuts. It's so egregious that I don't think it even occurs to us as something to look out for. It never happens. At least I've never heard of it happening. There would be no point in running such a study, since it would be dismissed out of hand and lead to serious questions about your ethics.
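To make "statistical tools" concrete, here is a minimal sketch of one standard tool for rating studies: Cohen's kappa, a chance-corrected measure of agreement between two raters. The function name and the toy ratings are hypothetical illustrations, not data from any study discussed here. Note what kappa does not do: it measures agreement only, so two raters who share the same bias can agree almost perfectly. A high kappa cannot rescue a study whose raters all want the same outcome.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: two raters labelling six abstracts.
rater_1 = ["endorse", "endorse", "neutral", "reject", "neutral", "endorse"]
rater_2 = ["endorse", "neutral", "neutral", "reject", "neutral", "endorse"]
kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

In this toy case the raters agree on five of six abstracts, and kappa comes out somewhat lower than the raw agreement rate because some of that agreement would be expected by chance alone.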
But it's happening in climate science. Sort of. These junk studies are being published in climate science journals, which are probably not well-equipped to evaluate what are ultimately social science studies (in method). And I assume the journals weren't aware that these studies used political activists as raters.
Examples of the unbelievable bias and transparent motives of the raters in Cook, et al (2013) are below. These are excerpts from an online forum where the raters apparently collaborated with each other in their ratings. It's worse than that – the first example is evidence of fraud if this was during the operational rating period. If it was during training, it's practice for fraud.
"BTW, this was the only time I "cheated" by looking at the whole paper. I was mystified by the ambiguity of the abstract, with the author wanting his skeptical cake and eating it too. I thought, "that smells like Lindzen" and had to peek."
Let's look at how the paper described their method: "Abstracts were randomly distributed via a web-based system to raters with only the title and abstract visible. All other information such as author names and affiliations, journal and publishing date were hidden."
Hence the fraud issue. Next example:
"Man, I think you guys are being way too conservative. Papers that talk about other GHGs causing warming are saying that human GHG emissions cause global warming. How is that not an implicit endorsement? If CFC emissions cause warming because they're GHGs, then CO2 emissions cause global warming for the same reason. That's an implicit endorsement."
One wonders if a passing bird counts as implicit evidence of the consensus. This is what we call a nonfalsifiable hypothesis.
If this was the live coding period, this is a joke. A sad, ridiculous joke. And it's exactly what you'd expect from raters who are political activists on the subject they're rating. Who in their right minds would use political climate activists as raters for a serious report on the consensus? This is so nuts that I still have a hard time believing it actually happened, that the famous 97% paper was just a bunch of activists rating abstracts. I've called on the journal – Environmental Research Letters – to retract this paper. I'm deeply, deeply confused how this happened. If this is what we're doing, we should just call it a day and go home – we can't trust journals and science organizations on this topic if they're going to pull stunts like this.
Moreover, the raters weren't generally scientists, much less climate scientists. One of the raters is a former bike messenger who founded Timbuk2, a company that makes great bags (Rob Honeycutt). I've got mad props for him for what he's done with Timbuk2 – for anyone who starts their own business and follows their vision. That's very hard to do. But I'm not going to want luggage entrepreneurs to be rating climate studies or interpreting science for the world. I'll buy you a beer any day of the week, Rob, but I just can't sign off on this.
Other raters are just bloggers. I don't mean scientists who blog. I just mean bloggers, who are not scientists. Nothing against bloggers – I'm just not feeling that, don't need bloggers to be rating climate science abstracts. Another rater is only identified by an anonymous username – logicman. Who can argue with logicman? Is there a big L on his uniform? Where's emotionman been lately? What's fallacygirl up to? Anyway, probably no one needs to be subjectively rating climate abstracts, but if anyone did, it would have to be climate scientists. Is this controversial in some cultures?
More importantly, I don't care who you are – even if you're a staunch liberal, deeply concerned about the environment and the risks of future warming, this isn't something you should tolerate. If we're going to have a civilization, if we're going to have science, some things need to be non-political, some basic rules need to apply to everyone. I hope we can all agree that we can't seriously estimate the AGW consensus by having political activists rate climate paper abstracts. It doesn't matter whether the activists come from the Heritage Foundation or the Sierra Club, Timbuk2 or Eagle Creek – people with a vested ideological interest in the outcome simply can't be raters.
Also note that anyone who wants to defend this nonsense, who wants to argue that it's fine for political activists to subjectively rate science abstracts – which they won't be qualified to even understand – on issues central to their political activism, needs to also accept the same method when executed by partisans on the other side. If Heartland gathers a bunch of activists to read abstracts and tell us what they mean, all the Cook defenders need to soberly include the Heartland study. The AAAS needs to include the Heartland study in their reports, including it in their average (they didn't do an average, just cherry-picked junk studies.) If a squad of Mormons reads the abstracts of a bunch of studies on the effects of gay marriage, and sends their ratings to a journal, Cook defenders should be cool with that, and should count it as knowledge about the consensus on gay marriage.
Of course, these scenarios would suck. This method perverts the burden – it allows any group of hacks to present their subjective "data", putting the burden on us, on everyone else, to do a bunch of work to validate their ratings. We should never be interested in studies based on activists reading and rating abstracts – it's a road we don't want to travel. Researchers normally get their data by observation – they don't create it, not normally.
We don't need random people to interpret climate science for us, to infer the meaning of abstracts, to tell us what scientists think. That's an awful method – extremely vulnerable to bias, noise, incompetence, and poor execution. The abstracts for many papers won't even have the information such studies are looking for, and are simply not written at the level of abstraction of "this study provides support for human-caused warming", or "this study rejects human-caused warming". Most climate science papers are written at a more granular and technical level, are appropriately scientifically modest, and are not meant to be political chess pieces.
(Updated paragraph: I had incorrectly suggested that they asked authors to self-rate their abstracts, just as Cook's raters did, when in fact they asked them to rate their papers. The failure to hold that variable constant complicates things, but admittedly it would be very difficult for authors to strictly rate an abstract, as opposed to the whole paper they wrote. None of this matters anymore given the much larger issues that have emerged.) There's a much better method for finding out what scientists think – ask them. Surveying scientists directly, with more useful and specific questions, is a much more valid method than having ragtag teams of unqualified political activists divine the meanings of thousands of abstracts. Interestingly, but not surprisingly, direct survey studies tend to report smaller consensus figures than the abstract rating studies (I'll have more on that later.) The consensus will be strong regardless, so it's especially confusing why people feel the need to inflate it.
In the second part of their study, Cook et al surveyed authors of the papers in their dataset – that's not at all the way to survey climate scientists, since their paper search seems to have bizarre and unexplained results, e.g. it excluded everything Richard Lindzen published after 1997. Their pool of authors is invalid if we don't know whether the search had some selection biases. It's an arbitrary pool – they'd need to validate that search and its results before we could trust it, and they should've done that at the outset. And the fact that they included psychologists, social scientists, pollsters, engineers and other non-climate scientists, or even non-natural scientists, in the 97% (as endorsement) makes their survey of authors moot.
(For subjective ratings of abstracts to be a valid and useful method, it would need to be a carefully selected pool of raters, without ideological agendas, implementing a very specific and innovative method, under strict procedures of independence. I can imagine philosophy of science questions that might be answerable by such methods, based on things like the usage of certain kinds of words, the way hypotheses are framed and results reported, etc. – but much of that could be done by computers. The studies that have been published are nothing like this, and have no hope of being valid.)
NOTE: The Cook, et al data was leaked or hacked a few months ago – I'm confused by what's going on here. Cook allegedly wouldn't release some of his data, and ultimately a bunch of data was hacked or scraped off a server, and it included the raters' online discussion forum. Climate science features far too many stories of people refusing to release their data, and mysteriously hacked data. Brandon Shollenberger has posted the data online. It's amazing that if it weren't for him, we wouldn't know how sketchy the study truly was. There's much more to report – the issues raised by the leaked dataset extend far beyond the quotes above and rater bias.
The University of Queensland has apparently threatened to sue Shollenberger, on some sort of "intellectual property" grounds. Australia is one of my favorite countries, but we need to stand up for him. To the best of my knowledge, he hasn't done anything wrong – he hasn't posted any sort of sensitive information or anything that would violate our core principles of scientific ethics. The identities of the raters were not confidential to begin with, so there was no new disclosure there. He's exposed the cartoonish bias and corruption of the rating process that underlay this "study", and in so doing, he's served the interests of scientific ethics, not violated them.
Even if those online discussions took place during the training period, it would still be alarming evidence of bias, but other evidence suggests this was not a training period. I've never heard anyone call scientific data "intellectual property" before – that's an interesting legal theory, since this is not about an invention or an original creative work. If scientists were to get into the habit of treating data as IP, or otherwise proprietary, it would impair scientific progress and quality control – it would also violate the basic premise behind peer review. Shollenberger's disclosures took place in a context where the authors apparently refused to release all of their data, so I'm not sure what other options there were for him. In other words, he's a whistleblower. You can contact the research governance people at the University of Queensland here (scroll to the bottom of that page).
Update: In their legal threat letter, the University of Queensland says that the letter itself is intellectual property, and that publication of the letter is cause for separate legal action. What? That's like an NSL. Is this new? What kind of upside-down bizarro world is this? You can send someone a threat letter, copyright the letter, and force them not to disclose it? This is unbelievably creepy.
Update 2: Political activism is not a vice. I'm not saying it's a vice. If you think the left, or right, or libertarian, or Rastafarian perspective is true, do your thing. People have the right to be left-wing activists, conservative activists, environmental activists, wherever their minds and their values have taken them. I'm a pro-immigration activist sometimes. But I will never be a subjective rater of textual material in a study whose outcome would potentially serve my pro-immigration cause, especially if my ratings could deliver that outcome, nor will I ever assemble a team of pro-immigration activists to perform such ratings. Are we being serious right now? This is ridiculous. We can't do that. Do we want to call what we do science? This shouldn't be hard.
José L. Duarte
Social Psychology, Scientific Validity, and Research Methods.