How data can be overrated

8/18/2014

This is something I wrote some time ago, and have quickly updated. I'll have a methods paper soon that digs into these issues in much greater detail.

Data is overrated. This is a terrible, terrible thing for a scientist to say. Well, except that it’s true, so it’s a good thing for a scientist to say.

I don’t question, remotely, the central empirical premise of social science or any other science. Social science requires the rigorous collection of data, and always will. This is good and fine, and often thrilling and wonderful. I mean something different.

I think there's a problematic epistemology, particularly in social psychology, that doesn’t distinguish between the kinds of claims that require data and those that don’t. There’s a broad lack of awareness of the epistemic authority of logic – that we can know some very important things through valid reasoning. This has consequences for the validity of research, and what happens after invalid research is published. We've somehow combined rationalism and empiricism, two sides of a classic dichotomy in philosophy. We're rationalistic in our empiricism. Let me explain...

Category 1

Example 1

In 2011, I pointed out that Napier and Jost (2008) is invalid. In that paper, they wanted to explain why conservatives are happier than liberals, and asserted that it was because conservatives “rationalize inequality.” Strangely, they never measured rationalization of inequality. I don’t mean that they used a measure that I don’t like – I mean they never measured it at all (and they didn’t collect any data; they just used public datasets like the National Election Studies and relabeled the variables). In one study, they took answers to the question of whether hard work tends to pay off in the long run – just that one item – and called it “rationalization of inequality”. In another study, they took a 6-item measure of simple attitudes toward inequality, and called it, again, “rationalization of inequality”. They converted an attitudes measure into a measure of rationalization of those same attitudes – I’ve read a lot of papers, but I’ve never seen a social psychologist do that.

The studies are invalid. They didn’t measure the construct. Rationalization is a cognitive process, one that social psychologists are quite familiar with, dating to classic research on cognitive dissonance. We can’t just assert that people are rationalizing something. We can’t ask people whether they agree with our political views, and call it rationalization if they don’t.

I argued that the paper was completely invalid, that it carries no knowledge pertinent to its claims or hypotheses. (The authors or journal should retract it.) I received lots of supportive e-mails from researchers all over the world, but I was surprised when a couple of them said that my debunking would be even more convincing “with data.” Hold that thought…

Example 2

I recently wrote about the Lewandowsky, Oberauer, and Gignac (2013) scam (after lots of other people did). In their title, they linked moon-landing hoaxism with climate science hoaxism, and even gave it a causal direction. There was no such relationship. One reason is that virtually no one in the study endorsed the moon-landing hoax – only 10 out of 1145 participants, the lowest level of endorsement anyone has ever reported, to my knowledge. We can’t link moon hoaxism to anything if only 10 of 1145 endorse it (about 0.9%). Even worse, 7 of those 10 rejected the climate hoax idea, so going with the causal direction in the title, if one believes the moon landing was a hoax, one is unlikely to believe that climate science is a hoax. And if we run a logistic regression – the proper analysis for wildly skewed and substantively dichotomous data, not a Pearson correlation – with moon hoaxism predicting climate hoaxism, there is no effect. However, I don't think any analysis is appropriate when only 10 people endorse the moon hoax – 0.9% is less than the sampling error for the survey, so it's essentially zero. In any case, if we lived in a universe where we could make inferences from 10 people in a sample of 1145, and the titles of our papers were based on 0.9% of our participants, the title here would have to be the opposite, e.g. NASA Faked the Moon Landing, Therefore (Climate) Science is True.
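The arithmetic in that paragraph is easy to check by hand, but here is a minimal Python sketch of it. The three counts are the ones quoted in the post; the variable names are only illustrative.

```python
# Counts quoted in the post; the variable names are illustrative.
total_participants = 1145
moon_hoax_endorsers = 10
endorsers_rejecting_climate_hoax = 7

# Share of the whole sample that endorsed the moon-landing hoax.
endorsement_rate = moon_hoax_endorsers / total_participants

# Following the causal direction in the title: among moon hoaxists,
# what fraction also endorsed the climate hoax idea?
endorsers_accepting_climate_hoax = (
    moon_hoax_endorsers - endorsers_rejecting_climate_hoax
)
p_climate_given_moon = endorsers_accepting_climate_hoax / moon_hoax_endorsers

print(f"Moon-hoax endorsement rate: {endorsement_rate:.2%}")       # 0.87%
print(f"P(climate hoax | moon hoax): {p_climate_given_moon:.1f}")  # 0.3
```

Nothing here needs new data: both numbers are read straight off the counts the paper itself reports.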

(Ethics Pause: Think about how you would relate to someone if you thought they believed the moon landing was a hoax. Or if they disputed that HIV causes AIDS? (11 people in their sample, from which they asserted other effects...) Would you hire them? Would you date them? What Lewandowsky, Oberauer, and Gignac did is a disgrace.)

They made similarly false claims in the abstract and body of the paper, linking people to beliefs that virtually no one in their sample endorsed – numbers far too trivial (11 or 16 people out of the 1145) to be linked to the mainstream views to which they tried to link them. It was a scam. They abused linear correlation statistics driven by variance between people choosing Strongly Agree vs Agree, and told the world that people “rejected” claims that they in fact agreed with.

Once more, a scientist who agreed with me also said that my debunking would be even more convincing “with data.”

What This Is

This is recurrent, and there are many examples of this mindset. It’s a profound epistemological and scientific error.

The examples above are cases where the research is invalid. In the Lewandowsky case, we might just say their claims are false, since they actually measured the variables they talked about, so we can look and see that the claims are false. (There we’d probably say that their analyses were invalid and some of their claims false and unconscionable.)

When people say that “data” would help make a case against these papers, it tells me that they don’t know what I mean when I say that a study is invalid, and I think I mean validity in the normal sense.

I think it comes to a couple of problems in how we as a field understand knowledge as such.

First, a fundamentally invalid study does not represent knowledge. It carries no findings. To say that a study is invalid normally means that the study's method is not a legitimate method of knowing about the things the study wanted to know about. Social psychologists are granting residual epistemic standing to published work, even after it's been shown to be invalid, even if they agree that it's invalid. This might be a crude social proof thing – people might think that if a paper is published it must have some legitimacy or validity. This belief will be false across all scientific fields, and I think it's a very vulnerable assumption to have about social science papers in the current era.

But there’s something else here. Say we have papers like the ones above. Then someone comes along and argues that they’re invalid, or even false by their own data. What sort of knowledge claims are these?

They're claims about the validity of research, or about the data behind claims the authors made. These kinds of claims do not require new data. Let’s linger on this...

Let’s say I measure sunshine and call it self-esteem.

You observe that I measured sunshine and called it self-esteem. You then say, hopefully very stridently, that my study is invalid, as are any claims I made.

What determines whether your argument is true? Whether I indeed measured sunshine and called it self-esteem. That’s what will determine the truth of it.

How would (new) data help your argument? Remember the claim is that I measured sunshine and called it self-esteem. What new data could you collect to further establish that I did so?

There is none. Either I measured sunshine and called it self-esteem, or I did not. If I did, my study is invalid. It doesn’t matter if I got it published in Psychological Science. It’s still invalid. It doesn’t matter if other people cited it, if it helped get me tenure, or if Chris Mooney touted my findings. None of that matters. My study is fundamentally invalid, which makes it meaningless. It does not represent knowledge about self-esteem, does not carry data about the construct. It’s vacated, as a matter of scientific knowledge, regardless of whether I have the integrity to retract it.

Is this clear? The nature of a validity argument is that it rests on what happened in the study – it’s about features of that study. It’s not about the empirical hypotheses. New data is not necessary, and often won’t even strengthen the argument. There are exceptions, but not the examples here. The burden is always and entirely on the researchers to support their claims – it's never on us to go out and collect data to refute an invalid study.

In the Lewandowsky case, what would we refute with data? The claims are false by their own data. That's the claim being made. When we say “10 of 1145 participants were moon hoaxists, and most were not climate hoaxists”, that’s a claim grounded in observations about their data. We don’t need any data of our own to observe their lack of data.

We’re granting an inappropriate privilege to papers and findings just because they’re published. I think some researchers haven’t fully wrapped their heads around the fact that a paper, or a set of studies, can simply be false, or invalid, or both. We need to have those categories, because they describe recurrent phenomena in any science. We don’t have a ready schema for "false paper" or "invalid paper", outside of traditional fraud cases. In the humanities, fields without data in the normal sense, the worst that can happen to a paper (absent fraud) is that it be forgotten, or superseded. There’s no “false” category. The only way to get retracted is plagiarism or fraud in one’s sources. In science, things can be false. Things can be invalid. Papers are retracted all the time, often by their own authors, for non-fraud reasons – for having made a mistake, being false or invalid. But not in research psychology. I never see people retract their papers when someone points out that they’re false or invalid. That’s not a sustainable situation. We can’t be like the humanities in that respect – we need to be like the sciences.

I’d like to zoom back a bit and address another branch of this issue of rationalistic empiricism, and the lack of awareness of the authority of valid reasoning. The two examples here will continue to serve us. In both cases, the research was doomed before it began.

Category 2

Sometimes social scientists act as though we can’t know anything unless it’s published in one of our journals, comes with a p-value, etc. Now, I don’t think anyone would literally argue this if pressed – it’s just that they implicitly embrace something like this view much of the time. In reality, there is lots of valid knowledge outside of social psychology – the world abounds with knowledge, and sometimes knowledge and data outside our fields will have logical implications for our hypotheses.

For the N&J study, someone might have observed that since conservatives don’t necessarily object to economic inequality (neither do libertarians), they probably can't rationalize it. To rationalize something, it has to be objectionable, and we can't assume that income inequality is objectionable, either in the mind of a participant or as a descriptive fact about the world (the researchers seemed to assume that it was obviously objectionable.)

The Oxford English Dictionary is paywalled, so here's the definition from Dictionary.com: to ascribe (one's acts, opinions, etc.) to causes that superficially seem reasonable and valid but that actually are unrelated to the true, possibly unconscious and often less creditable or agreeable causes.

We definitely can't assume conservatives are doing that on income inequality – that would be biased and question-begging. So it’s not clear that the construct “rationalization of inequality” has any valid application to conservatives or other non-liberals, just as the "rationalization of abortion" would not be valid applied to pro-choice liberals and libertarians. It's a biased, question-begging scam to define "rationalization" around the researcher's problem with a social reality. I’m sure you could induce defensiveness in conservatives by showing them pictures of homeless people or something, but you could induce defensiveness in any camp, on any issue. I’m not sure that it would be interesting or meaningful.

(Note that Napier and Jost's hypothesis on why conservatives are happier than liberals was refuted later by Schlenker, Chambers, and Le (2012). The Napier and Jost paper itself was not debunked – there was never anything to debunk, since they didn't measure their construct, and the paper therefore did not carry any knowledge about the hypothesis.)

Similarly, in the Lewandowsky case, we already knew, in advance, that climate skepticism isn’t going to be driven by beliefs like moon-landing hoaxism. We have survey data from professional pollsters. We know that very few people believe the moon landing was a hoax. And we know that a much larger number of people are skeptical of climate science. A recent survey showed that 7% of people believed the moon landing was a hoax, and 37% thought climate science was a hoax. At most, 19% of climate hoaxists are moon-landing hoaxists (7/37), and that is only if every moon-landing hoaxist is also a climate hoaxist. (The real figure will be lower, because the two groups won’t completely overlap.) If the vast majority of climate hoaxists/skeptics are not moon-landing hoaxists, which we already know from the polls, there’s no point trying to link them. Unless you’re trying to get at a specific cognitive process (which you won’t do with Lewandowsky’s crude surveys) or a subcategory of climate skeptics, there’s no point. It’s scientifically specious, and I think unethical, to draw links between a group and some far-out, damaging belief when the vast majority of the group simply does not hold that belief.
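The 7/37 bound is just a ceiling on how much a small group can overlap with a larger one. A minimal sketch in Python, using the two poll figures quoted in the paragraph; the function name `max_overlap_share` is a hypothetical helper, not something from the post or the poll.

```python
def max_overlap_share(p_small_group: float, p_large_group: float) -> float:
    """Upper bound on the share of the larger group that can also be in the smaller one.

    The overlap can never exceed the size of the smaller group, so the
    ceiling is min(p_small, p_large) / p_large.
    """
    if not (0 <= p_small_group <= 1 and 0 < p_large_group <= 1):
        raise ValueError("proportions must lie in [0, 1], with a nonzero large group")
    return min(p_small_group, p_large_group) / p_large_group


# Poll figures quoted in the post.
moon_hoax_share = 0.07
climate_hoax_share = 0.37

ceiling = max_overlap_share(moon_hoax_share, climate_hoax_share)
print(f"At most {ceiling:.0%} of climate hoaxists can be moon hoaxists")  # 19%
```

The bound is reached only if every moon hoaxist is also a climate hoaxist, so the true share can only be smaller.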

So in both cases here, reality constrains our research, our hypotheses, in ways that are not readily acknowledged in the field. We’re not granting logic and exogenous facts the authority that they truly have. Our research can be rendered meaningless, or deeply flawed, by facts available outside the field, facts that do not count as “data” by our normal reckoning, in concert with plain logic.

Category 3

The third category is how we deal with the validity and reliability of scales (surveys). I’ve pointed out to researchers that the Social Dominance Orientation (SDO) scale might be invalid, only to have them respond “But it’s very reliable.” Reliability has nothing to do with validity (well, it actually does, but not for our purposes here). But reliability comes as a number. As we’ve noted, people like numbers, data. Reliability can be expressed as a Cronbach's alpha, or better yet, McDonald’s omega. Validity doesn't come as a number, as “data” (at least not in simple form, in one statistic). Validity often requires logic, substantive reasoning about what things mean. The SDO is what I call a Caricature Scale. I’ll dig into this much more in an upcoming methods paper, but a Caricature Scale has specific features. Ultimately, they’re traps. They’re very reliable traps, in more ways than one, but they’re not generally valid, and are artifacts of a primitive early stage of social science. The SDO, for example, is written purely in the language of the American academic left (circa 1990). It’s groups this and groups that, full of caricatured and straw man versions of vaguely conservative positions. It traps conservatives into endorsing the midpoint, perhaps out of confusion, while liberals endorse the low end. Voila! Conservatism is correlated with “Social Dominance Orientation”, even though many or most of them don’t actually endorse the items.
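To make the reliability point concrete, here is a minimal sketch in Python. It computes Cronbach's alpha with the standard formula on made-up data: six simulated items that all track "sunshine", which we could then label as a self-esteem scale. The data, the seed, and the noise levels are assumptions chosen purely for illustration; nothing here comes from the SDO or any real scale.

```python
import random
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Simulated data (an assumption for illustration): every item is a noisy
# reading of how much sunshine each respondent got, not of self-esteem.
rng = random.Random(0)
responses = []
for _ in range(200):
    sunshine = rng.gauss(0, 1)
    responses.append([sunshine + rng.gauss(0, 0.5) for _ in range(6)])

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha for the 'self-esteem' scale: {alpha:.2f}")
```

The formula only sees how the items covary with each other. The label on the scale never enters the calculation, so a high alpha cannot tell us whether the items measure the construct we named.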

In summary, scientists need to be capable epistemologists. No data is inherently meaningful, and it’s quite easy for it to be meaningless. People are looking at r's and p's, without looking at the data and considering what it means. In our "data, data, data!" mindset, we're not paying enough attention to meaning, and ultimately, to reality. We need to understand that validity is decisively important, that invalid science is ultimately not science at all. We need to understand the epistemic authority and utility of logic and reasoning. We need to be more firmly seated in the sciences, as opposed to the humanities. We need to understand that many published papers do not represent or carry knowledge, at least not the knowledge claimed, and some papers are just false. We need to join our scientific cousins in other fields, and account for these realities by retracting such papers when we discover them. If we don’t, we risk accruing a large number of false claims and findings, which would strain our standing as a science.

* A positive "correlation" between the two variables in the Lewandowsky title has no meaning if only 10 people endorsed the moon hoax, and most of them rejected the climate hoax idea, especially given the explicit causal direction presented in the title, where we start with the moon hoax and look at the probability of endorsing the climate hoax. There we can just look at the probability, which is 0.3, making the title false. The valid analysis for these near-dichotomous items will normally be a logistic regression (see the more detailed posts below), which is not significant here (but not valid here because there is no data -- 10 people will be 0 for our purposes most of the time, but it's definitely 0 when we're talking about a wide open, sloppy online study with minors and fakes.) A stark 4-point scale of Disagreement and Agreement is not a continuous variable, and this will be elaborated in the literature soon. Agreeing with something is very different from disagreeing with something, and agreeing that the moon landing was a hoax is very, very different from disagreeing that the moon landing was a hoax. So if we only offer the four responses of Strongly Disagree, Disagree, Agree, and Strongly Agree, we cannot leverage variance between Strongly Disagree and Disagree (which is where 1135 out of 1145 participants were) to generate a linear correlation between the two items and talk as though agreement on the moon predicts agreement on the other -- there's no real agreement here. The correlation is actually between those who disagree with the one and disagree with the other. Call this Category 4 of where "data" can be overrated or misused. Correlation statistics sometimes have no meaning.
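A small Python sketch of how this can happen. The cell counts below are invented for illustration and are not the study's data; they are only constrained to match the three figures quoted in this post (1145 participants, 10 agreeing with the moon hoax item, 7 of those 10 rejecting the climate hoax item). Responses are coded 1 = Strongly Disagree through 4 = Strongly Agree, and the `pearson` helper is a hypothetical hand-rolled function.

```python
from math import sqrt

# (moon_response, climate_response) -> number of respondents.
# Invented counts for illustration only, NOT the study's actual data.
counts = {
    (1, 1): 700, (1, 2): 150, (1, 3): 10,
    (2, 1): 100, (2, 2): 150, (2, 3): 25,
    (3, 1): 4, (3, 2): 3, (3, 3): 2,
    (4, 4): 1,
}


def pearson(cells):
    """Pearson correlation computed from a table of (x, y) -> count."""
    n = sum(cells.values())
    mean_x = sum(x * c for (x, _), c in cells.items()) / n
    mean_y = sum(y * c for (_, y), c in cells.items()) / n
    cov = sum(c * (x - mean_x) * (y - mean_y) for (x, y), c in cells.items())
    var_x = sum(c * (x - mean_x) ** 2 for (x, _), c in cells.items())
    var_y = sum(c * (y - mean_y) ** 2 for (_, y), c in cells.items())
    return cov / sqrt(var_x * var_y)


r_all = pearson(counts)

# Keep only respondents who DISAGREED with the moon hoax item (1 or 2).
moon_disagreers = {k: c for k, c in counts.items() if k[0] <= 2}
r_disagreers = pearson(moon_disagreers)

moon_agreers = sum(c for (x, _), c in counts.items() if x >= 3)
agree_both = sum(c for (x, y), c in counts.items() if x >= 3 and y >= 3)

print(f"r, all respondents:         {r_all:.2f}")
print(f"r, moon-hoax rejecters only: {r_disagreers:.2f}")
print(f"P(climate hoax | moon hoax): {agree_both / moon_agreers:.1f}")
```

In this toy table the correlation stays positive even after every moon hoaxist is dropped, because it is carried by people choosing Strongly Disagree versus Disagree. The conditional probability among the 10 agreers tells a different story from the coefficient.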

2 Comments

Brad Keyes

9/17/2014 12:07:28 am

I'm no Lewandowsky apologist, but in 100% fairness I don't think you can blame him for the title. It would've taken a lot of red tape to change the title AFTER he saw the data. It was already written on his grant application, which is standard procedure in climate psychology, as you would know if you weren't commenting so far out of your depth!

The flippant sarcasm of that entire paragraph illustrates the kind of thing I *won't* write anymore in your comments, Joe, just to avoid misunderstandings—good idea?

Excellent article though, as per usual.

José L. Duarte
