Marquette's decision to strip Professor John McAdams of tenure is a disgrace and may influence my long-term plans with respect to the intellectually craven environment that is American academia.
A key pivot point for this story is that a student in a philosophy class was concerned about the tolerance for dissent in that class on the issue of gay marriage. He spoke with the instructor, a graduate TA. She ultimately told him that "homophobic" views would not be tolerated, and added the obligatory pre-emptive censure of "racist" and "sexist" views. (Professor McAdams criticized the teacher's handling of this episode in a blog post, and also criticized prior hyper-PC behavior by the Assistant Dean and the Philosophy Department Chair, which is what I believe actually did him in.)
The student had not expressed homophobic views and seemed articulate. His only specific comment seemed to be the idea that there is empirical evidence of bad outcomes for children of gay couples.
I was struck by the teacher's robotic injunctions against the recurrent and seemingly empty phoneme strings: racism, sexism, and homophobia.
We can't have any of that stuff, cause it's bad m'kay?
I'd like to make this clear. I want to hear racist, sexist, and homophobic ideas. I think you should want to hear them too. I mean it, and I mean it at two levels of analysis.
Level 1: I most definitely want to hear ideas that academic leftists label as racist, sexist, and homophobic.
I want this because experience tells us that the ideologies and theories academic leftists apply in tagging discourse with those labels will be unsatisfying to many scholars on counts of rigor, merit, and framing assumptions. Similarly, we know that the vast majority of the earth's population will not consider those discourses and ideas to be racist, sexist, or homophobic, which might suggest fertile ground.
So we most definitely need to hear, discuss, test, and debate those ideas given that their prohibition by the current ideological spasms of American academia should be of no interest to us. We should brush away this unscholarly screaming without even slowing down.
Level 2 breaks into near isomorphic variants. For my purposes here I'm basically aggregating the three variants.
Level 2a: I want to hear ideas that I would consider racist, sexist, or homophobic.
Level 2b: I want to hear ideas that most people who are not American academics would consider racist, sexist, or homophobic.
Level 2c: I want to hear ideas that are in fact racist, sexist, or homophobic.
First, I have no reason to suppose – to decide in advance – that nothing can be learned, no insights gained, by hearing and perhaps engaging or interrogating robustly racist, sexist, and homophobic ideas. I'm surprised people would just assume that nothing can be gained. This seems unlikely to me, and I would place the burden on those people to prove that nothing can be gained. The null here, or the default intellectual/scholar mode, is the mode of engagement, exposure, listening, considering, weighing, arguing, etc.
Second, I have no reason to suppose – to decide in advance – that there will be no merit or wisdom in canons of racist, sexist, and homophobic thought. That would be strange to me, seems unlikely given what we know about the scope and texture of human discourse, what we know about history, about human psychology, about the marginalization of ideas and peoples, and so forth. I wouldn't be shocked to find a diamond in the rough, a speck of gold in the prospector's pan. I imagine these wisps or chunks of truth or wisdom will be orthogonal to the broader racist, sexist, or homophobic systems to which they belong, but I don't want to pin too much on that assumption. I want to go in clean.
Third, I freely embrace a notion that would perhaps not be controversial in brighter eras. I think it's possible that I'm wrong. I think it's possible that I'm wrong about lots of things. I think it's possible that I'm wrong in the beliefs in which I'm most certain. It surprises me that today's scholars don't seem to imagine a universe where they are substantially wrong.
This also means you might be wrong, any of us might be wrong. I might be wrong, in some sense, about racism, or some subset of it, or some other dimension of it that I can't foresee. What would that look like? Well, I'm never going to be a racist in the sense of malevolence or hate toward humans because of their race. It's not realistic that I could change that much. It's like imagining I was raised by a different set of parents – I wouldn't be me anymore.
There are levels of analysis with respect to racism, sexism, and homophobia that are philosophical. That's where philosophers go to work. Then there are levels of analysis that are empirical, things we can measure, things we can go find out and come back with the answer. That's where scientists go to work. There may be things racists, sexists, and homophobes believe that are simply true as an empirical matter.
For example, there might be stable innate differences in IQ between various racial-ethnic groups. Here I tend to bring the same cautious attitude I invoke above – I think it's nuts for anyone to be settling on the answer to that issue right now, especially racists. It's way too early. Give it another fifty years at least. There are a host of complex cultural and environmental issues that may be in play, including ones we don't know about. There are many known unknowns and unknown unknowns here, and I really don't like jumping the gun. There's no reason to assume that any question we have can be readily answered by some dude in a lab coat in the era in which we live. That's just not going to work out.
So while I don't think this can be settled in the near future, I am open to the idea that there are innate differences and that we will know this in 2060 or 2090. These differences may even be unfavorable to my group, Mexicans, Native Americans, whatever we'd call the brown or the genetic substrate in my case. I do not assume that these sorts of differences matter, or will matter, or that people have to care about them, or that it is rational or ethical to employ heuristics based on them in day-to-day life. Also, I don't assume there will be such differences. I have no idea. Reality is a very complicated place. They might be faint. Who knows, but what to do with that kind of reality is a philosophical question. We also have to remember variance and how that works, get people trained up to not focus on mean differences.
It's the same with the outcomes for children of gay couples. I doubt there's much of an effect there, nor do I assume anyone has to care about such an effect, but I wouldn't want to shut down that conversation. The student in this case was wrong in saying such children "do a lot worse in life." I've not seen large effects, not the last time I checked. The teacher was wildly incompetent in responding to the student's arguments, shifting the issue to single people having kids or adopting or something. That was so lazy and invalid. It's also not a fruitful path because we know that the number of children born out of wedlock has exploded, and we know that there are indeed very real consequences for those children. We know a lot about that.
Social Science Qualifier: From experience, I know people don't come in with the assumptions about probabilistic truths that social scientists take for granted. When I say "those children", I am speaking of statistical effects that rest on mean differences and differences in variance detected by inferential statistical methods. Call it averages. I am not saying anything about you. I'm not saying anything about your family, your background, your parent(s), or your friends. Any given single-parent household might be the best, most loving environment possible in our civilization. Any given single-parent household might send kid after kid to Harvard. There is plenty of room for variance. It's the aggregate reality that is at issue here, the net effect. You can assume that it's a bigger problem for lower income contexts, rather than Murphy Brown affluent professional single mother situations. There do appear to be father-specific benefits and all sorts of interactions, but this is always about aggregates, not any one context.
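To make the point about averages concrete, here is a minimal sketch in Python. It assumes two hypothetical groups whose scores are normally distributed with equal variance, and a mean difference d measured in standard deviations. The function name and the values of d are illustrative assumptions, not numbers from any study. It computes how often a randomly chosen member of the lower-mean group outscores a randomly chosen member of the higher-mean group.

```python
from math import sqrt
from statistics import NormalDist


def prob_lower_mean_group_wins(d):
    """Chance that a random member of the lower-mean group outscores a random
    member of the higher-mean group.

    Assumes (for illustration only) equal-variance normal distributions whose
    means differ by d standard deviations. The difference of two such draws is
    normal with mean -d and standard deviation sqrt(2).
    """
    return NormalDist().cdf(-d / sqrt(2))


# Hypothetical standardized mean differences, from none to moderate.
for d in (0.0, 0.2, 0.5):
    print(f"d = {d}: {prob_lower_mean_group_wins(d):.3f}")
```

Under these assumptions, even a nonzero mean difference leaves a great deal of overlap between the groups, which is why a statement about "those children" says nothing about any particular child.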
What we do with statistics about children of gay couples or children born out of wedlock is a whole different journey than the empirical journey we just shared. This is philosopher-hat business. You get to decide whether or how to use these empirical findings, how to situate them in a broader context or a political platform. Let many flowers bloom. We need many voices. I for one am not going to stop supporting gay marriage if it turns out that children of gay parents have 4-point lower average SAT scores than children of straight Catholics. See what I mean? An effect doesn't imply a political position. I'm not a utilitarian, certainly not a knee-jerk rationalistic drowning in data utilitarian. (I do think we have a real problem with children born out of wedlock. The conservatives were simply right in their intuitions about how that would impact children and family life. You should give them props on that one.)
Back to Marquette, the bastards. I'm worried about scholarship in our time. Do people have no sense of history? Do they have no sense of the grand sweep and our place in it? Do they have no sense of the enormous range and complexity of fruitful inquiry and scholarship? Are they really that small that their ideology reduces to empty phoneme strings and incantations about racism and all the other baddies? Do they not understand that some or all of their ideological tenets can be disputed by capable, rational, and benevolent thinkers? Do they really not consider the possibility that they may have gotten something wrong? What the hell are these people doing in a university?
Do they not realize that modern American academia is a culture, and that their culture is going to yield a different set of experiences and insights than people who come from other cultures, or even twenty miles west? Why would their culture be better than all the others, with a complete package of The Truth over the Iowa farmers, the Montana ranchers, the Brooklyn ballers, the Phoenix suburbanites, or the church choir?
I hate cowardice, and these are cowards. Cowardice will have some implications for the quality of scholarship and how much longer we can keep the lights on. What bastards. Give Professor McAdams his damn chair back.
I was struck by this quote from a Forbes piece on the secondhand smoke research.
"there’s no such thing as borderline statistical significance. It’s either significant or it’s not."
It's attributed to a journalist named Christopher Snowdon (I don't know who that is.)
It's false, and I think it's important for us to convey a clearer message to the public about what statistical significance is.
tl;dr: It's a business decision, and by the way, how many fingers do you have? (thumbs included...)
Significance is not a binary or discrete property of a scientific finding. Our convention in social science, and I think in lots of biomedical fields, is the .05 threshold. I'll return to this.
The statistical significance of an effect is the likelihood of drawing a random sample with characteristics at least as extreme as the measured characteristics of our sample if the null hypothesis is true. Note that this is not the same thing as saying the likelihood of our research hypothesis being true is 1 – p, or 95% or greater given our standard .05 threshold. Significance is often mis-explained as the inverse likelihood of our hypothesis being true. That's not what it means. And there are other assumptions, particularly regarding normal distributions, that will impact the meaning of all of this.
By measured characteristics, I mean a sample that looks like our sample in the study. So if we conduct a longitudinal study with a large sample of women and track them on variables like lung cancer and passive smoking, we end up with X% having lung cancer, Y% having lived in a home with a smoker, Z% who have lung cancer and lived with a smoker, and variance on other variables like length of time a person has lived with a smoker, age, race, lifestyle, etc.
The null hypothesis is that passive smoking does not cause lung cancer in nonsmokers – that there is no relationship between these variables (that would be one of several hypotheses in the actual study referenced by Forbes, because they also tracked smokers.)
So significance here means the probability of drawing a random sample from the population with a difference between groups at least as large as the one we see in our sample, assuming there is in fact no link between passive smoking and lung cancer rates (that the null hypothesis is true.)
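One way to see what that probability means is to simulate it. The sketch below is a toy illustration, not the method used in the study Forbes referenced. It assumes a hypothetical world where the null hypothesis is true and both groups share the same lung cancer rate, draws many random samples from that world, and counts how often the gap between groups is at least as large as an observed gap. The function name, the base rate, and the sample sizes are all made up for the example.

```python
import random


def simulated_p_value(observed_diff, base_rate, n_per_group, n_sims=500, seed=0):
    """Share of samples drawn under the null whose group difference is at least
    as large (in absolute value) as observed_diff.

    Every parameter here is a hypothetical stand-in: base_rate is the common
    disease rate in both groups, n_per_group the number of women per group.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(n_sims):
        # Under the null, both groups get sick at the same underlying rate.
        exposed = sum(rng.random() < base_rate for _ in range(n_per_group))
        unexposed = sum(rng.random() < base_rate for _ in range(n_per_group))
        if abs(exposed - unexposed) / n_per_group >= observed_diff:
            hits += 1
    return hits / n_sims


# A small observed gap (half a percentage point) versus a large one (two points).
print(simulated_p_value(0.005, base_rate=0.02, n_per_group=1000))
print(simulated_p_value(0.02, base_rate=0.02, n_per_group=1000))
```

The small gap turns up often in the null world, and the large gap turns up rarely. That frequency is what the p-value approximates.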
We can see a few things here. Given typical sample sizes, if 2.00% of women who lived with a smoker get lung cancer and 2.00% of women who never lived with a smoker get lung cancer, there won't be a significant effect. The core reason is that this is exactly what we'd expect to see if the null hypothesis is true. If there's no actual link between these variables in the population, it's likely that we'd draw a random sample that looked like ours – a sample with no differences between the groups. This likelihood goes up as the sample size goes up. In this kind of scenario, your p-value might be something like 0.70 or 0.80. The particular value doesn't matter – what matters is that it's well above our threshold of 0.05 (and more importantly, that there is no difference between groups, no effect.)
If we do see differences between groups in our sample, the p-value will be lower, because if the null hypothesis is true, we wouldn't expect to see such differences.
How low that p-value goes will depend on the size of the difference between these groups (the effect size) and the sample size. As the effect size goes up, the p-value goes down because it becomes less and less likely that we'd see such differences in a random sample if the null hypothesis is true.
If the sample size goes up and the effect size stays the same, the p-value goes down, because a larger sample size reduces the likelihood of random sampling error. It's like flipping a coin over and over. If you flip it only three times, you could easily get three straight tails, but as you keep flipping you'll get closer and closer to an even split of heads and tails.
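Here is a rough sketch of how effect size and sample size each move the p-value. It uses a standard two-proportion z-test with a normal approximation, which is my choice for illustration and not necessarily the analysis used in the study. The counts are hypothetical, built around the 2% lung cancer rate from the example above.

```python
from math import erfc, sqrt


def two_proportion_p_value(cases_a, n_a, cases_b, n_b):
    """Two-sided p-value for a difference between two proportions,
    using the pooled z-test (normal approximation)."""
    rate_a = cases_a / n_a
    rate_b = cases_b / n_b
    pooled = (cases_a + cases_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so nothing to detect
    z = (rate_a - rate_b) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided tail area.
    return erfc(abs(z) / sqrt(2))


# Hypothetical counts: (cases among exposed, n exposed, cases among unexposed, n unexposed)
scenarios = {
    "2.1% vs 2.0%, n = 5,000 per group": (105, 5000, 100, 5000),
    "2.5% vs 2.0%, n = 5,000 per group": (125, 5000, 100, 5000),
    "3.0% vs 2.0%, n = 5,000 per group": (150, 5000, 100, 5000),
    "2.5% vs 2.0%, n = 50,000 per group": (1250, 50000, 1000, 50000),
}
for label, counts in scenarios.items():
    print(f"{label}: p = {two_proportion_p_value(*counts):.4g}")
```

Reading down the output, the p-value falls as the gap between groups widens at a fixed sample size, and it falls again when the same gap is observed in a sample ten times larger.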
As I said, our threshold is 0.05, meaning a 5% or less chance that we would draw a sample like ours if the null hypothesis were true.
Why .05? At the margins, it's arbitrary. What's the point of a threshold? The point is to have some standard that reduces Type 1 error – detecting an effect that is not real. At the same time, we want to be able to talk about effects and to report findings that are likely to be real.
Scientists could have settled on .10 or .04 or any of a number of values. Like I said, at the margins it's arbitrary. The specific choice of .05 is partly due to the fact that you probably have ten fingers. You might remember I asked you to count them. A lot of our choices of thresholds and rules of thumb are driven by the fact that we use a base-ten number system. Five is half of ten, and so we tend to settle on values that are multiples of five or ten. There was almost no chance we would've chosen .04 or .06. Those numbers don't satisfy us the way fives and tens do. (If humans had eight fingers instead of ten, we might very well have chosen .04.)
As you can infer from above, significance is a continuum. We could have a p = 0.08 situation and that effect could easily be a true effect. In fact, an effect with p = 0.30 could be a true effect. But especially in that case, when we're getting up to a 30% chance of drawing a sample like ours, we don't want to report that as significant. Whereas, effects with 0.06 or 0.09 p-values are often reported, and should be. We report them as something like "this was significant at p = 0.06" or "this was marginally significant at p = 0.09". Note that we're still using the word significant. We can use that word given any p-value, as long as we include the p-value.
That's why Snowdon is wrong. The choice of 0.05 is a business decision that achieves a good tradeoff in our levels of Type 1 error vs Type 2 error (failing to detect a true effect.) But there's nothing natural or inherently meaningful about p = 0.05. It's not a value derived from nature, like Planck's constant. It wasn't discovered. There's no "significance" in nature. Like I said, it's a business decision.
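The tradeoff can be sketched numerically. The example below assumes a simple two-sided z-test and a hypothetical true effect that sits 2.5 standard errors from zero. Both the test and that number are assumptions chosen for illustration. For each candidate threshold it reports the Type 1 error rate (which is just the threshold itself) and the resulting Type 2 error rate.

```python
from statistics import NormalDist


def type2_error_rate(alpha, effect_in_standard_errors):
    """Chance of missing a true effect with a two-sided z-test at threshold alpha.

    effect_in_standard_errors is a hypothetical true effect expressed in units
    of the standard error of the estimate.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Power: probability the test statistic lands beyond either critical value.
    power = (1 - std_normal.cdf(z_crit - effect_in_standard_errors)) + std_normal.cdf(
        -z_crit - effect_in_standard_errors
    )
    return 1 - power


for alpha in (0.01, 0.05, 0.10):
    miss = type2_error_rate(alpha, effect_in_standard_errors=2.5)
    print(f"threshold {alpha:.2f}: Type 1 rate = {alpha:.2f}, Type 2 rate = {miss:.3f}")
```

A stricter threshold buys fewer false alarms at the cost of more missed effects, and a looser one does the reverse. Nothing in the arithmetic singles out .05.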
I wanted to clarify something based on a comment by a supporter on the discrimination post.
Here's my recent Comment on Verheggen et al. in Environmental Science & Technology.
I was denied admission to a PhD program because of my perceived political views: reflections of a sellout; how diversity would strengthen social science (Updated)
Some of my mentors advised me to take down the post from July 22, 2014 about discrimination because it would hurt me in the job market – that hiring committees would discriminate against me (again) for my perceived political views, or would discriminate against me for documenting past discrimination.
José L. Duarte
Social Psychology, Scientific Validity, and Research Methods.