AI Can’t Tell if You’re Gay… But it Can Tell if You’re a Walking Stereotype.
[First published September 9th. This post has been updated with subsequent information, denoted by square brackets, cited on this resource of responses to the story by tweeters, bloggers, media, and Kosinski himself. My reflection on peer review, ethics, and LGBTQ big data is here. This formed the basis of a peer-reviewed piece, freely available here.]
There are ample reasons to be skeptical of recent headlines announcing that “AI [Artificial Intelligence] Can Tell If You’re Gay,” summaries of a pre-print study by Yilun Wang and Michal Kosinski of Stanford University’s School of Business. Their goal is to “advance our understanding of the origins of sexual orientation and the limits of human perception” (p.1). For the first they fail miserably but I concur with the second, though the perceptions that are limited, in this case, are the researchers’ own. In this post I review the underpinnings of this research that render it much less insightful than the researchers claim, the problems of journalistic reporting that compound these problems, and the stunning tone-deafness of Kosinski’s defense of his ethics.
Wang and Kosinski (hereafter, WAK) are only the most recent example of a long history of discredited studies attempting to determine the truth of sexual orientation in the body. These ranged from 19th century measurements of lesbians’ clitorises and homosexual men’s hips, to late 20th century claims to have discovered “gay genes,” “gay brains,” “gay ring fingers,” “lesbian ears,” “gay scalp hair,” or other physical differences between homosexual and heterosexual bodies. For WAK, facial differences “are consistent with the prenatal hormone theory of sexual orientation,” by which “gay men and women tended to have gender-atypical facial morphology, expression, and grooming styles” (p. 1). In citing a reductive version of hormone theory, WAK thus recycle 19th century sexual inversion theory, which posits that lesbians are hypermasculine women and homosexuals are effeminate men.
WAK embrace ties to 19th century science, however: “physiognomists’ main claim–that the character is to some extent displayed on one’s face–seems to be correct” (p. 8). In their paper, WAK frame their study as a bold step against a “taboo” against “studying or even discussing the links between facial features and character.” Here they stumble. There is a taboo against eating feces. It is impolite to tell homophobic jokes. To suggest that homosexuality is an issue of “character” akin to “a lack of energy and determination” or being an arsonist, as they do on page 3, is to deliberately freight your research with moral judgements and social biases. WAK now chafe against claims their project is pseudoscience (it’s not, unfortunately), but when you lie down with dogs, you wake up to a critique by fellow AI scholar Kate Crawford that you’ve produced “more AI phrenology.”
Stereotypes about men and women profoundly taint WAK’s research, as does their confusion of sex, gender, sexuality, and cultural practices. Typical of AI studies is a hermetic resistance to any contributions from the fields of sociology, cultural anthropology, feminism, or LGBT studies. For more than a decade, sociologists have developed sophisticated biosocial models of the relationships between testosterone and other inequalities that WAK ignore, notably social status and family and peer relationships. Literally the first sentence of a decade-old review of the field is “Public perceptions of the effect of testosterone on ‘manly’ behavior are inaccurate” (Booth et al. 2006). For feminist scholars of science, this ground is well-trod, showing how pre-existing gender stereotypes have shaped the cultural meanings of hormones or sperm and eggs, and not merely described empirical reality. Studies also consistently find that there is more variation within the sexes than between them–studies that themselves ignore the diversity of intersex and transgender people.
WAK’s paper and accompanying Authors’ note consistently conflate cultural practices with a mythical, fixed, universal sexual orientation. Hormones may well cause morphological differences in bodies, but this is the first study to suggest that lady fauxhawks, men’s sculpted eyebrows or other “grooming styles” are related to intra-uterine experiences. In his 2012 book How To Be Gay, David Halperin provided the definitive exploration of the supremacy of cultural behaviors. Being a gay man has almost nothing to do with manhumping, but rather with the careful cultivation of camp humor, a certain timbre of voice, or other learned styles. WAK practically write the book on dyke style: “lesbians tended to use less eye makeup, had darker hair, and wore less revealing clothes” (p. 20); they also smile less and wear baseball caps (p. 21). It’s puzzling to think that two smart people who grew up outside the U.S. would think these are universal cultural responses to uterine hormones. But then again, WAK think that nose shape and cheekbones are fixed landmark contours; if they’d ever met a drag queen, they’d know contour is a verb. There is a stunning assumption here: that dating site photos and profiles are unmediated, unmanipulated, accurate facsimiles of the real body (tl;dr they’re not).
WAK are keen to address critiques of their study’s dismissal of bisexuality or gender variance by claiming that those are different things that they weren’t studying. Yet by preloading their database with examples of only white, openly gay, cisgender men, WAK were sampling on their dependent variable. They determined gay faces by analyzing Facebook users who liked such pages as “I love being gay” and “gay and fabulous,” and who publicly self-reported romantic interest in only one of “both genders.” And then something interesting happened: “Unfortunately, we were not able to reliably identify heterosexual Facebook users” (p. 28). WAK’s limited perception of the complexity of sexual orientation is a problem because, as Tristan Bridges reports, surveys show not only that bisexuality is much more common these days, but that it is also the fastest growing category of reported same-sex identity for women. It’s also among the most stigmatized, both in and outside the LGBTQ community. Treating it as conceptually distinct from homosexuality is as wrong now as it was when Laud Humphreys (1970) found that about half of the men having sex in public bathrooms lived conventional heterosexual lives. As sociologist Philip Cohen summarized similarly bad research five years ago, “this is all complicated by social stigma around sexual orientation. So who identifies as what, and to whom, is never free from political or power issues.”
WAK are not interested in homosexuality as a behavior, clearly, or even as an identity, but in that specific form of gayness for which clicking “I love being gay” on Facebook is important (unfriend me now). This is a problem because sexual orientation changes over the life course, is determined by multiple factors, and is expressed in different ways that cannot be reduced to each other. Since the early 1990s, survey researchers have distinguished between same-sex identity, desires, and behavior. This practice reflects hard-won insights from HIV/AIDS research and the LGBTQ rights movement that people who have same-sex desires may not have an identity or be having any same-sex sex, and that individuals having same-sex sex may do so without having particular desire for it, or may have such sex while identifying as heterosexual. Jane Ward shows the racial privilege of this kind of same-sex behavior in her 2015 blockbuster book Not Gay, showing how white male “bromosexuals” who have “dudesex” with each other are never policed or demonized. African-American men on the “down low,” in contrast, are a topic of intense moral and medical interventions. That “straight men” are having “gay sex”–and everyone who read anything in 2015 knows it–should have given WAK pause. But where Angels in America fear to tread…
I will leave it to geneticists and biologists to critique the ways that WAK confuse correlations and causation (gender atypicality may cause gayness as much as gayness causes gender atypical behaviors). They might also critique the way WAK ignore 15 years of epigenetics research that shows that gene expressions (and the hormones that are regulated by them) are altered by complex interactions with environment and behavior over the life course. [Bloggers Bjork-James, Cohen, and Gelman offer important critiques of the data and its interpretation].
I will, however, point out that WAK’s use of Amazon Mechanical Turk workers muddles conceptual categories at both the training and execution phases. In Maciej Ceglowski’s memorable phrase, “Machine learning is like money laundering for bias.” WAK’s paper provides the following example of training materials:
(That’s Kosinski himself as the model adult complete male Caucasian, by the way, with his girlfriend as the female-looking face). The problems here are legion: Barack Obama is biracial but simply “Black” by American cultural norms. “Clearly Latino” begs the question “to whom?” Latino is an ethnic category, not a racial one: many Latinos already are Caucasian, and increasingly so. By training their workers according to stereotypical American categories, WAK’s algorithm can only spit out the garbage they put in.
WAK might be most productively directed to the work of artist and writer Zach Blas. His Facial Weaponization Suite (2011-2014) included Fag Face Mask (2012) as a protest against the exact facial recognition technologies WAK have decided to warn us about five years too late. Blas created physical, pink blobby masks that are composites of multiple biometric scans from LGBT men, rendering their wearers identifiably queer in real life but illegible to facial recognition technology.
As the explanatory video on Blas’s website intones, “Biometric technologies rely on stable and normative conceptions of identity, and thus, structural failures are encoded in biometrics.” Social science researchers routinely describe sexuality as not fixed, shifting across history for populations and across the life-course for individuals, and varying dramatically across cultures. The false fixity and overdetermination built into AI technologies causes them to comprehensively discriminate against the categories excluded from them. In an essay Blas cites the work of Shoshana Magnet, who describes these as “demographic failures” baked into the technologies which then routinely fail to identify the groups who were excluded from algorithmic training: the hands of Asian women, eyes with cataracts, the facial features of many African Americans, the elderly, and people with disabilities. She concludes, simply and ominously, that “human bodies are not biometrifiable” (p. 2).
WAK aren’t totally to blame for the hype, which is amplified by breathless journalistic reporting. Our appetite for these simplified tales says more about our collective desire for omnipotent technologies and a comfortable world of fixed identities than it does about the messy interfaces of carnal bodies. Not all journalism falls into this trap, of course. The day-to-day sloppy short-form reportage (reliably extruded and polished by Scientific American’s Jesse Bering) regularly plays straight man to nuanced long-form pieces that conclude, year after year, that such studies are both less insightful and more ethically dubious than originally thought.
WAK have inadvertently trained their deep neural network to discover the most shallow reality of the gay world: gayface (dykeface is the less-recognized lesbian analogue). Gayface, a word consecrated in the venerable Urban Dictionary in 2006, is the particular cocked-head, arched-eyebrows and pursed-lip expression that is the facial equivalent of drawing back the string of a catty longbow. There is no essential relationship between gayface and buttfucking, as any fan of the 2001 movie “Zoolander” knows, and herein lies the biggest challenge to the validity of WAK’s model.
In that film, heterosexual actor Ben Stiller plays the eponymous male model Derek Zoolander, who dubs his signature facial expression “Blue Steel.” The humor of Blue Steel in the film lies in mocking the labor of models and their presumed homosexuality (Zoolander 2 ups the ante by adding transphobia to the mix). Its pleasure beyond the film, however, gave us the delightful scene of Stiller teaching iconically uber-gay Elton John to make the Blue Steel pose, a masterful exercise in gaysplaining (straight people helpfully but patronizingly explaining gay things to gay people). Blue Steel stories are now a staple item for slow news weeks in the celebrity world, with magazines and blogs “catching” celebrities making the face, and then ribbing them for it.
Michal Kosinski is dismayed by ethical critiques of his paper, defending it on Twitter, Facebook, and public Google Docs that he edited through the weekend. WAK’s abstract does end with the warning that “our findings expose a threat to the privacy and safety of gay men and women.” This smacks of the local nightly news fearmongering about “the toxic poison lurking in your kitchen–after these commercial messages.” Indeed, WAK “were terrified” to learn that there are risks to the privacy of LGBTQ individuals, which they present as “the core message of the study” (Authors’ note p. 1). Mired as their paper is in outdated stereotypes, WAK should not be surprised that people didn’t hear their message for the offensive noise.
Kosinski also defends his project because WAK merely “studied existing technologies–already widely used by companies and governments.” What is “existing technology” at Stanford Business School may not be elsewhere, and he and his critics are quick to identify homophobic regimes in Bishkek or Pyongyang, as if bad heteros all live far away. I’m more concerned that the very publicity Kosinski courts will put this tool in the hands of the North Carolina bathroom police. I would be interested as well to know if WAK used OKCupid to hone their algorithms, because such use violates its user agreement and has already been abused, causing a massive breach in privacy.
Kosinski pushed back against criticisms from LGBTQ human rights groups HRC and GLAAD, saying he is saddened by their “smear campaign” and accusing them of bullying journalists to discredit him. It is indeed sad when gays terrorize journalists; the reverse has never been true, and glitter is so hard to remove. Kosinski doubles down in an article in the Guardian, where he is quoted as saying: “Rejecting the results because you don’t agree with them on an ideological level … you might be harming the very people that you care about.” It takes mindboggling arrogance to think that your one-off warning is equivalent to GLAAD’s 30+ years of hard work.
There is also something disingenuous about the Stanford Business Professor issuing a warning to the gays with one hand, and, with the other, advising venture capital funds and an Israeli security firm. The company called Faception, which prominently listed Kosinski as an advisor in its 2016 presentations [Kosinski denies this association, saying that he has never been on Faception’s board of advisors, that he provided them ethical advice only, and that he has no commercial interest in any predictive technology whatsoever], promises to help clients deploy “facial personality profiling” to catch pedophiles and terrorists, among others:
Presumably the published article won’t contain any details of financial relationships between the researchers and such firms, either because no money changed hands or because it didn’t fund the specific research here. While targeting baddies may make the firm seem like the geeky James Bond we need, the high rate of false positives should give us all pause. As Kosinski admitted to a journalist from Business Insider, “even the most accurate model aimed at a rare outcome will produce a great majority of false positives.” How do you shed the stigma after a private company says you are 46% likely to be a pedophile?
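The arithmetic behind Kosinski’s admission is worth a quick sketch. The numbers below are hypothetical, chosen only for illustration, and come from neither WAK’s paper nor Faception: assume the target trait occurs in 1% of the people scanned, and that the model catches 90% of true cases while wrongly flagging 10% of everyone else. Bayes’ rule then gives the share of flagged people who are flagged correctly.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of flagged people who truly have the trait (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs for illustration only; not taken from any study or vendor.
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.90, specificity=0.90)
false_share = 1.0 - ppv

print(f"Flagged and correct: {ppv:.1%}")          # 8.3%
print(f"Flagged and wrong:   {false_share:.1%}")  # 91.7%
```

Under these assumed figures, roughly eleven of every twelve people the model flags do not have the trait at all, and the rarer the trait, the worse that ratio gets.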
Kosinski might have known better. After all, it was only last year that he was “horrified” to be suspected of developing the tools used by Steve Bannon-affiliated firm Cambridge Analytica to create winning campaigns for Brexit and Donald Trump [charges that were circumstantial and that he has forcefully denied]. He got over it quickly. “This is not my fault. I did not build the bomb. I only showed that it exists,” he told journalists for the Swiss magazine Das Magazin, translated by Vice’s Motherboard (“a good article,” Kosinski praised on Facebook). But WAK never address the fact that their research gives them the ability to sell both a refined bomb and a shelter from it–or to give either away for free.
What’s creepier than Kosinski’s flawed algorithmics is his naïve confidence in the moral and political neutrality of science. On Twitter, Kosinski pleads: “no one seems to be asking ‘assume for a second [we] are right, what should we do to protect ourselves?’” (Sep 8). I went looking, and he has no answers for how we might shelter ourselves. Instead, he says we should just accept that we live in a post-privacy world, an intellectual stance that both explains the utility of his research and justifies it. Privacy is the refuge of the powerful, and the powerful can afford both shelters from bombs and to walk the streets naked. The poor, the undocumented, racial minorities, and bad queers don’t need righteous allies to tell us we’re surveilled. Kosinski thinks that tech companies or the government should step in to protect us, part of what Stanford Business School’s Insights described as “his optimistic vision of a world with less privacy but more tolerance.” Pause here for a moment. Kosinski thinks Facebook and Jeff Sessions will protect flaming gay boys and baby bulldykes from anyone inspired by his research. Until we all get to live in WAK’s fantasy world, they owe us tools and strategies to protect ourselves from algorithms like theirs, though doing so would require them to dismount from some pretty high horses.
In conclusion, I have good and bad news for WAK-inspired research. The bad news is that their algorithms are probably accurate only for the most visibly, out-and-proud, gender-nonconforming dykes and gay men. The Instagram account the_LA_basics is definitely in danger. The good news, if we can call it such, is that dykes and queens already know we are visible to the analog meat machines who gaysplain and microaggress against us each day. WAK have invented the algorithmic equivalent of a 13-year-old bully, and claim their abomination is in defense of teh geyz.
The saddest news–for all of us–is the peer review process at the Journal of Personality and Social Psychology allowed Wang and Kosinski to fling centuries-old turds without noticing the stink, and ignore 50 years of sociological and feminist evidence in the process. Them’s some contours AI’s fanboys should face.
Greggor Mattson is Associate Professor of Sociology at Oberlin College and Director of the Program in Gender, Sexuality, and Feminist Studies. Of his author photo, friends said “gay,” “could be gayer,” and “the right amount of gay.” He received much great advice, including from Jonathan Doucette, Crystal Biruk, Davey Shlasko, Charley Sullivan, Jason Orne, Sarah Quinn, and Cynthia Taylor. He blogs at greggormattson.com and @greggormattson.
Bérubé, Allan. “How Gay Stays White and What Kind of White It Stays.” The Making and Unmaking of Whiteness, 2001, 234–265.
Blas, Zach. “Escaping the Face: Biometric Facial Recognition and the Facial Weaponization Suite.” New Media Caucus Media-N, July 10, 2013.
Booth, Alan, Douglas A. Granger, Allan Mazur, and Katie T. Kivlighan. “Testosterone and Social Behavior.” Social Forces 85, no. 1 (September 1, 2006): 167–91.
Bridges, Tristan. “2016 GSS Update on the U.S. LGB Population.” Inequality by (Interior) Design, April 4, 2017.
Browne, Simone. Dark Matters: On the Surveillance of Blackness. Durham: Duke University Press Books, 2015.
Cheney-Lippold, John. We Are Data: Algorithms and The Making of Our Digital Selves. New York: NYU Press, 2017.
Halperin, David M. How To Be Gay. Cambridge, Mass.: Belknap Press: An Imprint of Harvard University Press, 2014.
Humphreys, Laud. Tearoom Trade: Impersonal Sex in Public Places. Chicago: Aldine, 1970.
Laumann, Edward O., John H. Gagnon, Robert T. Michael, and Stuart Michaels. The Social Organization of Sexuality: Sexual Practices in the United States. University of Chicago Press, 1994.
Levin, Sam. “LGBT Groups Denounce ‘Dangerous’ AI That Uses Your Face to Guess Sexuality.” The Guardian. Accessed September 9, 2017.
Magnet, Shoshana Amielle. When Biometrics Fail: Gender, Race, and the Technology of Identity. Durham: Duke University Press Books, 2011.
Oudshoorn, Nelly. Beyond the Natural Body: An Archaeology of Sex Hormones. 1st edition. New York; London: Routledge, 1994.
Ward, Jane. Not Gay: Sex between Straight White Men. NYU Press, 2015.