How AI misrepresents culture through a facial expression.
Imagine a time traveler journeyed to various times and places throughout human history and showed soldiers and warriors of the periods what a “selfie” is.
This is the premise for a series of AI-generated images posted on r/midjourney. Below are a few examples of the images this prompt produced:
There are 18 images in the Reddit slideshow and they all feature the same recurring composition and facial expression. For some, this sequence of smiling faces elicits a sense of warmth and joyousness, comprising a visual narrative of some sort of shared humanity (so long as one pays no attention to the incongruousness of Spanish Conquistadors smiling happily next to Aztec warriors. Awkward.) But what immediately jumped out at me was that these AI-generated images were beaming a secret message hidden in plain sight. A steganographic deception within the pixels, perfectly legible to your brain yet without the conscious awareness that it’s being conned. Like other AI “hallucinations,” these algorithmic extrusions were telling a made-up story with a straight face, or, as the story turns out, with a lying smile.
Why do you smile the way you do? A silly question, of course, since it’s only “natural” to smile the way you do, isn’t it? It’s common sense. How else would someone smile?
For someone who was not born in the U.S., who immigrated here from the former Soviet Union as I did, this question is not so simple. In 2006, as part of her Ph.D. dissertation, “The Phenomenon of the Smile in Russian, British and American Cultures,” Maria Arapova, a professor of Russian language and cross-cultural studies at Lomonosov Moscow State University, asked 130 university students from the U.S., Europe, and Russia to imagine they had just made eye contact with a stranger in a public place: at the bus stop, near an elevator, on the subway, etc.
Which, she asked the participants, would you do next?
A) smile and then look away
B) look away
C) gaze at his eyes, then look away
90% of Americans and Europeans chose the option with a smile in it. Only 15% of Russians did.
How we smile, when we smile, why we smile, and what it means is deeply culturally contextual. In the 2018 Nautilus essay, “What a Russian Smile Means,” French-American journalist Camille Baker writes about how the meaning of a smile differs across societies.
In 2015 Kuba Krys, a researcher at the Polish Academy of Sciences, studied the reactions of more than 5,000 people from 44 cultures to a series of photographs of smiling and unsmiling men and women of different races. He and his colleagues found that subjects who were socialized in cultures with low levels of “uncertainty avoidance” — which refers to the level at which someone engages with norms, traditions, and bureaucracy to avoid ambiguity — were more likely to believe that smiling faces looked unintelligent. These subjects considered the future to be uncertain, and smiling — a behavior associated with confidence — to be inadvisable. Russian culture ranks very low on uncertainty avoidance, and Russians rate the intelligence of a smiling face significantly lower than other cultures.
Krys’s team also found that people from countries with high levels of government corruption were more likely to rate a smiling face as dishonest. Russians — whose culture ranked 135 out of 180 in a recent worldwide survey of corruption levels — rated smiling faces as honest with less frequency than 35 of the 44 cultures studied. Corruption corrupts smiling, too.
Russians interpret the expressions of their officials and leaders differently from Americans. Americans expect public figures to smile at them as a means of emphasizing social order and calm. Russians, on the other hand, find it appropriate for public officials to maintain a solemn expression in public, as their behavior is expected to mirror the serious nature of their work. A toothy “dominance smile” from an important American public figure inspires feelings of confidence and promise in Americans. Russians expect, instead, a stern look from their leaders meant to demonstrate “serious intentions, validity, and reliability.”
Which is how an AI trained on a dataset dominated by a culture that takes photos like this:
Would insist that “Native American warriors” posing for a photo would have looked like this:
Rather than what actual historical photos look like:
(Worth noting that these photos themselves were taken by photographers like Edward Curtis and others whose own colonizing point of view influenced how Native peoples were represented, so these images too come with an imposed, outside perspective.)
Or that “Ancient Polynesian Warriors” would have taken a selfie like this:
When the traditional Māori Haka ceremony looks like this:
Or that Soviet soldiers posing for a selfie would have looked like this:
When Eastern European soldiers posing for an actual selfie in 2023 look like this:
Every American knows to say “cheese” when taking a photo, and, therefore, so does the AI when generating new images based on the pattern established by previous ones. But it wasn’t always like this. More than a century after the first photograph was captured, a reference to “cheesing” for photos first appeared in a local Texas newspaper in 1943. “Need To Put On A Smile?” the headline asked, “Here’s How: Say ‘Cheese.’” The article quoted former U.S. ambassador Joseph E. Davies who explained that this influencer photo hack would be “Guaranteed to make you look pleasant no matter what you’re thinking […] it’s an automatic smile.” Davies served as ambassador under Franklin D. Roosevelt to the U.S.S.R.
As the old Soviet joke goes, how can you tell that someone is an American in Russia?
They’re smiling.
But how does AI tell when someone is most likely lying? They’re smiling like an American.
In 2018, researchers at the University of Rochester conducted an experiment to see how deception is connected to facial expressions. Participants were paired up into describer and interrogator roles. The describer was shown an image and told to memorize it with as much detail as possible. They were then instructed to either lie or tell the truth about what they’d just seen to the interrogator, who was unaware of the instructions given to the describer. The recorded exchanges between 151 pairs of individual participants yielded 1.3 million frames of facial expressions. The researchers then used machine learning to automatically find patterns. Without any predetermined labels or categories, the results identified the expression most frequently associated with lying: a “high intensity version” of the Duchenne smile — a smile that extends to both the cheek/eye and mouth muscles.
“Cheese!”
The taut, grimacing, duplicitous rictus — the modern American smile — rose out of a great emotional shift in the 18th century, theorizes Christina Kotchemidova, who teaches theory, gender, and intercultural communication at Spring Hill College in Alabama. But it is also based on a lie.
As Baker writes:
Prior to this shift, [Kotchemidova] believes, the American emotional landscape revolved around negative emotions like sadness and melancholy, which were seen as indicative of compassion and nobleness. Informed by ideas from pre- and early Reformation European Christianity, both Americans and Europeans saw earthly suffering as noble and necessary for a happy afterlife. Literature, visual art, and theater in this period aimed to provoke sadness, and crying in public was commonplace in Europe. Diderot and Voltaire, Kotchemidova writes, were seen crying repeatedly.
The Age of Enlightenment pushed the culture in a different direction. As thinkers and artists embraced reason, they also began to believe that happiness was permissible during our earthly life as well as the afterlife. The culture of sadness began to be supplanted by one of cheerfulness, which in turn influenced a changing class structure. The emerging middle class took the ability to manage emotions as key to its identity. Business failures and sickness were linked to failures of emotional control, and cheerfulness to prosperity. Eventually, cheerfulness became a prerequisite for employment.
“The expectation was, you have to smile eight hours a day,” a woman Baker calls Sofiya tells her. A 41-year-old Russian émigré who had been living in the United States for the past decade, Sofiya “was a proficient English speaker,” Baker writes, but it was in her job as a bank teller that she “came face-to-face with her deficiency in speaking ‘American.’ This other English language, made up of not just words but also facial expressions and habits of conversation subtle enough to feel imagined. Smiling almost constantly was at the core of her duties as a teller. As she smiled at one customer after another, she would wince inwardly at how silly it felt. There was no reason to smile at her clients, she thought, since there was nothing particularly funny or heartwarming about their interactions. And her face hurt.”
This confrontation with the culture clash of smiling for an Eastern European immigrant in America hits close to home. Which is why seeing the relentless parade of toothy, ahistorical, quintessentially American, “cheese” smiles plastered on the faces of every civilization in the world across time and space was immediately jarring. It was as if the AI had cast 21st-century Americans to put on different costumes and play the various cultures of the world. Which, of course, it had.
In her groundbreaking book, How Emotions Are Made: The Secret Life of the Brain, Lisa Feldman Barrett, a neuroscientist and psychology professor at Northeastern University, writes:
Most scientific research on emotion is conducted in English, using American concepts and American emotion words (and their translations). According to noted linguist Anna Wierzbicka, English has been a conceptual prison for the science of emotion. “English terms of emotion constitute a folk taxonomy, not an objective, culture-free analytic framework, so obviously we cannot assume that English words such as disgust, fear, or shame are clues to universal human concepts, or to basic psychological realities.” To make matters even more imperialistic, these emotion words are from twentieth-century English, and there’s evidence that some are fairly modern. The concept of “Emotion” itself is an invention of the seventeenth century. Before that, scholars wrote about passions, sentiments, and other concepts that had somewhat different meanings.
Different languages describe diverse human experience in different ways — emotions and other mental events, colors, body parts, direction, time, spatial relations, and causality. The diversity from language to language is astonishing…. Not all cultures understand emotions as mental states. The Ifaluk of Micronesia consider emotions transactions between people. To them, anger is not a feeling of rage, a scowl, a pounding fist, or a loud yelling voice, all within the skin of one person, but a situation in which two people are engaged in a script — a dance, if you will — around a common goal. In the Ifaluk view, anger does not “live” inside either participant.
In the same way that English-language emotion concepts have colonized psychology, AI dominated by American-influenced image sources is producing a new visual monoculture of facial expressions. As we increasingly seek our own likenesses in AI reflections, what does it mean for the distinct cultural histories and meanings of facial expressions to become mischaracterized, homogenized, subsumed under the dominant dataset? In the AI-generated visual future, will we know that Native Americans didn’t smile for photos like WW2 U.S. Navy officers?
Did WW2 U.S. Navy officers even, for that matter?
Every 2 weeks a language is spoken by a human on Earth for the last time. A third of the world’s languages have fewer than 1,000 speakers left. By the end of this century, 50% to 90% of languages are predicted to disappear. “When humanity loses a language, we also lose the potential for greater diversity in art, music, literature, and oral traditions,” says Bogre Udell, co-founder of the nonprofit Wikitongues. And with them, we lose vital concepts for making sense of ourselves and our inner experiences. In the future, how will we know how to be who the algorithm doesn’t show us we can be? Would we even dare to want to? And what would that kind of totalizing assimilation mean for global mental health, wellbeing, and the human experience in general?
It’s so much easier for a machine to fabricate a new language (out of the raw material created by humans) —
Than for the diversity of human expression to survive algorithmic hegemony.
“The concept of a facial ‘expression,’” writes Barrett, “implies an internal feeling that seeks release in a set of facial movements.” In flattening the diversity of facial expressions of civilizations around the world, AI had collapsed the spectrum of history, culture, photography, and emotion concepts into a singular, monolithic perspective. It presented a false visual narrative about the universality of something that in the real world, where real humans have lived and created culture, expression, and meaning for hundreds of thousands of years, is anything but uniform.