More than words: prosody in the brain
The brain's dedicated hardware for handling the melody of speech overlaps with regions involved in language processing and recognizing facial expressions.
It's not what you said, it's how you said it.
Prosody — a catch-all for variations in pitch, volume, tempo, and more that give speech its melody — does quite a lot of work in human communication. Prosody conveys our emotions, which can completely change the meaning of a statement in context. The same words, screamed, stated matter-of-factly, or delivered dripping with sarcasm, can mean vastly different things.
But prosody goes far deeper than adding a layer of emotion over the literal meaning of speech. The way we speak, as opposed to what we say, can change the meaning of an utterance entirely. It can tag particular words as new or important, mark the ends of words or phrases, and even transform statements into questions. In English, ascending pitch acts like an auditory question mark superimposed over the last syllable of a sentence. Other languages do things differently: Hungarian turns simple statements of fact into questions with a sharp rise in pitch on the second-to-last syllable, followed by a drop.
While interpreting prosody is intuitive to the point of being automatic for most of us (not everyone, we'll get to that), researchers have struggled to pin down where — and how — the brain deals with it.
But on December 14, an MIT-led research team published a preprint showing that speech melodies are processed in dedicated brain regions entirely distinct from the auditory regions that handle speech sounds. The brain regions sensitive to prosody do, however, partially overlap with language areas and a brain region attuned to facial expressions. This could be a hint that prosody helps integrate linguistic information with non-verbal signals — it's at the nexus of literal and contextual meaning.
And while the research team didn't get into it explicitly in their study, I think their finding could be interesting to follow up on in the context of neurodevelopmental conditions, especially autism. But we'll get to that.
First, some background. Prosody involves several different layers of cognition, all happening simultaneously. There's auditory perception: simply hearing things like pitch, volume, and tempo. Then there's language processing, which doesn't necessarily have to involve auditory perception. Then there's the social aspect, which includes both perceiving and reasoning about social situations.
Previous research established that each of these abilities relies on pretty specialized bits of brain. But it wasn't clear whether prosody involved these brain bits or whether it might require its very own neural machinery. The methods used in previous studies either weren't powerful enough or relied on unusual populations, like patients with brain damage. Some studies contradict each other: a few associate prosody perception more strongly with the right hemisphere of the brain, while others pin it to the left hemisphere. And while studies did generally agree that prosody is processed in particular brain regions, those brain regions are pretty big. So it wasn't possible to tell if prosody processing overlapped with other functions, like perceiving pitch or decoding the meaning of facial expressions.
To pin down prosody in the brain, the research team combined powerful fMRI imaging with several experiments designed to disentangle prosody processing from other processes that, when we listen to natural speech, happen at exactly the same time. The key experiment contrasted 51 participants' brain responses to sentences either spoken with "expressive, natural prosody" or in a flat, disrupted tone — as well as responses to artificial "sentences" of nonsense that sounds like English but doesn't mean anything, both with and without natural-sounding prosody. This allowed them to check if the brain can pick up on prosody cues even if it can't understand the statement (as pre-linguistic babies seem to do when they respond to friendly coos with a smile and scolding with tears).
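If you like seeing the logic spelled out, here's a toy sketch of that two-by-two design in Python. To be clear, this is not the authors' analysis code, and the numbers are made up; in the real study, each cell would be something like a participant's average fMRI response in a candidate brain region.

```python
# Toy illustration of the 2x2 design: meaningful vs. nonsense speech,
# crossed with natural vs. flattened prosody. Values are hypothetical.
conditions = {
    ("english", "prosody"): 1.9,   # made-up mean response for this condition
    ("english", "flat"):    1.1,
    ("nonsense", "prosody"): 1.7,
    ("nonsense", "flat"):    0.9,
}

# Main effect of prosody: does the region respond more to melodic speech,
# averaged over whether the words mean anything?
prosody_effect = (
    (conditions[("english", "prosody")] + conditions[("nonsense", "prosody")]) / 2
    - (conditions[("english", "flat")] + conditions[("nonsense", "flat")]) / 2
)

# The key check: is the prosody effect still there when the "words" are gibberish?
prosody_effect_nonsense = (
    conditions[("nonsense", "prosody")] - conditions[("nonsense", "flat")]
)

print(f"Prosody effect overall:      {prosody_effect:.2f}")
print(f"Prosody effect for nonsense: {prosody_effect_nonsense:.2f}")
```

A region that shows a prosody effect even for the nonsense sentences is responding to the melody itself, not to the meaning riding on it — which is exactly the pattern described below.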
What the researchers found suggests that prosody does indeed have some of its own neural hardware — but also recruits parts of the brain involved in language processing and recognizing facial expressions.
The fMRI scans revealed a set of brain areas that reliably "lit up" (i.e., received increased blood flow) in response to natural prosody, regardless of whether the study participants were listening to real English or meaningless gibberish. That suggests that interpreting prosody happens at least partly independently of processing linguistic meaning — so if you're going to insult a baby, be sure to do it in a nice tone of voice.
The prosody hot spots were entirely separate from auditory brain areas involved in processing pitch and speech sounds (e.g. "p" vs "b" vs "t"), and from brain regions associated with paying attention to things in general.
This finding is very interesting in the context of other research on non-verbal communication. The general rule is that non-verbal cues are processed separately from language; language-processing regions of the brain don't react much to facial expressions, gestures, and noises like sighs and laughter. Different channels of communication are processed in parallel, rather than together.
That's sort of true for prosody, but not entirely. It has its own hardware, but also overlaps with brain regions involved in processing language and at least one social cue. Maybe that makes sense: prosody is unique because it is inseparable from speech. It simply can't be cut out of language. Sign language, despite being silent, has its own prosody. And written texts use all sorts of tricks to imply prosody (and are often misunderstood when they fail to do so well).
Personally, reading this paper set off a lightbulb for me. I feel like it helped me understand my own brain a bit better, and I hope that it will set the stage for future studies on the intersection of linguistic and social cognition — especially in the context of neurodevelopmental conditions like autism. Because the trifecta of prosody, facial expressions, and language sure does sound relevant to autistic communication challenges.
Social deficits — including, specifically, nonverbal communication of "emotions and affect," which involves prosody — are part of the diagnostic criteria for autism. Anecdotally, autistic people often struggle to understand sarcasm or "take a hint" when allistic conversation partners use nonverbal cues to get a message across. Flat affect and unusual speech cadences are common among autistic people, including people I know personally. So are difficulties and differences when it comes to both interpreting facial expressions and using language — both of which, this study suggests, interact with prosody in the brain.
Without getting into any diagnoses, let's just say I have a complicated relationship to prosody. I'm not great at picking up on prosody cues in speech and worse at reliably producing them. This has been an enduring and often painful point of communication breakdown and conflict with some of my family members. I'm also not fantastic at reading — and, again, worse at producing — the "right" facial expressions during conversations. If I ever interview you, you might notice that I smile. A lot. Too much. That's because I don't know what else to do with my face.
However, I'm good at language, especially at the sound-making bit. Produce a sentence for me, and I can usually repeat it back whether or not I have any idea what it meant — in fact, I have trouble mimicking sounds and processing meaning at the same time. The two tasks feel like entirely different tracks in my brain. And in foreign languages (but also a bit in English), only one of those tracks feels like it can be active at a time. I don't speak accentless German when talking naturally, but I can produce individual German sounds, words, and sentences in isolation that I have been told are basically accent-free. I once wowed a random Czech scientist at a conference by parroting the notorious ř sound that makes foreigners cry (I'm sure it'd make me cry, too, if I were actually trying to speak Czech rather than just make Czech noises). I'm learning Italian and Hungarian, and my tutors are sometimes kind of baffled that I can repeat sentences back to them that sound perfect, but then follow up with "okay but what did any of that mean?" Even words I know get bleached of meaning if I'm too focused on the melody of speech.
So, yes. This study's finding squares with my intuitive experience of prosody as something separate from speech perception that is tangled up with — but distinct from — processing language and facial expressions.
I'm a science journalist, not a neuroscientist. So what follows is basically a wish list. But since the holidays are upon us, I feel entitled to make future research requests of science Santa:
For one, I'd be really interested to see future research build on this result to see if the specific communication challenges and differences of people with autism (and other neurodevelopmental conditions like language disorders) are reflected in differences in prosody processing. Maybe starting from there would help us better understand the neural basis of autistic differences in language processing and non-verbal communication.
Then there's the question of how universal the findings are. This study was done on English speakers listening to English. But do speakers of other languages handle prosody differently? Looking at tonal languages like Mandarin would be especially interesting, since they have a very different relationship to pitch than English does. I'd personally be very interested in whether there are any effects of bilingualism or differences between native speakers and people who learned languages as adults. Learning the prosody of a new language is one of the hardest parts. And I wonder if struggling to interpret the melody of foreign speech might spill over into or otherwise interact with other aspects of social cognition — including challenges associated with neurodiversity.
Thanks for reading
There are many ways you can help:
- Subscribe, if you haven't already!
- Share this post on Bluesky, Twitter/X, LinkedIn, Facebook, or wherever else you hang out online.
- Become a patron for the price of 1 cappuccino per month
- Drop a few bucks in my tip jar
- Send recommendations for research to feature in my monthly paper roundups to elise@reviewertoo.com with the subject line "Paper Roundup Recommendation"
- Tell me about your research for a Q&A post (email enquiries to elise@reviewertoo.com)
- Follow me on Bluesky
- Spread the word!