Across Acoustics
Speech research methods and gender-diverse speakers
Traditionally, speech researchers have asked participants to classify speakers on a binary scale for gender. However, as our understanding of gender changes, so must our research methods. In this episode, we talk to Brooke Merritt (University of Texas - El Paso) about her research into updating research protocols to better encompass a diversity of genders and gain a more nuanced understanding of listeners' perception of speakers' identity.
Associated paper: Brooke Merritt, Tessa Bent, Rowan Kilgore, and Cameron Eads. "Auditory free classification of gender diverse speakers" J. Acoust. Soc. Am. 155, 1422-1436 (2024). https://doi.org/10.1121/10.0024521.
Read more from The Journal of the Acoustical Society of America (JASA).
Learn more about Acoustical Society of America Publications.
Music Credit: Min 2019 by minwbu from Pixabay. https://pixabay.com/?utm_source=link-attribution&utm_medium=referral&utm_campaign=music&utm_content=1022
Kat Setzer 00:06
Welcome to Across Acoustics, the official podcast of the Acoustical Society of America's Publications Office. On this podcast, we will highlight research from our four publications. I'm your host, Kat Setzer, editorial associate for the ASA.
Kat Setzer 00:25
Today I'm talking with Brooke Merritt about her article, "Auditory free classification of gender-diverse speakers," which appeared recently in JASA. Thanks for taking the time to speak with me today, Brooke. How are you?
Brooke Merritt 00:35
Hi, Kat, I'm great. Thank you for having me. I'm excited to be here.
Kat Setzer 00:39
I'm really glad to have you. So first, just tell us a bit about your research background.
Brooke Merritt 00:43
Well, I graduated from Indiana University in 2022, where I completed my PhD in Speech, Language and Hearing Sciences with a minor in human sexuality. And I've been interested in gender diversity and its implications for speech communication, really, since the beginning of my graduate program and continuing now into my research lab at the University of Texas at El Paso. And gender diversity, when I say that, it means that it encompasses the range of gender identities and expressions that exist across humans, generally, and can include things like cisgender and transgender men and women, and also individuals who don't necessarily align with a gender-binary system. And recently, I've become really interested in gender from the perspective of listeners and what factors impact the gender that a listener attributes to a talker.
Kat Setzer 01:34
Okay, so what assumptions have been made about perceived gender in speech up until this point?
Brooke Merritt 01:39
Well, most of the available research on how listeners attribute gender to a speaker has assumed this gender binary of cisgender men or boys and women or girls. But there's a lot more diversity in the world than I think what's reflected in much of the existing speech communication literature. And so for this study, we saw an unmet need to examine how listeners navigate these changing socio-cultural landscapes, and specifically gender diversity. And we wanted to know how this diversity might lead to expansion of current or development of new gender categories in listeners' minds, and also what speech characteristics were most important in shaping those categories.
Brooke Merritt 02:20
And so research on listeners' attribution of gender has typically asked listeners to evaluate speakers on rating scales that have been provided by researchers. So there's been a lot of assumptions sort of built into those paradigms, about how listeners attribute gender, and also a lot of variability in these rating scales across different studies. So for instance, different studies have used different kinds of scales; some might use two or three forced choices like male or female or other, and then others might use a single scale with five to nine radio buttons. And then some even use sliders that range from like one to 100. And then some studies have even placed masculinity and femininity as like two endpoints on one scale, where others might actually consider them to be different dimensions that listeners can attend to and rate. So a lot of these decisions that have been made about these scales are really just from precedent, or what's been done before. And they've generally assumed that those different scales are reflective of listeners' sort of perceptual experience of gender.
Kat Setzer 03:22
That's really, really interesting, especially since the two scales being... like the possibility of having masculine and feminine on two separate scales. So then that implies that they're not even opposites, and there isn't a binary even in a way.
Brooke Merritt 03:35
Yeah, absolutely.
Kat Setzer 03:36
Yeah. So how do transgender individuals adjust their speech to convey their queer identity?
Brooke Merritt 03:42
Well, we see that gender is really communicated through a variety of speech characteristics, and many of those are able to be consciously manipulated by us as speakers. And emerging evidence suggests that speakers who aren't cisgender or straight, can, and actually do, manipulate a variety of these features into unique speaking styles that, over time, become associated with particular social groups, one of which might be described as queer people. And so some of these features include maybe the pitch of one's voice or the size and the shape of the vocal tract, vocal quality features like vocal fry, or creaky voice, which is kind of like when you're speaking like this, and even how consonant and vowel sounds are produced.
Kat Setzer 04:27
Okay, so what research has been done with regards to gender-diverse speakers thus far?
Brooke Merritt 04:32
So interest in the speech of what's historically been classified as from men versus as from women goes back decades, centuries, maybe longer. But most of these studies traditionally have assumed that any and all of the speakers were cisgender. But in the last 10 years or so, we're starting to see a much greater interest in better understanding the impact of gender diversity on speech communication, and in that, one perspective is to look at gender diversity as a source of meaningful speech variation. So in other words, examining speech characteristics and even word choices that speakers use to index themselves within particular social groups, as I had mentioned before. But then another perspective, and one that has received relatively less attention, is to look at the listener and the listener's perspective, and consider what factors impact how they attribute gender to talkers. And as I had said, much less work has been done in this area. And that's why we were excited to kind of approach this research question. And then regarding the acoustic characteristics that inform how we attribute gender, we see pretty consistently that the pitch of the voice and the resonances of the vocal tract, which relate to vowels and voice quality, are really important cues for listeners. But in addition, we see some importance of the way we produce vowels and consonants, or how we articulate, and even intonation, or how much variability in pitch there is in our voice, maybe how melodic it is. And those are also cues that we seem to use when we make judgments about a talker's gender. And so our study here was a step toward removing some of the constraints of researcher-imposed rating scales and labels, and really trying to get a clearer picture of the process of gender attribution with minimal bias.
Kat Setzer 06:21
Okay, so how does a listener's perception of a speaker's gender affect how they perceive speech?
Brooke Merritt 06:26
Oh, this is really interesting. So when we encounter a talker, we pretty quickly make decisions of characteristics about them. And these characteristics can include things like how old we think they are, where we think they're from, and of course, gender that we attribute. And gender may be one of the first things that we actually think about and attribute to a talker. And based on the characteristics that we've attributed to talkers, we use our prior experiences and knowledge of different talkers to make predictions about how this new talker will sound. And specific to gender, we may adjust what we expect, for instance, their vocal pitch to be or how their vowel and consonant sounds will be produced, even the intonation or cadence of their speech, and what word choices they might make. And then we think that these expectations about gender allow us to be more efficient in adapting to different talkers.
Kat Setzer 07:23
Okay, man, now I want to know if I have feminine-sounding vowels. What are features typically attributed to masculine or feminine speech?
Brooke Merritt 07:31
Yeah, yeah, this is interesting. So what we tend to see across listeners is that for instance, speech that is very clearly articulated with clear vowel characteristics is perceived as more feminine sounding. But when you have speech that's a little bit more muddled, or a little bit more mumbled, and the vowels aren't clearly differentiated, people tend to perceive that as a more masculine-sounding voice, and even some of our articulatory features with consonants. So for instance, the phoneme s, there's a lot of social information that's carried in that sound, right? So a very crisp, clear s tends to be perceived as more feminine than masculine.
Kat Setzer 08:09
Oh, my goodness! That's really interesting. So what were the aims of this study?
Brooke Merritt 08:12
Sure, yeah. So we were looking to remove, again, those constraints of researcher-imposed rating scales and labels, and trying to get a clearer picture of how listeners attribute gender to talkers, with minimal bias, without influencing their decisions with words and labels. And so we wanted to understand the organizational factors that listeners use to classify these speakers of diverse gender identities. So we selected a paradigm that we believe introduced as little bias as possible.
Kat Setzer 08:41
Okay, so it sounds like the scales aren't very standardized, but maybe like I was saying before, masculine and feminine aren't even opposites, depending on the scales you're using. So how do you improve the measurement of gendered attribution?
Brooke Merritt 08:52
Sure, I think the primary objective is to eliminate as much bias as possible, like factors that creep into the research that maybe influence decisions that you didn't want them to influence. And so our goal is to limit that bias. And we did that in this study by replacing these investigator-defined scales and category labels with a more participant-driven approach, which sort of invites listeners to make their own evaluation of different talkers based on whatever criteria are most meaningful for them.
Kat Setzer 09:24
Very cool. So let's talk about some of those methods. What are auditory free classification and multi-dimensional scaling, and how are they useful?
Brooke Merritt 09:32
Sure. So, in free classification, which was introduced to our field, I think, by Dr. Cynthia Clopper, maybe, I don't know 15 years ago or so, maybe a bit more. But in that paradigm, listeners see icons on one side of the screen, and then a blank grid on the other side. And when they click an icon, they hear a sentence from one person that plays and then their task as listeners is to group the talkers together based on how similar they sound to one another. And that's a key feature of free classification is to ask listeners to evaluate based on similarity. And so that leaves the door open for them to choose whatever criteria they want for similarity without us telling them how to do that.
Brooke Merritt 10:16
And then multidimensional scaling is a statistical technique that helps to visualize and interpret those groupings, that data. So it determines how many of what are called dimensions sort of best fit listeners' classifications, and then we determine what characteristics about the speakers best align with or map on to those dimensions. And so this sort of technique works well to identify maybe the most perceptually salient dimensions of a complex signal like speech. And then it allows us to sort of get at the core similarity among all of the different speakers that kind of emerges across all of the listeners collectively.
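[Editor's sketch] The pipeline described above, from listeners' groupings to a low-dimensional map of talkers, can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the analysis used in the paper: the function names, the toy groupings, and the choice of classical (Torgerson) multidimensional scaling are all illustrative, and the episode does not say which scaling algorithm the study used.

```python
import numpy as np

def dissimilarity_from_groupings(groupings, n_talkers):
    """Turn free-classification data into a talker-by-talker dissimilarity matrix.

    groupings: one entry per listener; each entry is a list of groups,
    and each group is a list of talker indices (hypothetical data layout).
    Dissimilarity is 1 minus the proportion of listeners who grouped two talkers together.
    """
    together = np.zeros((n_talkers, n_talkers))
    for groups in groupings:
        for group in groups:
            for i in group:
                for j in group:
                    together[i, j] += 1
    return 1.0 - together / len(groupings)

def classical_mds(dissim, n_dims):
    """Classical (Torgerson) MDS: embed talkers so that distances approximate dissim."""
    n = dissim.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (dissim ** 2) @ centering   # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]          # largest eigenvalues first
    top_vals = np.clip(eigvals[order], 0.0, None)       # guard against small negative values
    return eigvecs[:, order] * np.sqrt(top_vals)

# Toy example: 4 talkers sorted by 3 hypothetical listeners.
groupings = [
    [[0, 1], [2, 3]],
    [[0, 1], [2, 3]],
    [[0, 1, 2], [3]],
]
dissim = dissimilarity_from_groupings(groupings, n_talkers=4)
coords = classical_mds(dissim, n_dims=2)
```

In this toy case, talkers 0 and 1 were always grouped together, so they land at essentially the same point, while talkers 0 and 3 were never grouped together and end up farthest apart. Choosing how many dimensions to keep, and interpreting them against acoustic measures, are separate steps that this sketch does not cover.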
Kat Setzer 10:56
So how did you conduct this study?
Brooke Merritt 10:58
For this one, we collected speech recordings from speakers who represented a variety of gender identities like cisgender men and women, transgender men and women, gender-nonbinary speakers, and even agender speakers. And in that free classification task, we asked naive listeners (they didn't know the people they were listening to) to sort the speakers into groups based on general similarity in one task. And then we actually did cue them with a different word for the second task, and we said, "Now group them by gender identity." And the purpose of that was just to see how one change in those two words, "gender identity" versus "general similarity," impacted their classification strategy, and kind of the dimensions that it would reveal that they use to organize the speakers. And so then we measured a range of acoustic factors from the speaker recordings, and we also collected some auditory perceptual ratings from listeners on things like how masculine or feminine they thought the speakers were, or how old they thought they were. And then we calculated relations between those measures, the acoustic and perceptual measures, and the groupings that we submitted to multi-dimensional scaling.
Kat Setzer 12:13
Okay. Did how listeners group the voices differ based on how you asked them to group them?
Brooke Merritt 12:19
Yeah, that was a pretty exciting finding, actually. So we saw that when we asked them to group by similarity, we found a more complex kind of solution. It was a two-dimensional space that listeners were kind of organizing these speakers into. But when we changed the instructions, and we said group by gender identity, that kind of collapsed their organization into one dimension along like one rating scale. So it was really simplified in comparison to when we just did general similarity.
Kat Setzer 12:48
Wow, that is really interesting. What conclusions can you draw from these results with regards to gender attribution and speech?
Brooke Merritt 12:54
Sure, well, when we did say, "by general similarity," the two dimensions that we found through multi-dimensional scaling, kind of both used similar acoustic features, but they were used in different ways. So on the first dimension, we saw that listeners seem to organize the speakers from low fundamental frequency and low formant frequencies, up to high fundamental frequency and formant frequencies. And fundamental frequency is the pitch of the voice. So it's kind of a low-to-high rating scale. And that also really correlated pretty strongly with the ratings that we asked them to do on masculinity and femininity. So we kind of subjectively labeled that first dimension as masculinity and femininity. And so that was one organizational kind of dimension listeners used. But then we also saw on that other dimension, that they still used the pitch of the voice and the resonance of the vocal tract, but it was used differently. And on that dimension, what we saw is that listeners tended to group speakers who were either at the very low, or the very high end of that continuum of pitch and vocal tract resonances away from people who were more in the middle, or more ambiguous sounding. And so we called that dimension "gender prototypicality." And so it tended to separate speakers who were more cisgender men and women away from speakers who were gender diverse.
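[Editor's sketch] To see how one acoustic cue can feed two different dimensions, as described above, it may help to work through a toy calculation. The fundamental-frequency values below are invented for illustration and are not data from the study; the variable names and the choice of the range midpoint as the "middle" are likewise assumptions. The first quantity orders talkers from low to high, and the second measures how far each talker sits from the middle regardless of direction.

```python
import numpy as np

# Hypothetical mean fundamental frequencies in Hz (invented values, not study data).
f0_hz = np.array([100.0, 120.0, 150.0, 165.0, 180.0, 210.0, 230.0])

# Assumed reference point: the middle of the observed range.
midpoint = (f0_hz.min() + f0_hz.max()) / 2.0

# Analogue of the first dimension: a signed low-to-high ordering.
low_to_high = f0_hz - midpoint

# Analogue of the second dimension: distance from the middle,
# which treats very low and very high voices alike.
distance_from_middle = np.abs(f0_hz - midpoint)

# The two quantities come from the same cue but need not be correlated.
r = np.corrcoef(low_to_high, distance_from_middle)[0, 1]
```

With these symmetric toy values the correlation is zero, which shows that a low-to-high ordering and a distance-from-the-middle ordering can carry independent information even though both are computed from the same measurement.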
Kat Setzer 14:22
Interesting, interesting. Okay. So did you have any other takeaways from this study?
Brooke Merritt 14:26
Sure. So we think that the effect of instruction really did impact how listeners approach this task and what sort of measurements we really got of their cognitive representation of gender. And so because just changing the word, and kind of biasing them with the word "gender," substantially simplified their groupings, it may be the case that a lot of studies that have upfront presented listeners with these rating scales with these labels like masculinity, femininity, man, woman, male, female, may be activating stereotypes they have about gender, more so than actually tapping into their true cognitive organization of these different speakers and how they actually represent gender in their minds.
Kat Setzer 15:10
So what are the next steps in your research?
Brooke Merritt 15:12
Yeah, so, those were cisgender listeners that we just looked at. And so now I'm really curious to see if there might be differences in this representation between cisgender listeners and listeners with other backgrounds. So maybe listeners who are transgender, for instance. I'm also really curious, living in El Paso, on the US-Mexico border, we have a really large bilingual population here. So I'm really curious to see for bilinguals, how that representation might change depending on the language that they're using. So when they hear speakers in Spanish, do they evaluate them on those same dimensions, or those same criteria, as they might with a speaker who's using English with them?
Kat Setzer 15:51
Oh, yeah, it sounds like there's a lot of angles to consider in terms of perception. It's really interesting to think about how our understanding of speech perception can be really enhanced by moving out of the gender binary as a means of classifying acoustic cues. Thank you for taking the time to speak with me today, and I hope I get to hear more about your research in the future.
Brooke Merritt 16:12
Thank you so much, Kat. I enjoyed being here.
Kat Setzer 16:16
Thank you for tuning into Across Acoustics. If you'd like to hear more interviews from our authors about their research, subscribe and find us on your preferred podcast platform.