Voice is a central element of identity. We recognize people by their voice, and we uniquely express who we are with it. For people who rely on augmentative and alternative communication~(AAC) systems, such as speech-generating devices~(SGD), the device's voice becomes an identity marker that others associate with them. Yet it is hard to find a voice that truly aligns with one's identity both linguistically and culturally. Although modern AI-generated voices can reproduce diverse accents and speaking styles, AAC users still lack accessible ways to articulate how they want an identity-aligned voice to sound. We first surveyed AAC users across eight countries to characterize current voice representation, finding that non-binary, transgender, and non-US-born respondents consistently rated how well their current voice supports identity alignment lower than other respondents did. To examine how AAC users respond to voices designed to reflect their cultural identity, we built a tool that elicits cultural markers through guided questions and generates personalized voice candidates for participants to hear and reflect on. After participants heard the voices, we interviewed them to examine what it means for a voice to feel culturally representative, how they interpreted voices with cultural connotations, and how these voices shaped their sense of identity and agency. Our findings show that cultural voice alignment runs deeper than accent or language alone; it touches on belonging, self-recognition, and what it means to be heard as who you are.