This article reflects methodologically on how social media scholars can effectively engage with speech-based data in their analyses. While contemporary media studies have embraced textual, visual, and relational data, the aural dimension has remained comparatively under-explored. Building on the notion of secondary orality and a critique of purely visual culture, the paper argues that considering voice and speech at scale enriches our understanding of multimodal digital content. The paper presents the TikTok Subtitles Toolkit, which offers accessible speech processing readily compatible with existing workflows. In doing so, it opens new avenues for large-scale inquiries that blend quantitative insights with qualitative precision. Two illustrative cases highlight both the opportunities and the limitations of speech research: while genres like #storytime on TikTok benefit from the exploration of spoken narratives, nonverbal or music-driven content may not yield significant insights through speech data. The article encourages researchers to integrate aural exploration thoughtfully to complement existing methods rather than replace them. I conclude that expanding our methodological repertoire enables richer interpretations of platformised content and strengthens our capacity to unpack digital cultures as they become increasingly multimodal.