KunquDB: An Attempt for Speaker Verification in the Chinese Opera Scenario

This work aims to promote Chinese opera research in both the musical and speech domains, with a primary focus on overcoming data limitations. We introduce KunquDB, a relatively large-scale, well-annotated audio-visual dataset comprising 339 speakers and 128 hours of content. Originating from the Kunqu Opera Art Canon (Kunqu yishu dadian), KunquDB is structured by dialogue lines and provides explicit annotations, including character names, speaker names, gender, and vocal manner classifications, accompanied by preliminary text transcriptions. KunquDB provides a versatile foundation for role-centric acoustic studies and for speech-related research, including Automatic Speaker Verification (ASV). Beyond enriching opera research, the dataset bridges the gap between artistic expression and technological innovation. As a first exploration of ASV in Chinese opera, we construct four test trials that account for two distinct vocal manners in opera voices: stage speech (ST) and singing (S). Domain adaptation methods effectively mitigate the domain mismatch induced by these vocal manner variations, though as a benchmark there is still room for further improvement.
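The abstract does not say how the four test trials are defined. As a minimal sketch, assuming they correspond to the four enrollment/test combinations of the two vocal manners (ST-ST, S-S, ST-S, S-ST), a trial list could be grouped as follows. The utterance records, speaker labels, and the `build_trials` helper are hypothetical and are not taken from KunquDB or the paper.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical utterance records: (utterance_id, speaker, vocal_manner).
# "ST" = stage speech, "S" = singing, following the abstract's labels.
UTTERANCES = [
    ("u1", "spk_a", "ST"),
    ("u2", "spk_a", "S"),
    ("u3", "spk_b", "ST"),
    ("u4", "spk_b", "S"),
]


def build_trials(utterances):
    """Group verification trials by (enrollment manner, test manner).

    Each trial is (enroll_id, test_id, is_target), where is_target is
    True when both utterances come from the same speaker.
    ASSUMPTION: the four conditions are the four manner combinations;
    the abstract does not confirm this.
    """
    trials = defaultdict(list)
    # Ordered pairs, so (enroll=ST, test=S) and (enroll=S, test=ST)
    # land in different conditions.
    for enroll, test in permutations(utterances, 2):
        enroll_id, enroll_spk, enroll_manner = enroll
        test_id, test_spk, test_manner = test
        is_target = enroll_spk == test_spk
        trials[(enroll_manner, test_manner)].append(
            (enroll_id, test_id, is_target)
        )
    return dict(trials)


if __name__ == "__main__":
    for condition, pairs in sorted(build_trials(UTTERANCES).items()):
        n_target = sum(1 for _, _, is_target in pairs if is_target)
        print(condition, "trials:", len(pairs), "target:", n_target)
```

Under this reading, the cross-manner conditions (ST-S and S-ST) are where a vocal-manner domain mismatch would show up, which is the setting the domain adaptation methods in the abstract are meant to address.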
