This paper presents a pioneering exploration of potential communication patterns in dog vocalizations and moves beyond traditional linguistic analysis, which relies heavily on human prior knowledge and limited datasets to find sound units in dog vocalizations. We present a self-supervised approach based on HuBERT that enables accurate classification of phoneme labels and identifies vocal patterns suggesting a rudimentary vocabulary within dog vocalizations. Our findings indicate significant acoustic consistency in the identified canine vocabulary, which covers the entirety of the observed dog vocalization sequences. We further develop a web-based dog vocalization labeling system that highlights phoneme n-grams present in the vocabulary within dog audio uploaded by users.
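To make the n-gram highlighting step concrete, the following is a minimal sketch of how vocabulary n-grams might be located in a sequence of phoneme labels. It is an illustration under stated assumptions, not the system described in the paper: the function name `find_vocab_ngrams`, the integer label IDs, the example vocabulary, and the `max_n` limit are all hypothetical, and the labels are assumed to have already been produced by some upstream model such as a HuBERT-based classifier.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, Tuple[int, ...]]


def find_vocab_ngrams(
    labels: List[int],
    vocab: Set[Tuple[int, ...]],
    max_n: int = 3,
) -> List[Span]:
    """Return (start, end, ngram) spans whose n-gram occurs in vocab.

    `end` is exclusive. Overlapping matches are all reported.
    """
    spans: List[Span] = []
    for start in range(len(labels)):
        for n in range(1, max_n + 1):
            end = start + n
            if end > len(labels):
                break  # n-gram would run past the end of the sequence
            gram = tuple(labels[start:end])
            if gram in vocab:
                spans.append((start, end, gram))
    return spans


# Hypothetical phoneme label IDs and vocabulary, for illustration only.
labels = [3, 7, 7, 2, 3, 7]
vocab = {(3, 7), (7, 2, 3)}
spans = find_vocab_ngrams(labels, vocab)
for start, end, gram in spans:
    print(f"labels[{start}:{end}] matches vocabulary entry {gram}")
```

A labeling interface could map each returned span back to audio timestamps using the frame duration of the upstream model; that mapping is omitted here because the frame rate is not stated in the text.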