In this paper, we analyze the internal representations of a general-purpose audio self-supervised learning (SSL) model from a neuron-level perspective. Despite their strong empirical performance as feature extractors, the internal mechanisms underlying the robust generalization of SSL audio models remain unclear. Drawing on the framework of mechanistic interpretability, we identify and examine class-specific neurons by analyzing conditional activation patterns across diverse tasks. Our analysis reveals that SSL models foster the emergence of class-specific neurons that provide extensive coverage of novel task classes. These neurons exhibit shared responses across semantically distinct but acoustically similar categories, such as speech attributes and musical pitch. We also confirm that these neurons have a functional impact on classification performance. To our knowledge, this is the first systematic neuron-level analysis of a general-purpose audio SSL model, providing new insights into its internal representations.
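The class-specific neuron identification described above can be sketched as follows. This is a minimal illustration under assumed details, not the paper's exact procedure: it treats a neuron as "firing" when its activation exceeds zero, and the selection thresholds `p_in` and `p_out`, the function name, and the array layout are all hypothetical.

```python
import numpy as np

def find_class_specific_neurons(acts, labels, p_in=0.9, p_out=0.1):
    """Hypothetical sketch: flag neurons that fire for nearly all examples
    of one class and almost none of the others.

    acts:   (n_examples, n_neurons) array of neuron activations
    labels: (n_examples,) array of class labels
    Returns a dict mapping each class to an array of neuron indices.
    """
    fired = acts > 0.0  # assumed binarization: "firing" = positive activation
    specific = {}
    for c in np.unique(labels):
        in_class = labels == c
        p_within = fired[in_class].mean(axis=0)    # P(fire | class c)
        p_outside = fired[~in_class].mean(axis=0)  # P(fire | other classes)
        # A neuron is class-specific if it fires conditionally on class c
        specific[int(c)] = np.where((p_within >= p_in) & (p_outside <= p_out))[0]
    return specific
```

For example, a neuron that fires on every clip of one class but never elsewhere would be selected for that class, while a neuron that fires indiscriminately across classes would be rejected by the `p_out` condition.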