Words in a natural language not only transmit information but also evolve with the development of civilization and human migration. The same is true for music. To understand the complex structure behind music, we introduce an algorithm called the Essential Element Network (EEN) to encode audio into text. The network is obtained by calculating the correlations between scales, time, and volume. Optimizing the EEN so that the frequency and rank of the clustering coefficient follow Zipf's law enables us to generate semantic relationships and treat them as words. We map these encoded words into the scale-temporal space, which helps us systematically organize the syntax in the deep structure of music. Our algorithm provides precise descriptions of the complex network behind music, in contrast to the black-box nature of other deep learning approaches. As a result, the experience and properties accumulated through these processes can offer not only a new approach to applications of Natural Language Processing (NLP) but also an easier and more objective way to analyze the evolution and development of music.
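The Zipf's law criterion mentioned in the abstract can be illustrated with a minimal sketch. Zipf's law states that frequency falls off as a power of rank, f(r) ∝ r^(-s), with s close to 1 for natural-language words. The code below estimates s by ordinary least squares on log-log rank-frequency data. It is not the authors' optimization procedure: the function names `rank_frequency` and `zipf_exponent` and the toy token stream are assumptions made for illustration, and the tokens stand in for whatever symbols an EEN-style encoding would emit.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return token frequencies sorted in descending order (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


def zipf_exponent(frequencies):
    """Estimate s in f(r) ~ C * r**(-s) by least squares on log-log data."""
    if len(frequencies) < 2:
        raise ValueError("need at least two distinct tokens to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    # The fitted slope is negative for a decaying distribution; s is its magnitude.
    return -cov / var


# Hypothetical token stream (an assumption, not data from the paper):
# counts are 60, 30, 20, 15, 12, i.e. exactly proportional to 1 / rank.
tokens = []
for rank, symbol in enumerate(["a", "b", "c", "d", "e"], start=1):
    tokens.extend([symbol] * (60 // rank))

freqs = rank_frequency(tokens)
s = zipf_exponent(freqs)
```

On this toy input the counts are exactly proportional to 1/rank, so the fitted exponent is 1 up to floating-point error. On real encoded audio the fit would be approximate, and a least-squares fit on log-log data is only a rough diagnostic of power-law behavior.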