In artificial-intelligence-aided signal processing, existing deep learning models often behave as black boxes, and their validity and comprehensibility remain elusive. Topological methods, although still relatively new in this setting, serve a dual purpose: they make models more interpretable, and they extract structural information from time-dependent data for smarter learning. Here, we provide a transparent and broadly applicable methodology, TopCap, to capture the most salient topological features inherent in time series for machine learning. Rooted in high-dimensional ambient spaces, TopCap is capable of capturing features rarely detected in datasets with low intrinsic dimensionality. Applying time-delay embedding and persistent homology, we obtain descriptors that encapsulate information such as the vibration of a time series, in terms of the variability of its frequency, amplitude, and average line, as demonstrated with simulated data. This information is then vectorised and fed into multiple machine learning algorithms, such as k-nearest neighbours and support vector machines. Notably, in classifying voiced and voiceless consonants, TopCap achieves an accuracy exceeding 96% and is geared towards designing topological convolutional layers for deep learning of speech and audio signals.
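As a rough illustration of the first step named in the abstract, the sketch below performs a time-delay embedding of a toy sine wave. The function name `time_delay_embedding`, the embedding dimension, the delay, and the sampling rate are all illustrative assumptions and are not taken from TopCap. With a delay of a quarter period, the embedded points trace out a circle; a loop of this kind is what persistent homology (computed in practice with a library such as Ripser or GUDHI) would register as a one-dimensional feature.

```python
import math


def time_delay_embedding(series, dimension, delay):
    """Map a 1-D series x to points (x[i], x[i+delay], ..., x[i+(dimension-1)*delay]).

    Hypothetical helper for illustration only; not the TopCap implementation.
    """
    if dimension < 1 or delay < 1:
        raise ValueError("dimension and delay must be positive integers")
    n_points = len(series) - (dimension - 1) * delay
    if n_points <= 0:
        raise ValueError("series is too short for this dimension and delay")
    return [
        tuple(series[i + k * delay] for k in range(dimension))
        for i in range(n_points)
    ]


# Toy signal (assumed): a sine wave sampled 100 times per period, over 3 periods.
samples_per_period = 100
signal = [
    math.sin(2 * math.pi * i / samples_per_period)
    for i in range(3 * samples_per_period)
]

# A delay of a quarter period pairs sin(t) with sin(t + pi/2) = cos(t),
# so the 2-D embedded points lie on the unit circle.
cloud = time_delay_embedding(signal, dimension=2, delay=samples_per_period // 4)
radii = [math.hypot(x, y) for x, y in cloud]
```

The abstract places TopCap in high-dimensional ambient spaces; the two-dimensional choice here is only so the circular shape of the point cloud is easy to verify by hand.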