Black-box deep learning (DL) models are increasingly used for classification tasks across many scientific disciplines, including the wireless communications domain. Within this trend, supervised DL models are the most commonly proposed solutions to domain-related classification problems. Although they have been shown to deliver unmatched performance, two major drawbacks constrain their use: the need for large labeled training datasets and their opaque reasoning. Self-supervised architectures have emerged as a promising solution that reduces the amount of labeled data needed, but the explainability problem remains. In this paper, we propose a methodology for explaining deep clustering, self-supervised learning architectures composed of a representation learning part based on a Convolutional Neural Network (CNN) and a clustering part. For the state-of-the-art representation learning part, our methodology employs Guided Backpropagation to interpret the regions of interest in the input data. For the clustering part, the methodology relies on Shallow Trees to explain the clustering result using a decision tree of optimized depth. Finally, a data-specific visualization part connects each cluster to the input data through the relevant features. Using a wireless spectrum activity clustering use case, we explain how the CNN-based deep clustering architecture reasons.
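As a rough illustration of the Guided Backpropagation rule the abstract refers to, the sketch below contrasts the ordinary ReLU backward pass with the guided one on a single toy layer. It is a minimal pure-Python sketch under stated assumptions, not the paper's implementation: the function names and the toy values are hypothetical, and a real CNN would apply the same rule at every ReLU through its framework's backward hooks.

```python
def relu_backward_vanilla(pre_activation, upstream_grad):
    """Ordinary ReLU gradient: pass the gradient where the forward input was positive."""
    return [g if x > 0 else 0.0 for x, g in zip(pre_activation, upstream_grad)]


def relu_backward_guided(pre_activation, upstream_grad):
    """Guided Backpropagation rule: pass the gradient only where the forward
    input was positive AND the incoming gradient is positive."""
    return [
        g if (x > 0 and g > 0) else 0.0
        for x, g in zip(pre_activation, upstream_grad)
    ]


# Hypothetical toy values for one ReLU layer (not taken from the paper).
pre_activation = [1.5, -0.3, 2.0, 0.7]   # inputs to the ReLU in the forward pass
upstream_grad = [0.8, 0.5, -1.2, 0.1]    # gradient arriving from the layer above

vanilla = relu_backward_vanilla(pre_activation, upstream_grad)
guided = relu_backward_guided(pre_activation, upstream_grad)

print("vanilla:", vanilla)
print("guided: ", guided)
```

The only difference shows up at the third unit: its forward input is positive, so the ordinary rule lets the negative gradient through, while the guided rule suppresses it. Keeping only positive evidence in this way is what tends to make the resulting input-space maps highlight regions of interest more sharply.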