Deep learning models have achieved remarkable success across many domains, yet their learned representations and decision-making processes remain largely opaque. This work introduces HOLE (Homological Observation of Latent Embeddings), a method for analyzing and interpreting discriminative neural networks through persistent homology. HOLE extracts topological features from intermediate activations and presents them through a suite of visualization techniques, including cluster flow diagrams, blob graphs, and heatmap dendrograms. These tools make it possible to examine the structure and quality of representations across layers. We evaluate HOLE on a range of discriminative models, focusing on representation quality, interpretability across layers, and robustness to input perturbations and model compression. The results indicate that topological analysis reveals patterns associated with class separation, feature disentanglement, and model robustness, providing a complementary perspective for understanding and improving deep learning systems.
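As a rough illustration of what extracting topological features from activations can involve, the sketch below computes 0-dimensional persistent homology (the scales at which connected components merge) for a point cloud. It is not the HOLE implementation, whose details are not given here: the function name `h0_persistence`, the Euclidean metric, and the random array standing in for a layer's activations are all assumptions made for the example.

```python
import numpy as np


def h0_persistence(points):
    """Death scales of 0-dimensional classes in a Vietoris-Rips filtration.

    Every point is born at scale 0; a component dies when it merges into
    another one. Illustrative helper (hypothetical, not part of HOLE).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)

    # Pairwise Euclidean distances (assumed metric).
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Visit candidate edges from shortest to longest.
    rows, cols = np.triu_indices(n, k=1)
    order = np.argsort(dist[rows, cols], kind="stable")

    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    deaths = []
    for e in order:
        i, j = int(rows[e]), int(cols[e])
        a, b = find(i), find(j)
        if a != b:
            # Two components merge at this scale, so one class dies here.
            parent[a] = b
            deaths.append(float(dist[i, j]))
    return deaths


# Stand-in for intermediate activations: 50 samples, 16 features (assumed).
rng = np.random.default_rng(0)
activations = rng.normal(size=(50, 16))
deaths = h0_persistence(activations)
print(len(deaths))  # 49: every component except the last one eventually merges
```

Large gaps between consecutive death scales suggest well-separated clusters in the embedding, which is the kind of signal one would expect to relate to class separation. A full analysis would typically rely on a dedicated persistent homology library and include higher-dimensional features.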