Point cloud data is ubiquitous in scientific fields. Recently, geometric deep learning (GDL) has been widely applied to prediction tasks on such data. However, GDL models are often complicated and hard to interpret, which raises concerns for scientists who want to deploy these models in scientific analysis and experiments. This work proposes a general mechanism, learnable randomness injection (LRI), which allows building inherently interpretable models on top of general GDL backbones. Once trained, LRI-induced models can detect the points in the point cloud data that carry information indicative of the prediction label. We also propose four datasets from real scientific applications, covering the domains of high-energy physics and biochemistry, to evaluate the LRI mechanism. Compared with previous post-hoc interpretation methods, the points detected by LRI align much better and more stably with the ground-truth patterns that have actual scientific meaning. LRI is grounded in the information bottleneck principle, and thus LRI-induced models are also more robust to distribution shifts between training and test scenarios. Our code and datasets are available at \url{https://github.com/Graph-COM/LRI}.