Drawing parallels with the way biological networks are studied, we adapt the treatment-control paradigm to explainable artificial intelligence research and enrich it with multi-parametric input alterations. In this study, we propose a framework for investigating how input data augmentations affect a network's internal inference. Internal changes in network operation are reflected in changes in activations, which we measure by their variance. This variance can be decomposed into components attributable to each augmentation using Sobol indices and Shapley values. These quantities make it possible to visualize sensitivity to individual variables and to guide the masking of activations. In addition, we introduce a single-class sensitivity analysis in which candidates are filtered by how well they match the prediction bias produced by targeted damage to the activations. Given the observed parallels, we expect that the developed framework could be transferred to the study of biological neural networks in complex environments.
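The variance decomposition mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single scalar activation, three independent augmentation parameters evaluated on a full factorial grid, and a made-up response function. The names `toy_activation`, `rotation`, `brightness`, and `blur` are hypothetical. The sketch computes first-order Sobol indices and Shapley effects (Shapley values whose payoff for a subset of parameters is the variance of the conditional mean of the activation given that subset).

```python
from itertools import combinations
from math import factorial

import numpy as np


def toy_activation(rotation, brightness, blur):
    """Hypothetical scalar activation as a function of three augmentation parameters."""
    return np.sin(rotation) + 0.5 * brightness + rotation * blur


def closed_variance(y, subset):
    """Var(E[Y | X_subset]) on a full factorial grid with equally weighted levels."""
    other_axes = tuple(ax for ax in range(y.ndim) if ax not in subset)
    # Averaging over the axes outside the subset gives the conditional mean.
    conditional_mean = y.mean(axis=other_axes) if other_axes else y
    return float(np.var(conditional_mean))


def first_order_sobol(y):
    """Share of the total variance explained by each parameter on its own."""
    total = float(np.var(y))
    return np.array([closed_variance(y, (i,)) / total for i in range(y.ndim)])


def shapley_effects(y):
    """Shapley values with closed_variance as the payoff, normalized by Var(Y)."""
    d = y.ndim
    total = float(np.var(y))
    effects = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                gain = closed_variance(y, subset + (i,)) - closed_variance(y, subset)
                effects[i] += weight * gain
    return effects / total


# Assumed design: 21 equally spaced levels per augmentation parameter.
levels = np.linspace(-1.0, 1.0, 21)
rotation, brightness, blur = np.meshgrid(levels, levels, levels, indexing="ij")
activations = toy_activation(rotation, brightness, blur)

first_order = first_order_sobol(activations)
shapley = shapley_effects(activations)

for name, s1, sh in zip(("rotation", "brightness", "blur"), first_order, shapley):
    print(f"{name:>10}: first-order Sobol = {s1:.3f}, Shapley effect = {sh:.3f}")
```

In this toy setup `blur` influences the activation only through its interaction with `rotation`, so its first-order Sobol index is essentially zero while its Shapley effect is not. The Shapley effects sum to one, which makes them convenient as a complete attribution of the activation variance; the exhaustive subset enumeration used here is only practical for a small number of augmentation parameters.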