We propose a novel interpretable deep neural network for text classification, called ProtoryNet, based on a new concept of prototype trajectories. Motivated by prototype theory in modern linguistics, ProtoryNet makes a prediction by finding the most similar prototype for each sentence in a text sequence and feeding an RNN backbone with the proximity of each sentence to its corresponding active prototype. The RNN backbone then captures the temporal pattern of the prototypes, which we refer to as prototype trajectories. Prototype trajectories enable intuitive and fine-grained interpretation of the RNN model's reasoning process, in a way that resembles how humans analyze text. We also design a prototype pruning procedure that reduces the total number of prototypes used by the model, for better interpretability. Experiments on multiple public datasets show that ProtoryNet is more accurate than the baseline prototype-based deep neural network and narrows the performance gap with state-of-the-art black-box models. In addition, after prototype pruning, the resulting ProtoryNet models need only around 20 prototypes or fewer on every dataset, which significantly benefits interpretability. Furthermore, we report survey results indicating that human users find ProtoryNet more intuitive and easier to understand than other prototype-based methods.
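The prediction pipeline described in the abstract (pick the nearest prototype for each sentence, pass the proximity to that active prototype into an RNN) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the sentence encoder, the similarity function, or the RNN cell, so the sketch assumes precomputed sentence embeddings, a proximity of exp(-squared Euclidean distance), a plain tanh recurrent cell, and a sigmoid output. The names `prototype_trajectory` and `rnn_predict` and all weights are hypothetical.

```python
import numpy as np


def prototype_trajectory(sentence_embeddings, prototypes):
    """Return the active prototype per sentence and the sparse proximity inputs.

    sentence_embeddings: array of shape (T, d), one row per sentence.
    prototypes: array of shape (K, d), one row per prototype.
    """
    # Pairwise squared Euclidean distances, shape (T, K).
    diffs = sentence_embeddings[:, None, :] - prototypes[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)

    # Index of the most similar (closest) prototype for each sentence.
    active = np.argmin(sq_dists, axis=1)

    # Assumed proximity measure: exp(-squared distance), in (0, 1].
    proximity = np.exp(-sq_dists)

    # Keep only the proximity to the active prototype; zero elsewhere.
    sparse = np.zeros_like(proximity)
    rows = np.arange(len(active))
    sparse[rows, active] = proximity[rows, active]
    return active, sparse


def rnn_predict(inputs, W_in, W_h, w_out):
    """Run an assumed tanh recurrent cell over the sequence, then a sigmoid."""
    h = np.zeros(W_h.shape[0])
    for x_t in inputs:
        h = np.tanh(W_in @ x_t + W_h @ h)
    logit = w_out @ h
    return 1.0 / (1.0 + np.exp(-logit))


# Toy example: 3 sentences, 3 prototypes, 2-dimensional embeddings.
prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
sentences = np.array([[0.1, 0.0], [0.9, 0.1], [0.0, 1.0]])

active, sparse = prototype_trajectory(sentences, prototypes)

# Hypothetical, untrained weights, only to show the shapes involved.
rng = np.random.default_rng(0)
hidden = 4
W_in = rng.normal(size=(hidden, prototypes.shape[0]))
W_h = rng.normal(size=(hidden, hidden))
w_out = rng.normal(size=hidden)

prob = rnn_predict(sparse, W_in, W_h, w_out)
print("prototype trajectory:", active.tolist())
print("predicted probability:", float(prob))
```

In this sketch, the sequence of active prototype indices is what the abstract calls a prototype trajectory. Reading it off sentence by sentence shows which prototype drove the RNN input at each step.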