Designing explainable models has become crucial for Natural Language Processing (NLP), since most state-of-the-art machine learning models offer only limited explanations for their predictions. Within the spectrum of explainable models, the Tsetlin Machine (TM) is promising because of its capability to provide word-level explanations using propositional logic. However, concerns arise over the elaborate combinations of literals (propositional logic) in its clauses, which make the model difficult for humans to comprehend despite its transparent learning process. In this paper, we design a post-hoc pruning of clauses that eliminates randomly placed literals in a clause, thereby making the model more efficiently interpretable than the vanilla TM. Experiments on the publicly available YELP-HAT dataset demonstrate that the pruned TM's attention maps align more closely with human attention maps than those of the vanilla TM. In addition, the pairwise similarity measure also surpasses that of attention-based neural network models. In terms of accuracy, the proposed pruning method does not degrade performance significantly; rather, it improves accuracy by 4% to 9% on some test data.
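To make the idea of post-hoc clause pruning concrete, the following is a minimal, hedged sketch. It models a TM clause as a set of literals (a feature index plus a negation flag) and uses an illustrative pruning criterion: a literal is dropped if removing it never changes the clause's output on the available data. The exact pruning criterion used in the paper may differ; everything here (`clause_fires`, `prune_clause`, the literal encoding) is an assumed illustration, not the paper's implementation.

```python
# Hedged sketch of post-hoc clause pruning (illustrative, not the paper's method).
# A conjunctive TM clause is represented as a set of literals, where each
# literal is a pair (feature_index, negated).

def clause_fires(clause, x):
    """A conjunctive clause fires iff every literal is satisfied by input x."""
    return all((x[i] == 0) if negated else (x[i] == 1)
               for i, negated in clause)

def prune_clause(clause, samples):
    """Drop a literal when the clause output is unchanged on every sample.

    This 'output-preserving' criterion is an assumed stand-in for the
    paper's criterion for eliminating randomly placed literals.
    """
    pruned = set(clause)
    for lit in list(pruned):
        candidate = pruned - {lit}
        if all(clause_fires(candidate, x) == clause_fires(pruned, x)
               for x in samples):
            pruned = candidate
    return pruned
```

For example, if feature 1 is zero in every sample, a negated literal on feature 1 is always satisfied and is pruned away, leaving a shorter, more readable clause.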