To reveal the rationale behind model predictions, many works have explored providing explanations in various forms. Recently, to further ensure readability, a growing number of works have turned to generating sentence-level explanations in human language. However, current works pursuing sentence-level explanations rely heavily on annotated training data, which limits the development of interpretability to only a few tasks. As far as we know, this paper is the first to explore this problem across the range from weakly supervised to unsupervised learning. We also observe that autoregressive sentence-level explanation generation has high latency, so the explanation becomes available only after the prediction rather than alongside it. Therefore, we propose a non-autoregressive interpretable model that generates explanations in parallel and makes predictions simultaneously. Through extensive experiments on the Natural Language Inference task and the Spouse Prediction task, we find that, with parallel explanation generation, users can train classifiers with comparable performance $10$ to $15\times$ faster using only a few or no annotated training examples.