Uncertainty approximation in text classification is an important area with applications in domain adaptation and interpretability. One of the most widely used uncertainty approximation methods is Monte Carlo (MC) Dropout, which is computationally expensive because it requires multiple forward passes through the model. A cheaper alternative is to use the softmax output of a single forward pass without dropout to estimate model uncertainty. However, prior work has indicated that these softmax predictions tend to be overconfident. In this paper, we perform a thorough empirical analysis of both methods on five datasets with two base neural architectures in order to identify the trade-offs between them. We compare softmax and an efficient version of MC Dropout on their uncertainty approximations and downstream text classification performance, while weighing their runtime (cost) against performance (benefit). We find that, while MC Dropout produces the best uncertainty approximations, a simple softmax yields competitive and in some cases better uncertainty estimation for text classification at a much lower computational cost, suggesting that softmax can be a sufficient uncertainty estimate when computational resources are a concern.
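To make the cost difference between the two estimators concrete, the following is a minimal sketch, not the paper's implementation. It assumes a hypothetical two-layer classifier with randomly initialized weights (`w1`, `w2`), plain MC Dropout rather than the efficient variant evaluated in the paper, and two common uncertainty scores chosen here for illustration: one minus the maximum softmax probability for the single pass, and predictive entropy of the averaged distribution for MC Dropout. The function names, `n_passes`, and `drop_rate` are illustrative assumptions.

```python
import numpy as np


def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def forward(x, w1, w2, drop_rate=0.0, rng=None):
    # Hypothetical two-layer classifier: ReLU hidden layer, softmax output.
    h = np.maximum(x @ w1, 0.0)
    if drop_rate > 0.0:
        # Inverted dropout: zero random units, rescale the survivors.
        mask = rng.random(h.shape) >= drop_rate
        h = h * mask / (1.0 - drop_rate)
    return softmax(h @ w2)


def softmax_uncertainty(x, w1, w2):
    # One deterministic pass (dropout off); score = 1 - max probability.
    probs = forward(x, w1, w2)
    return probs, 1.0 - probs.max(axis=-1)


def mc_dropout_uncertainty(x, w1, w2, n_passes=20, drop_rate=0.1, seed=0):
    # n_passes stochastic passes with dropout kept on at inference time.
    rng = np.random.default_rng(seed)
    samples = np.stack(
        [forward(x, w1, w2, drop_rate, rng) for _ in range(n_passes)]
    )
    mean_probs = samples.mean(axis=0)
    # Predictive entropy of the averaged distribution.
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)
    return mean_probs, entropy


# Toy inputs: 4 examples, 8 features, 16 hidden units, 3 classes.
demo_rng = np.random.default_rng(42)
x = demo_rng.normal(size=(4, 8))
w1 = demo_rng.normal(size=(8, 16))
w2 = demo_rng.normal(size=(16, 3))

p_sm, u_sm = softmax_uncertainty(x, w1, w2)
p_mc, u_mc = mc_dropout_uncertainty(x, w1, w2)
```

In this sketch the softmax estimate costs one forward pass per input, while MC Dropout costs `n_passes` of them, which is the runtime gap the abstract weighs against the quality of the uncertainty estimate.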