Language models are widely used in education. Although modern deep learning models perform very well on question-answering tasks, they still make errors. To avoid misleading students with wrong answers, it is important to calibrate the confidence of these models, that is, the probability they assign to their predictions. In this work, we propose placing an XGBoost model on top of BERT to output corrected probabilities, using features derived from the attention mechanism. Our hypothesis is that the level of uncertainty in the flow of attention is related to the quality of the model's response.
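One way to make "uncertainty in the flow of attention" concrete is the Shannon entropy of each attention distribution. The sketch below is an illustration under that assumption, not the feature set used in this work: it computes one feature per attention head by averaging the entropy of that head's rows. The function names `row_entropy` and `attention_entropy_features`, and the nested-list layout `[layer][head][query][key]`, are hypothetical choices made for the example.

```python
import math


def row_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution.

    Zero weights are skipped, following the convention 0 * log(0) = 0.
    """
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def attention_entropy_features(attention):
    """Return one mean-entropy feature per (layer, head).

    `attention` is assumed to be a nested list indexed as
    [layer][head][query][key], where each innermost list is a
    softmax output that sums to 1. This layout is an assumption
    made for the example.
    """
    features = []
    for layer in attention:
        for head in layer:
            entropies = [row_entropy(row) for row in head]
            features.append(sum(entropies) / len(entropies))
    return features


# Toy input: 1 layer, 2 heads, 2 query positions, 4 key positions.
toy_attention = [
    [
        # Head 0 spreads attention uniformly (maximal uncertainty).
        [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]],
        # Head 1 puts all weight on a single key (no uncertainty).
        [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    ]
]

features = attention_entropy_features(toy_attention)
```

In this toy case the uniform head yields an entropy of log 4 and the one-hot head yields 0. In a full pipeline, a feature vector of this kind could be concatenated with BERT's raw answer probability and passed to a gradient-boosted classifier such as `xgboost.XGBClassifier`, trained to predict whether the answer was correct; its `predict_proba` output would then play the role of the corrected confidence.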