Current work on machine learning models for reading comprehension and logical reasoning focuses on improving the models' ability to understand and apply logical rules. This work presents a novel loss function and an accompanying model architecture whose components are more interpretable than those of some other models, because they mirror a strategy that humans commonly use on reading comprehension and logical reasoning tasks. The strategy emphasizes relative accuracy over absolute accuracy: answer choices are compared against one another rather than judged in isolation, so the strategy can in theory produce the correct answer without full knowledge of the information required to solve the question. We examine how effective this strategy is for training transfer learning models to solve reading comprehension and logical reasoning questions. The models were evaluated on the ReClor dataset, a challenging reading comprehension and logical reasoning benchmark. We propose the polytuplet loss function, an extension of the triplet loss function, which prioritizes learning the relative correctness of answer choices over learning the true accuracy of each choice. Our results indicate that models employing polytuplet loss outperform existing baseline models. Although polytuplet loss is a promising alternative to other contrastive loss functions, further research is required to quantify its benefits.
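To make the idea of extending triplet loss to several answer choices concrete, the following is a minimal sketch and not the formulation used in this work, which the abstract does not specify. It assumes that the context and question are embedded as an anchor vector, that each answer choice is embedded as a vector, and that the loss sums one triplet-style hinge term per incorrect choice. The function name `polytuplet_loss_sketch`, the Euclidean distance, and the `margin` parameter are all illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def polytuplet_loss_sketch(anchor, positive, negatives, margin=1.0):
    """Hypothetical multi-negative extension of triplet loss.

    ASSUMPTION: one hinge term per incorrect answer choice, summed.
    Each term is zero once the correct choice is closer to the anchor
    than that incorrect choice by at least `margin`, so only the
    relative ordering of the choices is penalized.
    """
    d_pos = euclidean(anchor, positive)
    return sum(
        max(0.0, d_pos - euclidean(anchor, neg) + margin)
        for neg in negatives
    )


# Toy 2-D embeddings for one four-choice question (made-up values).
anchor = [0.0, 0.0]           # context + question
correct = [0.0, 1.0]          # correct answer choice
wrong = [[3.0, 4.0], [0.0, 1.5], [1.0, 0.0]]  # incorrect choices

loss = polytuplet_loss_sketch(anchor, correct, wrong, margin=1.0)
print(loss)
```

With a single incorrect choice, this sketch reduces to the standard triplet loss. The first incorrect choice in the toy example is already far enough from the anchor and contributes nothing, which illustrates how a loss of this shape rewards relative rather than absolute correctness.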