In this paper, we present our submission to Task 2 of the Touche lab, Argument Retrieval for Comparative Questions. Our team, Katana, provides several approaches based on decision tree ensemble algorithms that rank documents for comparative questions by their relevance and argumentative support. We use the PyTerrier library to apply ensemble models to the ranking problem, drawing on statistical text features and features based on comparative structures. We also employ large contextualized language models, such as BERT, to solve the ranking task, and we use the neural ranking library OpenNIR to integrate these models into the ranking pipeline. Our systems substantially outperform the proposed baseline and place first in relevance and second in quality according to the official metric of the competition (NDCG@5). The presented models could help improve the processing of comparative queries in information retrieval and dialogue systems.
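Since the results above are reported as NDCG@5, a minimal sketch of that measure may help in reading them. It assumes linear gain and a log2 rank discount, which is one common formulation; the official evaluation may use a different variant. The function names and the example relevance grades are hypothetical and are not taken from the submission.

```python
import math


def dcg_at_k(relevances, k=5):
    """Discounted cumulative gain over the top-k graded relevance labels.

    Assumes linear gain (the label itself) and a log2(rank + 1) discount,
    with ranks starting at 1.
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, judged_relevances=None, k=5):
    """NDCG@k for one query.

    ranked_relevances: labels of the retrieved documents, in system order.
    judged_relevances: labels of all judged documents for the query; if
    omitted, the ideal ranking is built from the retrieved list alone,
    which is a simplification.
    """
    pool = ranked_relevances if judged_relevances is None else judged_relevances
    ideal_dcg = dcg_at_k(sorted(pool, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents were judged for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Hypothetical labels: 2 = highly relevant, 1 = relevant, 0 = not relevant.
score = ndcg_at_k([2, 0, 1, 0, 2], k=5)
```

A score of 1.0 means the top five documents are already in the best possible order with respect to the labels; the per-query scores are then averaged over all topics.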