The CrisisFACTS Track addresses challenges such as multi-stream fact-finding in event tracking: participating systems extract important facts from several disaster-related events while respecting their temporal order. We propose a pipeline that combines retrieval, reranking, and the well-known Integer Linear Programming (ILP) and Maximal Marginal Relevance (MMR) frameworks. For the retrieval and reranking modules, we explore several methods, including an entity-based baseline, pre-trained and fine-tuned Question Answering systems, and ColBERT. We then use the ILP and MMR frameworks as an extractive summarization component that takes diversity and novelty criteria into account. Automatic scoring of our runs shows strong results across the evaluation setups, but it also reveals shortcomings and open challenges.
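To make the diversity and novelty trade-off concrete, the following is a minimal sketch of greedy MMR selection. It is not the system described above: the function names `cosine` and `mmr_select`, the weight `lam`, the use of cosine similarity, and the toy vectors are all assumptions made for illustration, and the ILP formulation is not shown.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(query_vec, cand_vecs, k, lam=0.7):
    """Greedily pick up to k candidate indices by Maximal Marginal Relevance.

    Each step selects the candidate maximizing
        lam * sim(candidate, query) - (1 - lam) * max sim(candidate, already selected).
    lam = 1.0 ranks by relevance only; smaller values favor novelty.
    """
    relevance = [cosine(query_vec, c) for c in cand_vecs]
    selected = []
    remaining = list(range(len(cand_vecs)))

    while remaining and len(selected) < k:

        def mmr_score(i):
            # Redundancy is the highest similarity to anything already chosen;
            # it is 0.0 for the very first pick.
            redundancy = max(
                (cosine(cand_vecs[i], cand_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Hypothetical toy data: candidate 1 is a near-duplicate of candidate 0.
query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
novelty_heavy = mmr_select(query, candidates, k=2, lam=0.3)
relevance_only = mmr_select(query, candidates, k=2, lam=1.0)
```

In this toy setup, a low `lam` skips the near-duplicate in favor of the less similar candidate, while `lam=1.0` reduces to plain relevance ranking. In a real pipeline the vectors would come from whatever encoder the reranking stage uses, which the text here does not specify.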