Time is a crucial factor in real-world question answering (QA) problems. However, language models have difficulty understanding the relationships between time specifiers, such as 'after' and 'before', and numbers, because existing QA datasets do not include sufficient time expressions. To address this issue, we propose a Time-Context aware Question Answering (TCQA) framework. We introduce a Time-Context dependent Span Extraction (TCSE) task and build a time-context dependent data generation framework for model training. Moreover, we present a metric that uses TCSE to evaluate the time awareness of a QA model. The TCSE task consists of a question and four sentence candidates, each classified as correct or incorrect with respect to time and context. The model is trained to extract the answer span from the sentence that is correct in both time and context. The model trained with TCQA outperforms baseline models by up to 8.5 F1 points on the TimeQA dataset. Our dataset and code are available at https://github.com/sonjbin/TCQA
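To make the TCSE setup concrete, the following is a minimal sketch of how one training instance might be represented and how its gold answer span could be located. It is an illustration under stated assumptions, not the released implementation: the names `Candidate`, `satisfies`, and `gold_span` are hypothetical, the reading of the four candidates as the four combinations of time-correct/incorrect and context-correct/incorrect is our interpretation, each sentence is reduced to a single event year, and the example sentences are invented.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    """One of the four candidate sentences (hypothetical schema)."""
    text: str                     # candidate sentence
    time_ok: bool                 # satisfies the question's time constraint
    context_ok: bool              # concerns the entity/relation being asked about
    answer: Optional[str] = None  # answer string contained in the sentence, if any


def satisfies(specifier: str, question_year: int, event_year: int) -> bool:
    """Relate a time specifier and a number to an event year (simplified)."""
    if specifier == "before":
        return event_year < question_year
    if specifier == "after":
        return event_year > question_year
    if specifier == "in":
        return event_year == question_year
    raise ValueError(f"unsupported specifier: {specifier}")


def gold_span(candidates: List[Candidate]) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Join the candidates into one passage and return the character span of
    the answer inside the candidate that is correct in both time and context."""
    passage = " ".join(c.text for c in candidates)
    offset = 0
    for cand in candidates:
        if cand.time_ok and cand.context_ok and cand.answer:
            local = cand.text.find(cand.answer)
            if local >= 0:
                start = offset + local
                return passage, (start, start + len(cand.answer))
        offset += len(cand.text) + 1  # +1 for the joining space
    return passage, None


# Invented example. Question: "Which team did Alex play for after 2010?"
QUESTION_SPECIFIER, QUESTION_YEAR = "after", 2010
EXAMPLE = [
    # context correct, time incorrect
    Candidate("Alex played for Hillside United in 2008.",
              satisfies(QUESTION_SPECIFIER, QUESTION_YEAR, 2008), True,
              "Hillside United"),
    # time correct, context incorrect
    Candidate("Jordan played for Lakeview SC in 2013.",
              satisfies(QUESTION_SPECIFIER, QUESTION_YEAR, 2013), False,
              "Lakeview SC"),
    # correct in both time and context: the gold sentence
    Candidate("Alex played for Rivertown FC in 2012.",
              satisfies(QUESTION_SPECIFIER, QUESTION_YEAR, 2012), True,
              "Rivertown FC"),
    # incorrect in both
    Candidate("Jordan played for Marsh Rovers in 2007.",
              satisfies(QUESTION_SPECIFIER, QUESTION_YEAR, 2007), False,
              "Marsh Rovers"),
]

passage, span = gold_span(EXAMPLE)
```

In this sketch, a span-extraction model would receive the question together with `passage` and be supervised with `span`, so that the distractor sentences penalize answers that match only the time or only the context.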