Temporal Referential Consistency: Do LLMs Favor Sequences Over Absolute Time References?

The increasing acceptance of large language models (LLMs) as an alternative to knowledge sources marks a significant paradigm shift across many domains, including time-sensitive fields such as law, healthcare, and finance. To fulfill this expanded role, LLMs must be not only factually accurate but also consistent across temporal dimensions, which requires robust temporal reasoning capabilities. Despite this critical requirement, efforts to ensure temporal consistency in LLMs remain scarce; in particular, there is a noticeable absence of work that evaluates or improves LLM consistency across temporal references in time-sensitive queries. In this paper, we address this gap by introducing a novel benchmark, temporal referential consistency, accompanied by a resource, TEMP-ReCon, designed to benchmark a wide range of open-source and closed-source LLMs across linguistic contexts of differing resource richness (English, French, and Romanian). Our findings show that LLMs exhibit insufficient temporal referential consistency. To address this, we propose UnTRaP, a reasoning-path-alignment-based model that aims to enhance the temporal referential consistency of LLMs. Our empirical experiments substantiate the efficacy of UnTRaP compared to several baseline models.

