Archive collections are nowadays mostly available through search engine interfaces, which allow users to retrieve documents by issuing queries. The study of these collections may, however, be impaired by some aspects of search engines, such as the overwhelming number of documents returned or the lack of contextual knowledge provided. New methods that can work independently of, or in combination with, search engines are therefore required to access these collections. In this position paper, we propose extending TimeLine Summarization (TLS) methods to archive collections to assist in their study. We provide an overview of existing TLS methods and describe a conceptual framework for an Archive TimeLine Summarization (ATLS) system, which aims to generate informative, readable, and interpretable timelines.