Scientists, science journalists, and others often need to make sense of a large number of papers and how they compare with each other in scope, focus, findings, or other important factors. With a large corpus, however, comparing and contrasting every pair of papers is cognitively demanding. Fully automating this review process is infeasible, because it often requires domain-specific knowledge as well as an understanding of the context and motivations for the review. Existing tools help with organizing and annotating papers for literature reviews, but at their core they still rely on people to read through papers serially and manually make sense of the relevant information. We present AVTALER, which combines people's unique skills, contextual awareness, and knowledge with the strengths of automation. Given a set of comparable text excerpts from a paper corpus, it supports users in sensemaking and in contrasting paper attributes by interactively aligning the excerpts in a table so that comparable details are presented in a shared column. AVTALER is based on a core alignment algorithm that makes use of modern NLP tools. Furthermore, AVTALER is a mixed-initiative system: users can interactively give the system constraints, which are integrated into the alignment construction process.
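To make the idea of constraint-aware alignment concrete, the following is a minimal sketch and not AVTALER's actual algorithm, which the abstract does not specify. It assumes each paper contributes a list of short text segments, uses a simple token-overlap (Jaccard) score as a stand-in for the "modern NLP tools", and models user constraints as hypothetical `must_link` pairs that force two segments into the same column. The names `align`, `jaccard`, `must_link`, and `threshold` are all illustrative assumptions.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, int]  # (paper id, index of the segment within that paper)


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a crude stand-in for a real NLP similarity model."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def align(
    excerpts: Dict[str, List[str]],
    must_link: List[Tuple[Cell, Cell]] = (),
    threshold: float = 0.2,
) -> List[Dict[str, str]]:
    """Greedily place each paper's segments into shared columns.

    Returns a list of columns; each column maps paper id -> segment text.
    """
    # Make the (hypothetical) user constraints symmetric for lookup.
    linked: Dict[Cell, List[Cell]] = {}
    for a, b in must_link:
        linked.setdefault(a, []).append(b)
        linked.setdefault(b, []).append(a)

    columns: List[Dict[str, str]] = []
    placed: Dict[Cell, int] = {}  # cell -> index of the column it was put in

    for paper, segments in excerpts.items():
        for i, seg in enumerate(segments):
            cell = (paper, i)
            # 1. User constraints take priority over similarity scores.
            target = next(
                (
                    placed[other]
                    for other in linked.get(cell, [])
                    if other in placed and paper not in columns[placed[other]]
                ),
                None,
            )
            # 2. Otherwise pick the most similar column this paper has not used yet.
            if target is None:
                best_score = threshold
                for c, col in enumerate(columns):
                    if paper in col:
                        continue  # one cell per paper per column
                    score = max(jaccard(seg, other) for other in col.values())
                    if score >= best_score:
                        target, best_score = c, score
            # 3. No suitable column: open a new one.
            if target is None:
                columns.append({})
                target = len(columns) - 1
            columns[target][paper] = seg
            placed[cell] = target
    return columns
```

In this sketch, a call such as `align(excerpts, must_link=[(("C", 0), ("A", 0))])` places the first segment of paper `C` in the same column as the first segment of paper `A` even when their wording does not overlap, which is one simple way a user-supplied constraint could override the automatic score. A greedy pass is used only for brevity; it depends on paper order and is not guaranteed to find the best overall alignment.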