We introduce a new problem, KTRL+F, a knowledge-augmented in-document search task that requires real-time identification of all semantic targets within a document, with awareness of external sources, through a single natural query. KTRL+F addresses the following unique challenges for in-document search: 1) utilizing knowledge outside the document to bring in additional information about targets, and 2) balancing real-time applicability with performance. We analyze various baselines on KTRL+F and find limitations in existing models, such as hallucinations, high latency, and difficulty leveraging external knowledge. We therefore propose a Knowledge-Augmented Phrase Retrieval model that shows a promising balance between speed and performance by simply augmenting phrase embeddings with external knowledge. We also conduct a user study to verify whether solving KTRL+F can enhance the search experience for users. The study shows that, even with our simple model, users spend less time searching, issue fewer queries, and make fewer extra visits to other sources to collect evidence. We encourage the research community to work on KTRL+F to enable more efficient in-document information access.