This paper surveys the current state of the art in document automation (DA). The objective of DA is to reduce the manual effort involved in generating documents by automatically creating and integrating input from different sources and assembling documents that conform to defined templates. Commercial DA solutions have been reviewed, particularly in the legal domain, but to date there has been no comprehensive review of the academic research on DA architectures and technologies. This survey reviews the academic literature, provides a clearer definition and characterization of DA and its features, identifies state-of-the-art DA architectures and technologies in academic research, and offers ideas that can lead to new research opportunities within the DA field in light of recent advances in generative AI and large language models.