The emergence of human-like abilities in AI systems for content generation in domains such as text, audio, and vision has prompted the development of classifiers to determine whether content originated from a human or a machine. Implicit in these efforts is an assumption that the generative properties of a human differ from those of a machine. In this work, we provide a framework, in the language of statistical pattern recognition, that quantifies the difference between the distributions of human-generated and machine-generated content conditioned on an evaluation context. We describe current methods in the context of the framework and demonstrate how to use it to evaluate the progression of generative models towards human-like capabilities, among other axes of analysis.