Measuring the distance between machine-produced and human language is a critical open problem. Inspired by empirical findings from psycholinguistics on the periodicity of entropy in language, we propose FACE, a set of metrics based on Fourier Analysis of the estimated Cross-Entropy of language, for measuring the similarity between model-generated and human-written language. Using an open-ended generation task and experimental data from previous studies, we find that FACE effectively identifies the human-model gap, scales with model size, reflects the outcomes of different sampling methods for decoding, and correlates well with other evaluation metrics and with human judgment scores. FACE is computationally efficient and provides intuitive interpretations.
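To make the idea of Fourier analysis over estimated cross-entropy concrete, the following is a minimal sketch rather than the FACE implementation. It assumes the per-token cross-entropy values have already been estimated by some language model and are passed in as plain sequences. The function names `entropy_spectrum` and `spectral_cosine`, the fixed frequency grid, and the use of cosine similarity as the comparison are all illustrative assumptions and are not taken from the abstract.

```python
import numpy as np


def entropy_spectrum(cross_entropy, n_freq=65):
    """Magnitude spectrum of a per-token cross-entropy sequence.

    Hypothetical helper: the sequence is mean-centered, transformed with a
    real FFT, and resampled onto a fixed grid of n_freq frequencies in
    [0, 0.5] cycles per token so that texts of different lengths can be
    compared. Assumes a non-empty input.
    """
    x = np.asarray(cross_entropy, dtype=float)
    x = x - x.mean()                  # remove the constant (zero-frequency) offset
    magnitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x))   # 0 to 0.5 cycles per token
    grid = np.linspace(0.0, 0.5, n_freq)
    return np.interp(grid, freqs, magnitude)


def spectral_cosine(spec_a, spec_b):
    """Cosine similarity between two spectra (an illustrative stand-in metric)."""
    denom = np.linalg.norm(spec_a) * np.linalg.norm(spec_b)
    return float(spec_a @ spec_b / denom) if denom > 0 else 0.0


# Synthetic stand-ins for estimated cross-entropy; real values would come
# from scoring human-written and model-generated text with a language model.
rng = np.random.default_rng(0)
tokens = np.arange(256)
human_like = 4.0 + np.sin(2 * np.pi * tokens / 16) + 0.3 * rng.standard_normal(256)
model_like = 4.0 + 0.3 * rng.standard_normal(256)

similarity = spectral_cosine(entropy_spectrum(human_like),
                             entropy_spectrum(model_like))
print(f"spectral similarity: {similarity:.3f}")
```

Resampling onto a shared frequency grid is one simple way to compare sequences of unequal length. Other choices, such as padding to a common length or comparing only the dominant peaks, would fit the same outline.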