Measuring the distance between machine-produced and human language is a critical open problem. Inspired by empirical findings from psycholinguistics on the periodicity of entropy in language, we propose FACE, a set of metrics based on Fourier Analysis of the estimated Cross-Entropy of language, for measuring the similarity between model-generated and human-written language. Using an open-ended generation task and experimental data from previous studies, we find that FACE effectively identifies the human-model gap, scales with model size, reflects the outcomes of different sampling methods for decoding, and correlates well with other evaluation metrics and with human judgment scores.
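As a rough illustration of the idea of comparing texts through the Fourier spectrum of their cross-entropy, the following is a minimal sketch, not the definition of the FACE metrics themselves. It assumes the per-token cross-entropy values have already been estimated by some language model and are available as plain numeric sequences. The function names `entropy_spectrum` and `spectral_cosine`, the interpolation grid, and the choice of cosine similarity as the comparison are all assumptions made for this example.

```python
import numpy as np


def entropy_spectrum(cross_entropy):
    """Return (frequencies, magnitudes) for a per-token cross-entropy sequence."""
    x = np.asarray(cross_entropy, dtype=float)
    x = x - x.mean()  # remove the constant offset so it does not dominate the spectrum
    magnitudes = np.abs(np.fft.rfft(x))
    frequencies = np.fft.rfftfreq(len(x))  # in cycles per token, from 0 up to 0.5
    return frequencies, magnitudes


def spectral_cosine(ce_a, ce_b, n_points=100):
    """Cosine similarity of two spectra (a hypothetical comparison, chosen for illustration).

    Sequences of different lengths yield different frequency bins, so both
    spectra are linearly interpolated onto a shared grid before comparison.
    """
    freq_a, mag_a = entropy_spectrum(ce_a)
    freq_b, mag_b = entropy_spectrum(ce_b)
    grid = np.linspace(0.0, 0.5, n_points)
    a = np.interp(grid, freq_a, mag_a)
    b = np.interp(grid, freq_b, mag_b)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # A flat (constant) sequence has an all-zero spectrum; report 0.0 in that case.
    return float(a @ b / denom) if denom > 0 else 0.0


if __name__ == "__main__":
    t = np.arange(200)
    # Synthetic stand-ins for cross-entropy sequences with different periodicity.
    slow = 3.0 + np.sin(2 * np.pi * 0.1 * t)
    fast = 3.0 + np.sin(2 * np.pi * 0.4 * t)
    print("slow vs slow:", spectral_cosine(slow, slow))
    print("slow vs fast:", spectral_cosine(slow, fast))
```

Under these assumptions, two sequences whose entropy fluctuates at the same rate score close to 1, while sequences with different periodicity score close to 0. In practice the inputs would be the cross-entropy sequences of human-written and model-generated texts rather than synthetic sine waves.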