Whether language understanding requires grounding is an active research topic. Previous work has suggested that color perception and color language are a suitable test bed for studying the problem empirically, both because of their cognitive significance and because there is considerable alignment between a defined color space and the feature space defined by a language model. To study this issue further, we collect a large-scale source of colors and their descriptions, containing almost 1 million examples, and perform an empirical analysis comparing two kinds of alignment: (i) inter-space, by learning a mapping between embedding space and color space, and (ii) intra-space, by prompting comparatives between color descriptions. Our results show that while color space alignment holds for monolexemic, highly pragmatic color descriptions, it drops considerably for examples that exhibit elements of real linguistic usage such as subjectivity and abstractness, suggesting that grounding may be required in such cases.
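As an illustration of the inter-space idea, the following is a minimal sketch of learning a mapping from an embedding space to a color space and scoring it on held-out examples. It assumes a plain affine least-squares map, 16-dimensional embeddings, and 3-dimensional color coordinates; the abstract does not specify the mapping, the language model, or the color space, so these choices, the synthetic data, and the helper names `fit_linear_map`, `predict_colors`, and `mean_color_error` are all hypothetical.

```python
import numpy as np


def _with_bias(embeddings):
    """Append a constant column so the fitted map is affine, not just linear."""
    return np.hstack([embeddings, np.ones((embeddings.shape[0], 1))])


def fit_linear_map(embeddings, colors):
    """Least-squares fit of W such that [embeddings, 1] @ W approximates colors."""
    W, *_ = np.linalg.lstsq(_with_bias(embeddings), colors, rcond=None)
    return W


def predict_colors(W, embeddings):
    """Map embeddings into the color space with the fitted weights."""
    return _with_bias(embeddings) @ W


def mean_color_error(pred, target):
    """Mean Euclidean distance between predicted and reference color points."""
    return float(np.mean(np.linalg.norm(pred - target, axis=1)))


# Synthetic stand-ins (assumption): random vectors play the role of
# description embeddings, and colors are a noisy affine function of them.
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 16))
true_W = rng.normal(size=(17, 3))
colors = predict_colors(true_W, emb) + 0.01 * rng.normal(size=(200, 3))

# Fit on the first 150 examples, evaluate on the remaining 50.
W = fit_linear_map(emb[:150], colors[:150])
err = mean_color_error(predict_colors(W, emb[150:]), colors[150:])
print(f"held-out mean color error: {err:.4f}")
```

A low held-out error under such a map would indicate that the embedding geometry is largely recoverable as color geometry; the drop reported for subjective or abstract descriptions would show up as a larger error on those examples.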