A substantial body of research on deep learning-based emergent communication uses the referential game framework, specifically the Lewis signaling game. However, we argue that successful communication in this game typically needs only one or two symbols for target image classification because of a sampling pitfall in the training data. To address this issue, we provide a theoretical analysis and introduce a combinatorial algorithm, SolveMinSym (SMS), to solve for the symbolic complexity of classification, defined as the minimum number of symbols a message needs for successful communication. We use the SMS algorithm to create datasets with different symbolic complexities and show empirically that data with higher symbolic complexity increases the number of effective symbols in the emergent language.
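The notion of symbolic complexity can be illustrated with a toy sketch. This is not the SMS algorithm itself, whose details are not given here. It assumes, purely for illustration, that each image is a tuple of discrete attributes and that each symbol in a message conveys exactly one attribute value; the function name `min_distinguishing_attributes` is hypothetical. Under those assumptions, the minimum number of symbols needed to pick the target out of a candidate set is the size of the smallest attribute subset that separates the target from every distractor, found below by brute force.

```python
from itertools import combinations


def min_distinguishing_attributes(target, distractors):
    """Return the smallest number of attribute positions that separate
    `target` from every distractor, or None if some distractor is
    identical to the target (hypothetical helper, brute-force search)."""
    n = len(target)
    # For each distractor, the set of positions where it differs from the target.
    diffs = [{i for i in range(n) if target[i] != d[i]} for d in distractors]
    if any(not s for s in diffs):
        return None  # an identical distractor can never be ruled out
    # Try subsets in order of increasing size; the first hit is minimal.
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            chosen = set(subset)
            # The subset works if it touches at least one differing
            # position for every distractor.
            if all(chosen & s for s in diffs):
                return k
    return None


target = ("red", "circle", "small")

# Distractors that both differ from the target in color:
# a single attribute (color) is enough.
easy = [("blue", "square", "large"), ("green", "circle", "large")]

# Distractors that each differ from the target in exactly one,
# distinct attribute: all three attributes are required.
hard = [("red", "square", "small"),
        ("blue", "circle", "small"),
        ("red", "circle", "large")]

easy_complexity = min_distinguishing_attributes(target, easy)
hard_complexity = min_distinguishing_attributes(target, hard)
```

In this toy setting the first candidate set has complexity 1 and the second has complexity 3, which mirrors the abstract's point: distractors that differ from the target in many attributes can let one or two symbols suffice, while deliberately constructed candidate sets can force longer messages.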