Emoji have become ubiquitous in written communication, on the Web and beyond. They can emphasize or clarify emotions, add details to conversations, or simply serve decorative purposes. This casual use, however, barely scratches the surface of the expressive power of emoji. To further unleash this power, we present Emojinize, a method for translating arbitrary text phrases into sequences of one or more emoji without requiring human input. By leveraging the power of large language models, Emojinize can choose appropriate emoji by disambiguating based on context (e.g., cricket-bat vs. bat) and can express complex concepts compositionally by combining multiple emoji (e.g., "Emojinize" is translated to input-latin-letters right-arrow grinning-face). In a cloze test-based user study, we show that Emojinize's emoji translations increase the human guessability of masked words by 55%, whereas human-picked emoji translations do so by only 29%. These results suggest that emoji provide a sufficiently rich vocabulary to accurately translate a wide variety of words. Moreover, annotating words and phrases with Emojinize's emoji translations opens the door to numerous downstream applications, including children learning how to read, adults learning foreign languages, and text understanding for people with learning disabilities.