This paper introduces EVOKE, a parallel dataset of emotion vocabulary in English and Korean. The dataset offers comprehensive coverage of emotion words in each language, in addition to many-to-many translations between words in the two languages and identification of language-specific emotion words. The dataset contains 1,427 Korean words and 1,399 English words, and we systematically annotate 819 Korean and 924 English adjectives and verbs. We also annotate multiple meanings of each word and their relationships, identifying polysemous emotion words and emotion-related metaphors. The dataset is, to our knowledge, the most comprehensive, systematic, and theory-agnostic dataset of emotion words in both Korean and English to date. It can serve as a practical tool for emotion science, psycholinguistics, computational linguistics, and natural language processing, allowing researchers to adopt different views of the resource that reflect their needs and theoretical perspectives. The dataset is publicly available at https://github.com/yoonwonj/EVOKE.