Interest in the computational modeling of semantic change has surged in recent years. Previous work has focused on detecting and interpreting the senses that words gain over time; however, it remains unclear whether these gained senses are already covered by dictionaries. We aim to fill this gap by comparing detected word senses against dictionary sense inventories, thereby bridging the lexical semantic change detection and lexicography communities. We evaluate our system on the AXOLOTL-24 shared task for Finnish, Russian, and German \cite{fedorova-etal-2024-axolotl}. Our system is fully unsupervised. For Subtask 1, it uses a graph-based clustering approach to predict mappings between unknown word usages and dictionary entries. For Subtask 2, it generates dictionary-like definitions for the novel word usages with state-of-the-art large language models such as GPT-4 and LLaMA-3. In Subtask 1, our system outperforms the baseline system by a large margin, and its graph-based clustering makes the mapping results interpretable by distinguishing matched from unmatched (novel) word usages. On the Subtask 2 test-phase leaderboard, our system ranks first in Finnish and German and second in Russian. These results show the potential of our system for managing dictionary entries, particularly for updating dictionaries to include novel senses. Our code and data are publicly available\footnote{\url{https://github.com/xiaohemaikoo/axolotl24-ABDN-NLP}}.