Ambiguous words are common in modern digital communication. Lexical ambiguity challenges traditional Word Sense Disambiguation (WSD) methods because of limited data, which in turn hinders the efficiency of translation, information retrieval, and question-answering systems. This study investigates the use of Large Language Models (LLMs) to improve WSD through a novel approach that combines a systematic prompt augmentation mechanism with a knowledge base (KB) of different sense interpretations. The proposed method takes a human-in-the-loop approach to prompt augmentation, in which the prompt is supported by Part-of-Speech (POS) tagging, synonyms of the ambiguous words, aspect-based sense filtering, and few-shot prompting to guide the LLM. With few-shot Chain-of-Thought (CoT) prompting, this work demonstrates a substantial improvement in performance. The evaluation was conducted on FEWS test data and sense tags. This research advances accurate word interpretation in social media and digital communication.
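To make the prompt augmentation idea concrete, the following is a minimal sketch of how such a prompt might be assembled. It is not the authors' implementation. The `Sense` record, the `filter_senses` and `build_prompt` helpers, the sense identifiers, and the prompt wording are all hypothetical, and the filtering step shown here matches on POS only, as a simple stand-in for the aspect-based sense filtering named in the abstract. No LLM is called; the sketch only builds the text that would be sent to one.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Sense:
    """One knowledge-base entry (hypothetical schema, not the FEWS format)."""
    sense_id: str
    pos: str
    gloss: str
    synonyms: list[str] = field(default_factory=list)


def filter_senses(senses: list[Sense], pos: str) -> list[Sense]:
    """Keep senses whose POS matches the tagged POS.

    Assumption: POS matching stands in for aspect-based filtering.
    If nothing matches, fall back to all senses instead of an empty list.
    """
    matching = [s for s in senses if s.pos == pos]
    return matching or list(senses)


def build_prompt(sentence: str, target: str, pos: str,
                 senses: list[Sense], examples: list[dict]) -> str:
    """Assemble a few-shot chain-of-thought prompt for one ambiguous word."""
    lines = ["Choose the sense of the target word that fits the sentence.", ""]

    # Few-shot demonstrations, each with an explicit reasoning step.
    for ex in examples:
        lines += [
            f"Sentence: {ex['sentence']}",
            f"Target: {ex['target']}",
            f"Reasoning: {ex['reasoning']}",
            f"Answer: {ex['answer']}",
            "",
        ]

    # The query, augmented with the POS tag, candidate senses, and synonyms.
    lines += [
        f"Sentence: {sentence}",
        f"Target: {target} (POS: {pos})",
        "Candidate senses:",
    ]
    for s in filter_senses(senses, pos):
        syn = ", ".join(s.synonyms) if s.synonyms else "none"
        lines.append(f"- {s.sense_id}: {s.gloss} (synonyms: {syn})")

    # End on an open reasoning slot so the model reasons before answering.
    lines.append("Reasoning:")
    return "\n".join(lines)
```

In this sketch, a human reviewer could inspect and adjust the candidate senses or the demonstrations before the prompt is sent, which is one way to read the human-in-the-loop step. Ending the prompt on an open "Reasoning:" slot is a design choice that nudges the model to explain its choice before it names a sense.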