Word sense ambiguity poses a significant challenge in natural language processing because the annotated data needed to train machine learning models for disambiguation is scarce. Unsupervised word sense disambiguation methods have therefore been developed to address the problem without relying on annotated data. This research proposes a new context-aware approach to unsupervised word sense disambiguation that provides a flexible mechanism for incorporating contextual information into the similarity measurement process. We evaluate the proposed approach on a popular benchmark dataset and compare its performance with state-of-the-art unsupervised word sense disambiguation techniques. The experimental results indicate that our approach substantially improves disambiguation accuracy and outperforms several existing techniques. Our findings underscore the importance of integrating contextual information into semantic similarity measurement for handling word sense ambiguity effectively in unsupervised settings.