Word sense ambiguity poses a significant challenge in natural language processing because annotated data for training machine learning models to resolve it is scarce. Unsupervised word sense disambiguation (WSD) methods have therefore been developed to address the problem without relying on annotated data. This research proposes a new context-aware approach to unsupervised WSD that provides a flexible mechanism for incorporating contextual information into the similarity measurement process. We evaluate the proposed approach on a popular benchmark dataset and compare its performance with state-of-the-art unsupervised WSD techniques. The experimental results indicate that our approach substantially improves disambiguation accuracy and outperforms several existing techniques. Our findings underscore the importance of integrating contextual information into semantic similarity measurements for handling word sense ambiguity effectively in unsupervised settings.
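The abstract does not specify how context enters the similarity measure, so the following is only a minimal illustrative sketch of the general idea, not the method proposed here. It assumes a gloss-overlap (Lesk-style) setup in which context words are weighted by their distance to the target word before a cosine similarity against each sense gloss is computed. The sense inventory `SENSES`, the stopword list, the `decay` parameter, and all function names are hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy sense inventory (an assumption, not from the paper).
SENSES = {
    "bank": {
        "bank.finance": "institution that accepts money deposits and makes loans",
        "bank.river": "sloping land beside a river or lake",
    }
}

# Small illustrative stopword list (an assumption).
STOPWORDS = {"a", "an", "and", "at", "of", "or", "that", "the", "to"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def context_vector(tokens, target_index, decay=0.5):
    """Weight each context word by exp(-decay * distance to the target).

    Distance is measured in the stopword-filtered token sequence.
    """
    vec = Counter()
    for i, tok in enumerate(tokens):
        if i != target_index:
            vec[tok] += math.exp(-decay * abs(i - target_index))
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def disambiguate(sentence, target, decay=0.5):
    """Return (best_sense, scores) for the first occurrence of `target`."""
    tokens = tokenize(sentence)
    target_index = tokens.index(target)
    ctx = context_vector(tokens, target_index, decay)
    scores = {
        sense: cosine(ctx, Counter(tokenize(gloss)))
        for sense, gloss in SENSES[target].items()
    }
    return max(scores, key=scores.get), scores


print(disambiguate("They walked along the bank of the river", "bank")[0])
print(disambiguate("She deposits money at the bank", "bank")[0])
```

In this sketch, raising `decay` makes the score depend more heavily on words adjacent to the target, while `decay=0` reduces it to an unweighted bag-of-words overlap. No labeled training data is used, which is what makes the setup unsupervised; a realistic system would replace the toy glosses with a full lexical resource and likely use richer representations than word counts.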