Lexical semantic change detection (LSCD) increasingly relies on contextualised language model embeddings, yet most approaches still quantify change using a small set of semantic change metrics, primarily Average Pairwise Distance (APD) and cosine distance over word prototypes (PRT). We introduce Average Minimum Distance (AMD) and Symmetric Average Minimum Distance (SAMD), new measures that quantify semantic change via local correspondence between word usages across time periods. Across multiple languages, encoder models, and representation spaces, we show that AMD often provides more robust performance, particularly under dimensionality reduction and with non-specialised encoders, while SAMD excels with specialised encoders. We suggest that LSCD may benefit from considering alternative semantic change metrics beyond APD and PRT, with AMD offering a robust option for contextualised embedding-based analysis.
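The abstract names four metrics but gives no formulas, so the following is only a minimal sketch of one plausible reading. It assumes cosine distance between usage embeddings, takes AMD to be the mean over usages in the first period of the distance to the nearest usage in the second period, and takes SAMD to be the average of AMD in both directions. The function names, the direction of AMD, and the toy vectors are all illustrative assumptions, not the authors' implementation.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def distance_matrix(usages_t1, usages_t2):
    """Rows index usages from period 1, columns index usages from period 2."""
    return [[cosine_distance(u, v) for v in usages_t2] for u in usages_t1]


def apd(usages_t1, usages_t2):
    """Average Pairwise Distance: mean over all cross-period usage pairs."""
    matrix = distance_matrix(usages_t1, usages_t2)
    return mean(d for row in matrix for d in row)


def prt(usages_t1, usages_t2):
    """Prototype distance: cosine distance between per-period mean embeddings."""
    proto_t1 = [mean(column) for column in zip(*usages_t1)]
    proto_t2 = [mean(column) for column in zip(*usages_t2)]
    return cosine_distance(proto_t1, proto_t2)


def amd(usages_t1, usages_t2):
    """Assumed AMD: for each period-1 usage, take the distance to its nearest
    period-2 usage, then average. Note that this reading is asymmetric."""
    return mean(min(row) for row in distance_matrix(usages_t1, usages_t2))


def samd(usages_t1, usages_t2):
    """Assumed SAMD: average of the two directed AMD values."""
    return 0.5 * (amd(usages_t1, usages_t2) + amd(usages_t2, usages_t1))


# Toy example: period 1 has two orthogonal usages, period 2 keeps only one.
period_1 = [[1.0, 0.0], [0.0, 1.0]]
period_2 = [[1.0, 0.0]]

print("APD :", apd(period_1, period_2))
print("PRT :", prt(period_1, period_2))
print("AMD :", amd(period_1, period_2))
print("SAMD:", samd(period_1, period_2))
```

Under these assumptions the directed AMD differs depending on which period is treated as the source: in the toy data every period-2 usage has an exact match in period 1, but not the reverse. That asymmetry is what the symmetric variant averages out.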