This paper introduces Variable Substitution, a domain-specific graph augmentation technique for graph contrastive learning (GCL) applied to mathematical formula retrieval. Standard GCL augmentation techniques often distort the semantics of mathematical formulas, particularly when the formula graphs are small and highly structured. Variable Substitution, by contrast, preserves the core algebraic relationships and the formula structure. To demonstrate its effectiveness, we apply it to a classic GCL-based retrieval model. Experiments show that this straightforward approach significantly improves retrieval performance over generic augmentation strategies. We release the code on GitHub.\footnote{https://github.com/lazywulf/formula_ret_aug}
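To make the idea concrete, the following is a minimal sketch of what a variable-substitution augmentation could look like; it is not the implementation from the released repository. It assumes a formula is stored as a mapping from node ids to symbol labels plus a list of parent-child edges, and that the set of variable symbols is known in advance. The function name `substitute_variables`, the graph encoding, and the replacement pool are all hypothetical choices made for illustration.

```python
import random


def substitute_variables(labels, variables, pool, rng):
    """Return a copy of `labels` with every variable consistently renamed.

    labels:    dict mapping node id -> symbol label (hypothetical encoding)
    variables: set of labels that count as variables
    pool:      list of candidate replacement symbols (assumed large enough)
    rng:       random.Random instance, passed in for reproducibility
    """
    # Collect the variables that actually occur, in a deterministic order.
    used = sorted({lab for lab in labels.values() if lab in variables})
    # Sample without replacement so distinct variables stay distinct.
    new_names = rng.sample(pool, len(used))
    mapping = dict(zip(used, new_names))
    # Operators and constants are left untouched; only variable labels change.
    return {node: mapping.get(lab, lab) for node, lab in labels.items()}


# Example formula x^2 + 2xy as a small tree (illustrative encoding only).
labels = {0: "+", 1: "^", 2: "x", 3: "2", 4: "*", 5: "2", 6: "x", 7: "y"}
edges = [(0, 1), (1, 2), (1, 3), (0, 4), (4, 5), (4, 6), (4, 7)]

augmented = substitute_variables(
    labels, variables={"x", "y"}, pool=list("abcuvw"), rng=random.Random(0)
)
print(augmented)
```

Because only the variable labels are remapped and the edge list is never modified, the augmented view keeps the same operators, constants, and tree shape as the original. In a contrastive setup, the original and augmented graphs would then form a positive pair.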