Recent advances in citation recommendation have improved accuracy by leveraging multi-view representation learning to integrate the various modalities present in scholarly documents. However, effectively combining multiple data views requires fusion techniques that can capture complementary information while preserving the unique characteristics of each modality. We propose a novel citation recommendation algorithm that improves upon linear Canonical Correlation Analysis (CCA) methods by applying Deep CCA (DCCA), a neural network extension capable of capturing complex, non-linear relationships between distributed textual and graph-based representations of scientific articles. Experiments on the large-scale DBLP (Digital Bibliography & Library Project) citation network dataset demonstrate that our approach outperforms state-of-the-art CCA-based methods, achieving relative improvements of over 11% in Mean Average Precision@10, 5% in Precision@10, and 7% in Recall@10. These gains reflect more relevant citation recommendations and enhanced ranking quality, suggesting that DCCA's non-linear transformations yield more expressive latent representations than CCA's linear projections.
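The abstract reports results as Precision@10, Recall@10, and Mean Average Precision@10 without giving their formulas. As a rough illustration only, these metrics might be computed along the following lines. All function names are hypothetical. Normalizing AP@k by min(k, number of relevant papers) is one common convention and is an assumption here, not necessarily the one used in the experiments.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k recommended papers that are true citations."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for paper in ranked[:k] if paper in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the true citations that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for paper in ranked[:k] if paper in relevant)
    return hits / len(relevant)


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """AP@k for one query; assumes `ranked` contains no duplicate papers.

    Normalized by min(k, |relevant|), which is an assumed convention.
    """
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, paper in enumerate(ranked[:k], start=1):
        if paper in relevant:
            hits += 1
            score += hits / rank  # precision at this rank, counted only at hits
    return score / min(k, len(relevant))


def mean_average_precision_at_k(
    rankings: Sequence[Sequence[str]], relevants: Sequence[Set[str]], k: int = 10
) -> float:
    """Mean of AP@k over all query papers."""
    scores = [average_precision_at_k(r, rel, k) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores) if scores else 0.0


def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, e.g. 0.11 for an 11% improvement."""
    return (new - baseline) / baseline
```

Because AP@k rewards relevant papers that appear early in the list, MAP@10 is sensitive to ranking order, whereas Precision@10 and Recall@10 depend only on which papers reach the top ten. This is consistent with the abstract reading the MAP gain as a sign of improved ranking quality.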