Recent advances in the LLM-as-Extractor paradigm leverage large language models (LLMs) to transfer semantically rich item embeddings into sequential recommendation (SR) backbones. However, LLM-generated embeddings often suffer from strong anisotropy: most vectors are concentrated in similar directions, and the resulting geometric imbalance makes the embeddings difficult to adapt to collaborative signals during fine-tuning. To address this challenge, we propose Anisotropy-Controllable Embedding (ACE), which explicitly controls the anisotropy of LLM-generated embeddings. Specifically, ACE uses a linear autoencoder (LAE) to reshape the embedding distribution while preserving its semantic structure. In this process, the L2 regularization term mitigates anisotropy by controlling the dispersion of the embedding dimensions, while the reconstruction loss maintains the semantic relationships among items. ACE thus balances geometric uniformity against semantic preservation, leading to more stable learning. Extensive experiments demonstrate that ACE consistently outperforms existing LLM-enhanced SR models, yielding improvements of up to 12.4% in Recall@20 and up to 11.8% in NDCG@20.
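To make the notion of anisotropy concrete, the sketch below computes one commonly used proxy, the mean pairwise cosine similarity of a set of embeddings. This is an illustrative assumption rather than the measure used in this work: the function name `mean_pairwise_cosine` and the synthetic data are hypothetical, and the shared offset only imitates the "similar directions" effect described above.

```python
import numpy as np


def mean_pairwise_cosine(emb: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of rows.

    Values near 0 suggest directions are spread out (roughly isotropic);
    values near 1 suggest most vectors point in similar directions.
    """
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    n = emb.shape[0]
    # Exclude the diagonal (self-similarity) from the average.
    return float((sim.sum() - np.trace(sim)) / (n * (n - 1)))


# Synthetic stand-ins for item embeddings (assumption: 200 items, 32 dims).
rng = np.random.default_rng(0)
isotropic = rng.standard_normal((200, 32))
# Adding a large shared offset pushes every vector toward one direction.
anisotropic = isotropic + 5.0 * np.ones(32)

iso_score = mean_pairwise_cosine(isotropic)
aniso_score = mean_pairwise_cosine(anisotropic)
print(f"isotropic: {iso_score:.3f}, anisotropic: {aniso_score:.3f}")
```

Under this proxy, the offset embeddings score far higher than the unshifted ones, which is the kind of geometric imbalance the abstract refers to.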