Conditional representation learning aims to extract criterion-specific features for customized tasks. Recent studies project universal features onto a conditional feature subspace spanned by an LLM-generated text basis to obtain conditional representations. However, such methods face two key limitations: sensitivity to the choice of subspace basis and vulnerability to interference between subspaces. To address these challenges, we propose OD-CRL, a novel framework integrating Adaptive Orthogonal Basis Optimization (AOBO) and Null-Space Denoising Projection (NSDP). Specifically, AOBO constructs orthogonal semantic bases via singular value decomposition with curvature-based rank truncation. NSDP suppresses non-target semantic interference by projecting embeddings onto the null space of irrelevant subspaces. Extensive experiments across customized clustering, customized classification, and customized retrieval tasks demonstrate that OD-CRL achieves new state-of-the-art performance with superior generalization.
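The two components named above can be illustrated with a minimal sketch. This is an assumption-laden toy, not the authors' implementation: the function names (`orthogonal_basis`, `nsdp_project`), the elbow rule used for curvature-based truncation (largest second difference of the singular-value spectrum), and the use of row-stacked LLM text embeddings as the raw basis are all illustrative choices.

```python
import numpy as np

def orthogonal_basis(text_embeds, min_rank=1):
    """Sketch of AOBO-style basis construction (hypothetical).

    text_embeds: (k, d) array of LLM-generated text embeddings for one criterion.
    Returns a (d, r) orthonormal basis, with r chosen by a curvature
    (elbow) rule on the singular-value spectrum.
    """
    U, s, Vt = np.linalg.svd(text_embeds, full_matrices=False)
    if len(s) >= 3:
        # discrete second difference of the spectrum as a curvature proxy;
        # keep components up to the point of maximum curvature
        curvature = s[:-2] - 2.0 * s[1:-1] + s[2:]
        r = max(min_rank, int(np.argmax(curvature)) + 1)
    else:
        r = len(s)
    return Vt[:r].T  # columns form an orthonormal basis of the subspace

def nsdp_project(x, target_basis, irrelevant_bases):
    """Sketch of NSDP-style denoising projection (hypothetical).

    Removes components of x lying in each irrelevant subspace
    (i.e. projects x onto their null spaces), then projects the
    result onto the target conditional subspace.
    """
    for B in irrelevant_bases:          # each B: (d, r_i) orthonormal
        x = x - B @ (B.T @ x)           # null-space projection w.r.t. B
    return target_basis @ (target_basis.T @ x)
```

The key design point the sketch mirrors is that once each basis is orthonormal, both the null-space step and the final subspace projection reduce to cheap matrix-vector products of the form `B @ (B.T @ x)`.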