Existing large language model (LLM)-based embeddings typically adopt an encoder-only paradigm, treating LLMs as static feature extractors and overlooking their core generative strengths. We introduce GIRCSE (Generative Iterative Refinement for Contrastive Sentence Embeddings), a novel framework that leverages autoregressive generation to iteratively refine semantic representations. By producing sequences of soft tokens optimized under a contrastive objective, GIRCSE captures latent concepts and implicit semantics that encoder-only methods often miss. To guide this process, we propose an Iterative Contrastive Refinement (ICR) objective that encourages each refinement step to yield a better representation than the last. Extensive experiments show that GIRCSE outperforms strong LLM-based embedding baselines on the MTEB benchmark and on instruction-following tasks. Moreover, GIRCSE exhibits an emergent test-time scaling property: generating more tokens at inference steadily improves embedding quality. Our results establish generative iterative refinement as a new paradigm for representation learning.
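The abstract does not spell out the ICR objective in detail, so the following is only a minimal, self-contained PyTorch sketch of one plausible reading: after each autoregressively generated soft token, pool an embedding and apply an in-batch contrastive loss, with later steps weighted more heavily so each refinement must improve on the previous one. The `SoftTokenRefiner` toy model, its `encode`/`append_soft_token`/`pool` interface, the InfoNCE form, and the increasing step weights are all illustrative assumptions, not the paper's implementation; a real system would wrap a pretrained decoder-only LLM rather than the tiny GRU used here to keep the sketch runnable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftTokenRefiner(nn.Module):
    """Toy stand-in for a decoder-only LM (hypothetical interface).
    A real system would wrap a pretrained LLM; a small GRU keeps this
    sketch self-contained and runnable."""
    def __init__(self, vocab_size=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.lm_head = nn.Linear(dim, vocab_size)

    def encode(self, ids):
        return self.embed(ids)                    # (B, T) -> (B, T, D)

    def append_soft_token(self, seq):
        out, _ = self.rnn(seq)                    # autoregressive pass
        probs = self.lm_head(out[:, -1]).softmax(dim=-1)
        soft = probs @ self.embed.weight          # expected token embedding, no hard argmax
        return torch.cat([seq, soft.unsqueeze(1)], dim=1)

    def pool(self, seq):
        out, _ = self.rnn(seq)
        return out[:, -1]                         # last state as the embedding

def info_nce(q, p, temperature=0.05):
    """In-batch InfoNCE: the i-th passage is the i-th query's positive."""
    q, p = F.normalize(q, dim=-1), F.normalize(p, dim=-1)
    logits = q @ p.T / temperature                # (B, B) similarity matrix
    return F.cross_entropy(logits, torch.arange(q.size(0), device=q.device))

def icr_loss(model, query_ids, pos_ids, num_steps=4):
    """ICR sketch: a contrastive loss after every generated soft token,
    with monotonically increasing step weights (an assumption) so each
    refinement step is pushed to improve on the previous one."""
    q_seq, p_seq = model.encode(query_ids), model.encode(pos_ids)
    losses = []
    for _ in range(num_steps):
        q_seq = model.append_soft_token(q_seq)
        p_seq = model.append_soft_token(p_seq)
        losses.append(info_nce(model.pool(q_seq), model.pool(p_seq)))
    weights = torch.linspace(0.5, 1.0, num_steps)
    return sum(w * l for w, l in zip(weights, losses)) / weights.sum()

# Toy usage: random token ids standing in for tokenized query/passage pairs.
model = SoftTokenRefiner()
queries = torch.randint(0, 1000, (8, 12))
positives = torch.randint(0, 1000, (8, 12))
icr_loss(model, queries, positives, num_steps=4).backward()
```

Under this reading, the test-time scaling behavior the abstract reports corresponds to simply raising `num_steps` at inference: each extra soft token is one more refinement pass over the running sequence, with no retraining required.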