In this work, we introduce a new unsupervised embedding method, Meta-Task Prompting with Explicit One-Word Limitation (MetaEOL), for generating high-quality sentence embeddings from Large Language Models (LLMs) without the need for model fine-tuning or task-specific engineering. Leveraging meta-task prompting, MetaEOL guides LLMs to produce embeddings through a series of carefully designed prompts that address multiple representational aspects. Our comprehensive experiments demonstrate that embeddings averaged across the meta-tasks yield competitive performance on Semantic Textual Similarity (STS) benchmarks and excel in downstream tasks, surpassing contrastively trained models. Our findings suggest a new scaling law for embedding generation, offering a versatile, resource-efficient approach for embedding extraction across diverse sentence-centric scenarios.
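The averaging step described above can be illustrated with a minimal sketch. It is not the actual MetaEOL implementation: the prompt templates are illustrative placeholders rather than the prompts used in this work, and `toy_encode` is a hypothetical stand-in that maps a prompt to a deterministic pseudo-random vector, where a real system would presumably read a representation out of the LLM. Only the combination logic (one embedding per meta-task prompt, then an element-wise mean) reflects the description above.

```python
import hashlib
import math
import random

# Hypothetical meta-task templates, each ending in a one-word constraint.
# The wording is illustrative only and is not taken from the paper.
META_TASK_PROMPTS = [
    'Task: topic classification. Text: "{s}". In one word, the topic is:',
    'Task: sentiment analysis. Text: "{s}". In one word, the sentiment is:',
    'Task: paraphrase identification. Text: "{s}". In one word, the meaning is:',
]


def toy_encode(prompt, dim=16):
    """Stand-in for an LLM: deterministic pseudo-random vector per prompt."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def meta_task_embedding(sentence, encode=toy_encode, prompts=META_TASK_PROMPTS):
    """Embed the sentence under every meta-task prompt and average the results."""
    vectors = [encode(p.format(s=sentence)) for p in prompts]
    # Element-wise mean across the per-prompt vectors.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine similarity, the usual scoring function for STS-style comparison."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    e1 = meta_task_embedding("A man is playing a guitar.")
    e2 = meta_task_embedding("Someone plays an instrument.")
    print(len(e1), cosine_similarity(e1, e2))
```

With the toy encoder the similarity score carries no meaning. Passing a real LLM-backed function as `encode` (an assumption about how one might wire this up) would leave the averaging logic unchanged.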