Large Language Models (LLMs) have achieved state-of-the-art performance in text re-ranking. In this setting, queries and candidate passages are included in the prompts, using pointwise, listwise, or pairwise prompting strategies. A key limitation of these LLM-based ranking strategies is their cost: the process can become expensive due to API charges, which are based on the number of input and output tokens. We study how to maximize re-ranking performance under a fixed budget by navigating the vast search space of prompt choices, LLM APIs, and budget splits. We propose a suite of budget-constrained methods for text re-ranking using a set of LLM APIs. Our most efficient method, called EcoRank, is a two-layered pipeline that jointly optimizes decisions about budget allocation across prompt strategies and LLM APIs. Experimental results on four popular QA and passage re-ranking datasets show that EcoRank outperforms other budget-aware supervised and unsupervised baselines.
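To make the budget-aware idea concrete, below is a minimal sketch of a two-stage, cost-tracked re-ranking pipeline in the spirit described above. All names, prices, and the `score_passage` stub are hypothetical illustrations (no real LLM API is called, and this is not the paper's actual EcoRank algorithm): a cheap model scores every passage pointwise, then the remaining budget is spent re-scoring top candidates with a stronger model.

```python
# Hedged sketch of budget-aware two-stage re-ranking.
# Prices, token counting, and the scoring stub are hypothetical.

def score_passage(query, passage, strong=False):
    """Stand-in for a pointwise LLM relevance call: simple word overlap.
    A 'strong' model is simulated by scaling the score."""
    overlap = len(set(query.lower().split()) & set(passage.lower().split()))
    return overlap * (2 if strong else 1)

def estimate_cost(prompt_tokens, output_tokens, price_per_1k):
    """Approximate API charge: both input and output tokens are billed."""
    return (prompt_tokens + output_tokens) / 1000 * price_per_1k

def eco_rerank(query, passages, budget, cheap_price=0.5, strong_price=5.0):
    """Stage 1: score all passages with a cheap model (pointwise).
    Stage 2: re-score candidates with a stronger model while budget allows."""
    spent = 0.0
    scored = []
    for p in passages:
        tokens = len(p.split()) + len(query.split())
        cost = estimate_cost(tokens, 1, cheap_price)
        if spent + cost > budget:
            break  # out of budget: stop issuing cheap-model calls
        spent += cost
        scored.append((score_passage(query, p, strong=False), p))
    scored.sort(reverse=True)

    refined = []
    for s, p in scored:
        tokens = len(p.split()) + len(query.split())
        cost = estimate_cost(tokens, 1, strong_price)
        if spent + cost > budget:
            refined.append((s, p))  # keep the cheap-model score
            continue
        spent += cost
        refined.append((score_passage(query, p, strong=True), p))
    refined.sort(reverse=True)
    return [p for _, p in refined], spent
```

A real pipeline would replace `score_passage` with actual pointwise or pairwise LLM prompts and use the provider's tokenizer for cost estimates; the key design choice illustrated here is splitting one budget across a cheap first pass and an expensive refinement pass.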