We address the challenge of aggregating the preferences of multiple agents over LLM-generated replies to user queries, where agents might modify or exaggerate their preferences. New agents may participate for each new query, making fine-tuning LLMs on these preferences impractical. To overcome these challenges, we propose an auction mechanism that operates without fine-tuning or access to model weights. This mechanism is designed to provably converge to the output of the optimally fine-tuned LLM as computational resources are increased. The mechanism can also incorporate contextual information about the agents when available, which significantly accelerates its convergence. A well-designed payment rule ensures that truthful reporting is the optimal strategy for all agents, while also promoting an equity property by aligning each agent's utility with her contribution to social welfare - an essential feature for the mechanism's long-term viability. While our approach can be applied whenever monetary transactions are permissible, our flagship application is in online advertising. In this context, advertisers try to steer LLM-generated responses towards their brand interests, while the platform aims to maximize advertiser value and ensure user satisfaction. Experimental results confirm that our mechanism not only converges efficiently to the optimally fine-tuned LLM but also significantly boosts advertiser value and platform revenue, all with minimal computational overhead.
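As a point of reference for the convergence target mentioned above, one standard way to formalize the "optimally fine-tuned LLM" is as the maximizer of a KL-regularized aggregate-reward objective; the notation below is ours and this formulation is an assumption about what the benchmark looks like, not necessarily the paper's exact definition:
\[
\pi^{*} \;=\; \arg\max_{\pi}\; \mathbb{E}_{y \sim \pi(\cdot \mid x)}\Big[\sum_{i=1}^{n} r_i(x, y)\Big] \;-\; \tau\, \mathrm{KL}\big(\pi(\cdot \mid x)\,\big\|\,\pi_{\mathrm{ref}}(\cdot \mid x)\big),
\qquad
\pi^{*}(y \mid x) \;\propto\; \pi_{\mathrm{ref}}(y \mid x)\, \exp\Big(\tfrac{1}{\tau}\textstyle\sum_{i=1}^{n} r_i(x, y)\Big),
\]
where \(x\) is the user query, \(r_i\) is agent \(i\)'s (reported) reward over responses, \(\pi_{\mathrm{ref}}\) is the base LLM, and \(\tau > 0\) trades off aggregate reward against staying close to the base model. Under this reading, the mechanism's guarantee is that, without touching model weights, its output distribution approaches \(\pi^{*}\) as more computation is spent.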