In the rapidly evolving digital content landscape, media firms and news publishers require automated and efficient methods to enhance user engagement. This paper introduces the LLM-Assisted Online Learning Algorithm (LOLA), a novel framework that integrates Large Language Models (LLMs) with adaptive experimentation to optimize content delivery. Leveraging a large-scale dataset from Upworthy, which includes 17,681 headline A/B tests, we first investigate three pure-LLM approaches: prompt-based methods, embedding-based classification models, and fine-tuned open-source LLMs. We find that prompt-based approaches perform poorly, achieving no more than 65\% accuracy in identifying the catchier headline. In contrast, both OpenAI-embedding-based classification models and a fine-tuned Llama-3 model with 8 billion parameters achieve accuracies of around 82--84\%. We then introduce LOLA, which combines the best pure-LLM approach with the Upper Confidence Bound (UCB) algorithm to adaptively allocate traffic and maximize clicks. Our numerical experiments on Upworthy data show that LOLA outperforms the standard A/B test method (the current status quo at Upworthy), pure bandit algorithms, and pure-LLM approaches, particularly in scenarios with limited experimental traffic. Our approach is scalable and applicable to content experiments across various settings where firms seek to optimize user engagement, including digital advertising and social media recommendations.
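The core idea of combining LLM predictions with a UCB bandit can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the function name `lola_ucb`, the pseudo-count weighting of the LLM prior, and the simulated click rates are all assumptions made for exposition. The sketch seeds each headline's estimated click-through rate with a hypothetical LLM prediction (treated as `prior_weight` pseudo-observations), then runs standard UCB1 to allocate traffic adaptively.

```python
import math
import random

def lola_ucb(llm_scores, true_ctrs, horizon, prior_weight=20):
    """Simplified LLM-assisted UCB sketch (illustrative only).

    llm_scores:   hypothetical LLM-predicted click rates, one per headline,
                  used to initialize each arm's estimate.
    true_ctrs:    simulated ground-truth click rates used to draw rewards.
    horizon:      number of impressions to allocate.
    prior_weight: how many pseudo-observations the LLM prior is worth.
    """
    k = len(llm_scores)
    counts = [prior_weight] * k   # pseudo-counts contributed by the LLM prior
    means = list(llm_scores)      # running CTR estimates, seeded by the LLM
    clicks = 0
    for t in range(1, horizon + 1):
        # UCB1 index: estimated mean plus an exploration bonus that shrinks
        # as an arm accumulates observations.
        arm = max(
            range(k),
            key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
        )
        reward = 1 if random.random() < true_ctrs[arm] else 0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        clicks += reward
    return clicks

random.seed(0)
# Two hypothetical headlines; the LLM correctly guesses that arm 1 is catchier.
total_clicks = lola_ucb(
    llm_scores=[0.02, 0.05], true_ctrs=[0.02, 0.06], horizon=5000
)
```

When experimental traffic is scarce, the LLM prior lets the bandit favor the likely-better headline from the first impression instead of spending early traffic on uniform exploration, which is the intuition behind LOLA's advantage in low-traffic regimes.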