We study a generalization of online binary prediction with expert advice in which, at each round, the learner may pick $m\geq 1$ experts from a pool of $K$ experts, and the overall utility is a modular or submodular function of the chosen experts. We focus on the setting in which experts act strategically and aim to maximize their influence on the algorithm's predictions, potentially by misreporting their beliefs about the events. This setting has applications in, among others, forecasting competitions, where the learner seeks not only to make predictions by aggregating different forecasters but also to rank them according to their relative performance. Our goal is to design algorithms that satisfy two requirements: 1) $\textit{Incentive-compatible}$: incentivize the experts to report their beliefs truthfully, and 2) $\textit{No-regret}$: achieve sublinear regret with respect to the true beliefs of the best fixed set of $m$ experts in hindsight. Prior work has studied this framework for $m=1$ and provided incentive-compatible no-regret algorithms. We first show that a simple reduction of our problem to the $m=1$ setting is neither efficient nor effective. We then provide algorithms that exploit the specific structure of the utility functions to achieve both goals.
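To make the regret benchmark concrete, the following is a minimal sketch for the modular case only, not the algorithms of this work. It assumes each expert incurs a per-round loss in $[0,1]$ (for example, squared loss of a reported belief against the binary outcome) and that the modular utility of a set is the sum of its members' rewards $1-\text{loss}$; both choices, and all function names, are illustrative assumptions. It also ignores strategic reporting entirely. The last helper counts the meta-experts created if one reads the "simple reduction" as treating every size-$m$ subset as a single expert, which is an assumed reading of the abstract.

```python
from math import comb

def modular_utility(chosen, losses_t):
    """Assumed modular (additive) utility: sum of rewards 1 - loss over the chosen experts."""
    return sum(1.0 - losses_t[i] for i in chosen)

def best_fixed_set(losses, m):
    """Best fixed set of m experts in hindsight under the additive utility above.

    Because the utility is additive, the m experts with the largest
    cumulative reward form an optimal set, so no subset enumeration is needed.
    """
    num_experts = len(losses[0])
    totals = [sum(1.0 - row[i] for row in losses) for i in range(num_experts)]
    ranked = sorted(range(num_experts), key=lambda i: totals[i], reverse=True)
    return frozenset(ranked[:m])

def regret(picks, losses, m):
    """Cumulative utility of the best fixed m-set minus that of the learner's picks."""
    best = best_fixed_set(losses, m)
    best_total = sum(modular_utility(best, row) for row in losses)
    alg_total = sum(modular_utility(s, row) for s, row in zip(picks, losses))
    return best_total - alg_total

def num_meta_experts(num_experts, m):
    """Number of size-m subsets, i.e. meta-experts under the assumed naive reduction."""
    return comb(num_experts, m)

# Toy instance: 3 rounds, K = 3 experts, m = 2 (hypothetical numbers).
losses = [
    [0.1, 0.9, 0.4],
    [0.2, 0.8, 0.3],
    [0.0, 0.7, 0.5],
]
picks = [{0, 1}, {0, 2}, {1, 2}]  # sets chosen by some learner, one per round

print(sorted(best_fixed_set(losses, 2)))
print(regret(picks, losses, 2))
print(num_meta_experts(20, 5))
```

Under this reading, the number of meta-experts grows as $\binom{K}{m}$, which suggests why a direct reduction to $m=1$ becomes computationally expensive as $K$ and $m$ grow, whereas the additive structure lets the hindsight benchmark be found by a simple sort.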