Firms increasingly delegate decisions to learning algorithms in platform markets. Standard algorithms perform well when platform policies are stationary, but firms often face ambiguity about whether policies are stationary or adapt strategically to their behavior. When policies adapt, efficient learning under stationarity may backfire: it may reveal a firm's persistent private information, allowing the platform to personalize terms and extract information rents. We study a repeated screening problem in which an agent with a fixed private type commits ex ante to a learning algorithm, facing ambiguity about the principal's policy. We show that a broad class of standard algorithms, including all no-external-regret algorithms, can be manipulated by adaptive principals and permit asymptotic full surplus extraction. We then construct a misspecification-robust learning algorithm that treats stationarity as a testable hypothesis. It achieves the optimal payoff under stationarity at the minimax-optimal rate, while preventing dynamic rent extraction: against any adaptive principal, each type's long-run utility is at least its utility under the menu that maximizes revenue under the principal's prior.