This work studies the repeated principal-agent problem from an online learning perspective. The principal's goal is to learn, through repeated interactions, the contract that maximizes her utility, without prior knowledge of the agent's type (i.e., the agent's cost and production functions). The work contains three technical results. First, learning linear contracts with binary outcomes is equivalent to dynamic pricing with an unknown demand curve. Second, an approximately optimal contract for identical agents can be learned by a scheme with polynomial sample complexity. Third, learning the optimal contract for heterogeneous agents can be reduced to Lipschitz bandits under mild regularity conditions. Together, these results suggest that the one-dimensional effort model, which is the default model for principal-agent problems in economics but seems to have been largely ignored in recent work from the computer science community, may be the more suitable choice when studying contract design from a learning perspective.
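To make the first result concrete, the following is a minimal sketch of one way to read the correspondence between linear contracts with binary outcomes and dynamic pricing. It is not the paper's algorithm. Everything specific here is an assumption made for illustration: the agent has production function p(e) = e and cost c(e) = e^2 / 2 (so his best response to a share alpha is effort alpha), the reward on success is normalized to 1, and the learner is a standard UCB1 rule over a uniform grid of shares. Under this reading, the principal's kept fraction 1 - alpha plays the role of the posted price and the agent's success probability plays the role of demand at that price. The function names are hypothetical.

```python
import math
import random


def agent_success_prob(alpha: float) -> float:
    """Hypothetical agent with p(e) = e and c(e) = e^2 / 2.

    He maximizes alpha * e - e^2 / 2 over e in [0, 1], so his effort,
    and hence his success probability, equals alpha clipped to [0, 1].
    """
    return min(max(alpha, 0.0), 1.0)


def principal_expected_utility(alpha: float) -> float:
    """Expected utility of offering share alpha: 'price' times 'demand'."""
    return (1.0 - alpha) * agent_success_prob(alpha)


def ucb_linear_contract(rounds: int, grid_size: int, rng: random.Random) -> float:
    """Run UCB1 over a uniform grid of shares; return the most-played share."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    counts = [0] * grid_size
    sums = [0.0] * grid_size

    def index(i: int, t: int) -> float:
        # Empirical mean plus the usual UCB1 exploration bonus.
        return sums[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])

    for t in range(1, rounds + 1):
        if t <= grid_size:
            arm = t - 1  # play every share once before using the index
        else:
            arm = max(range(grid_size), key=lambda i: index(i, t))
        alpha = grid[arm]
        # The principal observes only the binary outcome, never the effort.
        success = rng.random() < agent_success_prob(alpha)
        counts[arm] += 1
        sums[arm] += (1.0 - alpha) if success else 0.0

    return grid[max(range(grid_size), key=lambda i: counts[i])]


if __name__ == "__main__":
    learned = ucb_linear_contract(rounds=20000, grid_size=11, rng=random.Random(0))
    print("most-played share:", learned)
```

In this toy instance the expected utility is alpha * (1 - alpha), which is maximized at alpha = 0.5. The grid-plus-UCB learner is a generic baseline chosen for brevity; it only illustrates that the principal faces the same feedback structure as a seller posting prices against an unknown demand curve.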