Numerous online services are data-driven: the behavior of users affects the system's parameters, and the system's parameters affect the users' experience of the service, which in turn affects how users interact with the system. For example, people may choose to use a service only for tasks that already work well, or they may switch to a different service. These adaptations influence the ability of a system to learn about a population of users and tasks in order to improve its performance broadly. In this work, we analyze a class of such dynamics, in which users allocate their participation amongst services to reduce the individual risk they experience, and services update their model parameters to reduce their own risk on their current user population. We refer to these dynamics as \emph{risk-reducing}; they cover a broad class of common model updates, including gradient descent and multiplicative weights. For this general class of dynamics, we show that asymptotically stable equilibria are always segmented, with each sub-population allocated to a single learner. Under mild assumptions, the utilitarian social optimum is a stable equilibrium. In contrast to previous work, which shows that repeated risk minimization can result in vicious cycles (Hashimoto et al., 2018; Miller et al., 2021), we find that repeated myopic updates with multiple learners lead to better outcomes. We illustrate the phenomena via a simulated example initialized from real data.
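To make the dynamics concrete, the following is a minimal sketch of one possible risk-reducing system, not the setup analyzed in this work. It assumes two sub-populations with scalar optima `phis`, a squared-error risk, two learners that each take one gradient step per round, and users who reallocate with multiplicative weights. All names, step sizes, and initial values are hypothetical choices made for illustration.

```python
import math

def risk(theta, phi):
    """Squared-error risk of a learner with parameter theta on a
    sub-population whose optimal parameter is phi (an assumed loss)."""
    return (theta - phi) ** 2

def total_risk(alpha, thetas, phis, sizes):
    """Population-weighted risk summed over sub-populations and learners."""
    return sum(sizes[i] * alpha[i][j] * risk(thetas[j], phis[i])
               for i in range(len(phis)) for j in range(len(thetas)))

def step(alpha, thetas, phis, sizes, lr=0.1, eta=0.5):
    """One round: learners take a gradient step, then users reallocate."""
    n, m = len(phis), len(thetas)
    # Learner update: gradient descent on the learner's average risk
    # over the users currently allocated to it.
    new_thetas = []
    for j in range(m):
        mass = sum(sizes[i] * alpha[i][j] for i in range(n))
        grad = 0.0
        if mass > 0:
            grad = sum(sizes[i] * alpha[i][j] * 2 * (thetas[j] - phis[i])
                       for i in range(n)) / mass
        new_thetas.append(thetas[j] - lr * grad)
    # User update: multiplicative weights, shifting each sub-population's
    # participation toward the learners on which it has lower risk.
    new_alpha = []
    for i in range(n):
        w = [alpha[i][j] * math.exp(-eta * risk(new_thetas[j], phis[i]))
             for j in range(m)]
        z = sum(w)
        new_alpha.append([x / z for x in w])
    return new_alpha, new_thetas

def simulate(steps=200):
    """Run the dynamics from a nearly symmetric, unsegmented start."""
    phis = [-1.0, 1.0]                 # sub-population optima (assumed)
    sizes = [0.5, 0.5]                 # sub-population sizes (assumed)
    thetas = [-0.1, 0.1]               # initial learner parameters
    alpha = [[0.5, 0.5], [0.5, 0.5]]   # alpha[i][j]: share of group i on learner j
    history = [total_risk(alpha, thetas, phis, sizes)]
    for _ in range(steps):
        alpha, thetas = step(alpha, thetas, phis, sizes)
        history.append(total_risk(alpha, thetas, phis, sizes))
    return alpha, thetas, history

if __name__ == "__main__":
    alpha, thetas, history = simulate()
    print("allocation:", alpha)
    print("learner parameters:", thetas)
    print("total risk, first and last:", history[0], history[-1])
```

In this toy instance the allocation drifts toward a segmented state, with each sub-population concentrating on the learner closest to its own optimum, and the total risk does not increase from round to round. It is only an illustration of the qualitative behavior described above under the stated assumptions.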