We study a decision-maker's problem of finding optimal monetary incentive schemes for retention when faced with agents whose participation decisions depend stochastically on the incentive they receive. Our focus is on policies constrained to satisfy two fairness properties that preclude outcomes in which different groups of agents experience different treatment on average. We formulate the problem as a high-dimensional stochastic optimization problem and study it through a closely related deterministic variant. We show that the optimal static solution to this deterministic variant is asymptotically optimal for the dynamic problem under fairness constraints. Although solving for the optimal static solution gives rise to a non-convex optimization problem, we uncover a structural property that allows us to design a tractable, fast-converging heuristic policy. Traditional retention schemes ignore fairness constraints; indeed, their goal is to use differentiation to incentivize repeated engagement with the system. Our work (i) shows that even in the absence of explicit discrimination, dynamic policies may unintentionally discriminate between agents of different types by varying the type composition of the system, and (ii) presents an asymptotically optimal policy that avoids such discriminatory outcomes.