Workers spend a significant amount of time learning how to make good decisions. Evaluating the efficacy of a given decision, however, can be complicated: for example, decision outcomes are often long-term and relate to the original decision in complex ways. Surprisingly, even though good decision-making strategies are difficult to learn, they can often be expressed in simple and concise forms. Focusing on sequential decision-making, we design a novel machine learning algorithm that extracts "best practices" from trace data and conveys its insights to humans in the form of interpretable "tips". Our algorithm selects the tip that best bridges the gap between the actions taken by human workers and those taken by the optimal policy, accounting for which actions are consequential for achieving higher performance. We evaluate our approach through a series of randomized controlled experiments in which participants manage a virtual kitchen. Our experiments show that the tips generated by our algorithm can significantly improve human performance relative to intuitive baselines. In addition, we discuss a number of empirical insights that can help inform the design of algorithms intended for human-AI interfaces. For instance, we find evidence that participants do not blindly follow our tips; instead, they combine them with their own experience to discover additional strategies for improving performance.
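To make the tip-selection idea concrete, the following is a minimal sketch of one plausible reading of it, not the authors' implementation. It assumes each tip is a hypothetical (state, recommended action) pair, and that a table `q_values` of long-term values under the optimal policy is already available; the function name `select_tip`, the data layout, and the toy kitchen states and actions are all illustrative assumptions. Each tip is scored by the total value it would recover across the states where humans were observed acting, so tips about frequent and consequential mistakes rank highest.

```python
from collections import defaultdict


def select_tip(human_trace, q_values, candidate_tips):
    """Return the candidate tip with the largest estimated total gain.

    human_trace: list of (state, action) pairs observed from human workers.
    q_values: dict mapping (state, action) to an estimated long-term value
        under the optimal policy (an assumed input, not from the source).
    candidate_tips: list of (state, recommended_action) pairs (assumed form).
    """
    gains = defaultdict(float)
    for state, human_action in human_trace:
        for tip in candidate_tips:
            tip_state, tip_action = tip
            if tip_state != state:
                continue  # the tip does not apply in this state
            # Value recovered if the human had followed the tip here.
            gains[tip] += (
                q_values[(state, tip_action)] - q_values[(state, human_action)]
            )
    return max(candidate_tips, key=lambda tip: gains[tip])


# Toy data (entirely made up for illustration).
human_trace = [
    ("burger_order", "chef_chops"),
    ("burger_order", "chef_chops"),
    ("salad_order", "server_cooks"),
]
q_values = {
    ("burger_order", "chef_chops"): 1.0,
    ("burger_order", "chef_cooks"): 3.0,
    ("salad_order", "server_cooks"): 2.0,
    ("salad_order", "server_chops"): 2.5,
}
candidate_tips = [
    ("burger_order", "chef_cooks"),
    ("salad_order", "server_chops"),
]

best_tip = select_tip(human_trace, q_values, candidate_tips)
print(best_tip)
```

In this toy example the burger tip accumulates a gain of 4.0 over two observations, while the salad tip gains only 0.5, so the burger tip is selected. Weighting by value differences rather than by raw disagreement counts is what lets the score reflect which actions are consequential.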